Ryzen performance is the mystery that has launched a thousand conspiracy theories. AMD debunked most of them in a blog post Monday night.
For starters, the company said Windows 10’s scheduler isn’t guilty. Internet hardware detectives had started to focus their blame on the Windows 10 scheduler, the part of the operating system that doles out work to each individual core or thread in a chip. Many believed the scheduler was sending work to the wrong cores or threads, hobbling performance.
“We have investigated reports alleging incorrect thread scheduling on the AMD Ryzen processor,” AMD’s Robert Hallock wrote in the blog post. “Based on our findings, AMD believes that the Windows 10 thread scheduler is operating properly for 'Zen,' and we do not presently believe there is an issue with the scheduler adversely utilizing the logical and physical configurations of the architecture.”
Why this matters: Ryzen’s confusing benchmarks have fueled this hot debate. In many multi-threaded tasks, it performs like a bat out of hell and easily matches Intel CPUs that cost twice as much. But when it comes to gaming at the standard resolution of 1080p or at low-quality settings, the performance can lag behind Intel’s newest 7th-gen Kaby Lake CPU, as well as its Broadwell-E chip. Our own tests have shown that at higher resolutions and higher game settings, the average gamer is unlikely to ever see the difference. And yet the debate rages on.
More conspiracies shot down
The Windows 10 scheduler theory thrived in part because the problem didn’t manifest itself in Windows 7. AMD shot down that notion in the same blog post.
“Finally, we have reviewed the limited available evidence concerning performance deltas between Windows 7 and Windows 10 on the AMD Ryzen CPU,” Hallock wrote. “We do not believe there is an issue with scheduling differences between the two versions of Windows. Any differences in performance can be more likely attributed to software architecture differences between these OSes.”
Next, AMD debunked another popular theory that fingered Windows 10. “As an extension of this investigation, we have also reviewed topology logs generated by the Sysinternals Coreinfo utility,” Hallock wrote. Coreinfo is a command-line tool that reports how a CPU’s logical processors map to its physical cores, caches, and other components. “We have determined that an outdated version of the application was responsible for originating the incorrect topology data that has been widely reported in the media,” Hallock wrote. “Coreinfo v3.31 (or later) will produce the correct results.”
Another Internet theory, one AMD itself once suggested as a fix, holds that shutting off the CPU’s SMT (simultaneous multi-threading) support yields more performance. AMD has walked back that guidance. “We have investigated reports of instances where SMT is producing reduced performance in a handful of games,” Hallock wrote. “Based on our characterization of game workloads, it is our expectation that gaming applications should generally see a neutral/positive benefit from SMT.”
The fix: Optimization
As AMD CEO Lisa Su said in a Reddit AMA just after the chip was released, the answer may come from game optimizations. Hallock echoed Su’s statement in Monday’s blog post.
“Above all, we would like to thank the community for their efforts to understand the Ryzen processor and reporting their findings,” he said. “The software/hardware relationship is a complex one, with additional layers of nuance when preexisting software is exposed to an all-new architecture. We are already finding many small changes that can improve the Ryzen performance in certain applications, and we are optimistic that these will result in beneficial optimizations for current and future applications.”
Still, one must wonder what’s going on with a chip that exceeds expectations in what you would think are heavy-duty applications, but is slightly disappointing in tasks that are typically thought to be inconsequential for a burly CPU.
Editors at PCPer.com have made some headway on proving another popular theory. The initial 8-core Ryzen is built from two 4-core components, each called a CCX. The two CCXs communicate via a high-speed fabric. PCPer.com found that threads suffered higher latency when they had to communicate across the CCX boundary. In heavily multi-threaded tasks, the latency doesn’t crop up, because the workload is spread across all cores. In lighter loads, the latency may show up because the threads may cross from one CCX to the other more often.
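To make the latency argument concrete, here is a toy Python model, not a measurement. It assumes, hypothetically, that physical cores 0 through 3 sit on one CCX and cores 4 through 7 on the other, and it simply lists which pairs of threads would have to talk across the fabric for a given placement. The thread names and the core numbering are invented for illustration.

```python
from itertools import combinations

# From the article: the 8-core Ryzen is built from two 4-core CCXs.
CORES_PER_CCX = 4


def ccx_of(core):
    """Return the CCX index for a physical core.

    Assumption: cores are numbered so that 0-3 are on CCX 0 and 4-7 on CCX 1.
    """
    return core // CORES_PER_CCX


def cross_ccx_pairs(placement):
    """List the thread pairs whose cores sit on different CCXs.

    placement maps a (hypothetical) thread name to a physical core index.
    """
    names = sorted(placement)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if ccx_of(placement[a]) != ccx_of(placement[b])
    ]


if __name__ == "__main__":
    # All three threads on the first CCX: nothing crosses the fabric.
    same_ccx = {"render": 0, "audio": 1, "physics": 2}
    # One thread lands on the second CCX: two pairs now cross.
    split = {"render": 0, "audio": 5, "physics": 2}

    print(cross_ccx_pairs(same_ccx))
    print(cross_ccx_pairs(split))
```

In this sketch a lightly threaded game whose threads all stay on one CCX has no crossing pairs, while moving a single thread to the other CCX makes every pair involving that thread pay the higher-latency path.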
PCPer believes the problems can be reduced by restricting lightly threaded games or applications to a single CCX. That is the kind of optimization Su and Hallock highlighted.
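As a rough sketch of what restricting a program to one CCX could look like, the Python below builds a processor affinity mask for a single CCX. It assumes SMT is on (two logical processors per core) and that logical processors 0 through 7 belong to the first CCX and 8 through 15 to the second. That numbering is an assumption and should be checked on the actual machine with a topology tool such as Coreinfo. The helper names are hypothetical, and this is not PCPer's or AMD's method.

```python
def ccx_logical_cpus(ccx_index, cores_per_ccx=4, threads_per_core=2):
    """Return the logical processor ids assumed to belong to one CCX.

    Assumption: logical processors are numbered contiguously per CCX,
    so CCX 0 owns ids 0-7 and CCX 1 owns ids 8-15 when SMT is enabled.
    """
    per_ccx = cores_per_ccx * threads_per_core
    start = ccx_index * per_ccx
    return list(range(start, start + per_ccx))


def affinity_mask(cpus):
    """Build a bitmask with one bit set per logical processor id."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    for ccx in (0, 1):
        cpus = ccx_logical_cpus(ccx)
        print(ccx, cpus, hex(affinity_mask(cpus)))
```

Under these assumptions the first CCX corresponds to the hexadecimal mask FF and the second to FF00. A mask like this is what the Windows command prompt's start /affinity option takes, and on Linux the list of ids could be passed to os.sched_setaffinity instead.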
PCPer.com's tests also largely found that Windows 10's scheduler was functioning properly.