Nvidia chief scientist: CPUs slowed by legacy design

Bill Dally forecasts a time when GPUs, not CPUs, will do most computer work

When it comes to power-efficient computing, CPUs are weighed down by too many legacy features to outperform GPUs (graphics processing units) in executing common tasks in parallel, said the chief scientist for the GPU vendor Nvidia.

CPUs "burn a lot of power" executing tasks that may be unnecessary in today's computing environment, noted Bill Dally, chief scientist and senior vice president of research for Nvidia, during his keynote Wednesday at the Supercomputing 2010 (SC10) conference in New Orleans.

The GPU "is optimized for throughput," while "the CPU is optimized for low latency, for getting really good thread performance," he said.

Dally pointed to several features of most modern CPUs that waste energy in their pursuit of low latency.

"They have branch predictors that predict a branch every cycle whether the program branches or not -- that burns gobs of power. They reorder instructions to hide memory latency. That burns a lot of power. They carry along a [set of] legacy instructions that requires lots of interpretation. That burns a lot of power. They do speculative execution and execute code that they may not need and throw it away. All these things burn a lot of power," he said.

Although the GPU was originally designed for rendering graphics on the screen, vendors such as Nvidia and Advanced Micro Devices are now positioning their GPU cards as general computation engines, at least for workloads that can be broken into multiple parts and run in tandem.

At least some industries are taking note of this idea, notably the world of high performance computing (HPC). Earlier this week, China's newly built Tianhe-1A system topped the latest iteration of the Top 500 List of the world's most powerful supercomputers. That system includes 7,168 Nvidia Tesla M2050 GPUs in addition to its 14,000 CPUs. Nvidia claims that without the GPUs, the system would need almost four times as many CPUs, twice as much floor space and three times as much electricity to operate.

And although Dally focused his remarks on use in HPC, he said that the general idea will permeate the computing world as a whole.

"HPC is, in many ways, an early adopter, because they run into problems sooner because they operate at a larger scale. But this applies completely to consumer applications as well as to server applications," he said, in an interview following the keynote.

Dally said that while not many current applications are written to run in parallel environments, eventually programmers will move to this model. "I think over time, people will convert applications to parallel, and those parallel segments will be well-suited for GPUs," he said. He even predicted that systems will one day be able to boot off the GPU as well as the CPU, though he said he knows of no work in particular to build a GPU-based operating system.
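The conversion Dally describes can be illustrated with a small sketch (Python here for brevity; the function and data are illustrative, not from his remarks): a serial loop over independent elements becomes a data-parallel map, which is exactly the shape that fits a GPU's many threads.

```python
from concurrent.futures import ThreadPoolExecutor

def f(x):
    # Independent per-element work: no dependencies between iterations,
    # which is what makes a loop a candidate for parallel execution.
    return x * x + 1

data = list(range(8))

# Latency-oriented (CPU-style): process one element after another.
serial = [f(x) for x in data]

# Throughput-oriented: the same map spread across a pool of workers;
# on a GPU, each element would typically get its own hardware thread.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(f, data))
```

The results are identical; only the execution strategy changes, which is why "parallel segments" of converted applications can be offloaded without altering their semantics.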

Factoring in energy use is central to Dally's case for GPU superiority. He noted that while the next-generation Nvidia GPU architecture, nicknamed Fermi, consumes about 200 picojoules (pJ) of energy per instruction executed, a CPU consumes about 2 nanojoules (nJ) per instruction -- an order of magnitude more.

This per-instruction difference amounts to a huge chasm when multiplied across large systems. Dally pointed to the U.S. Defense Advanced Research Projects Agency's efforts to fund development of an exascale computer, one that can execute 1 quintillion calculations per second. Such a system built from CPUs alone, he argued, would need a "nuclear power plant built next door" just to keep it powered.

Not everyone in the HPC community is completely sold on the idea of using GPUs as a substitute for CPUs. One potential problem many point to is that while GPUs may have greater throughput, it is difficult for systems to provide that much data to these processors.

"There is very little amount of memory that is available to each of the GPUs. If you have something really fast, you need to feed it really fast, and if you don't have enough memory to feed that processor, that processor will just sit there and wait," Dave Turek, head of IBM's deep computing division, said last week.

Dally said that this bandwidth problem is not unique to GPUs -- CPUs face the same dilemma. "Bandwidth is a big problem for any computing system," he said, though he conceded the problem is more acute for GPUs. Nvidia's just-released GTX 580 card has a raw memory bandwidth of about 200 gigabytes per second, whereas a "top-of-the-line" CPU has only about 35 gigabytes per second. "Memory systems need to evolve to be more efficient," he said.
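One way to see why the problem bites GPUs harder is arithmetic intensity: how many floating-point operations a chip must perform per byte fetched to stay busy. A rough sketch using the article's bandwidth numbers (the peak-FLOPS figures are our approximate assumptions, not from the article):

```python
# Arithmetic intensity needed to keep each processor fed,
# in floating-point operations per byte of memory traffic.
GPU_BANDWIDTH_B_PER_S = 200e9  # ~200 GB/s (GTX 580, from the article)
GPU_PEAK_FLOPS = 1.5e12        # ~1.5 TFLOPS single precision (assumed)

CPU_BANDWIDTH_B_PER_S = 35e9   # ~35 GB/s "top-of-the-line" CPU
CPU_PEAK_FLOPS = 100e9         # ~100 GFLOPS (assumed, rough)

gpu_intensity = GPU_PEAK_FLOPS / GPU_BANDWIDTH_B_PER_S  # ~7.5 flops/byte
cpu_intensity = CPU_PEAK_FLOPS / CPU_BANDWIDTH_B_PER_S  # ~2.9 flops/byte
```

Under these assumptions a GPU needs more than twice as much work per byte as a CPU before memory stops being the bottleneck, which is the "feed it really fast" concern Turek raises.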

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
