Why CPUs Aren't Getting Any Faster (technologyreview.com)
56 points by pietrofmaggi on Oct 15, 2010 | 38 comments


The reason CPUs aren't getting any faster is only tangentially mentioned in the article. Yes, it is heat dissipation. But why then did they get faster for so many decades?

As the process size drops, you can crank up the clock-speed while leaving the total heat dissipation constant. But the heat density is related to the voltage, resistance, and the amount of time the transistors spend partially on or off (we wish they were perfect switches, but they aren't really). So as you switch faster, unless you can lower the resistance or voltage (which requires changing your materials), you probably spend more and more time in a partially on or off state. This means your heat density rises to the point where you are just south of burning things out.

You make some engineering decision about the reliability you want in your chips, and calculate or test how high a heat density you can tolerate. But unless you change your materials so they use lower voltage, or invent new ways to move heat away faster, or use materials that are more conductive, you aren't upping the heat density or clock rate. But you can still make them smaller and use less power total for the same amount of computation.
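The first-order model behind this is dynamic switching power, P ≈ α·C·V²·f. A rough sketch of why cranking up f stopped being free (all constants below are made up for illustration, not measurements of any real chip):

```python
# First-order CMOS dynamic power: P = a * C * V^2 * f.
# All constants are illustrative, not measurements of any real chip.

def dynamic_power(activity, capacitance, voltage, frequency):
    """Switching power (W) for a given activity factor, switched
    capacitance (F), supply voltage (V), and clock frequency (Hz)."""
    return activity * capacitance * voltage ** 2 * frequency

base = dynamic_power(0.2, 1e-9, 1.2, 2e9)       # hypothetical 2 GHz @ 1.2 V
faster = dynamic_power(0.2, 1e-9, 1.2, 4e9)     # double f, same V
realistic = dynamic_power(0.2, 1e-9, 1.4, 4e9)  # higher f usually needs more V

print(faster / base)     # → 2.0: heat doubles with f alone, same die area
print(realistic / base)  # ≈ 2.72: the V^2 term makes it worse
```

Shrinking the process lowers C per transistor, which is what bought decades of frequency scaling; once V stopped dropping with each node, the f·V² term hit the heat-density wall described above.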

This is why reversible computing (gives off less heat), diamond substrates (much higher thermal conductivity), microfluidic channels (moves heat away faster), and parallelism (larger chips = more computation) are being explored. And only the last one is practical THIS year.


Another reason we've stopped getting faster is that we've reached a point where velocity saturation in silicon has become an important issue:

http://en.wikipedia.org/wiki/Velocity_saturation


We're quickly reaching the limits of silicon-based chips, but new materials will allow us to make even faster CPUs.

One such material is the material that won the Nobel Prize in physics: graphene. Graphene can increase the frequency of an electromagnetic signal, which could allow for even faster CPUs in the 500-1000 GHz range.

Source: http://web.mit.edu/newsoffice/2009/graphene-palacios-0319.ht...


It's not clear that a whole CPU can actually be made at anything near 500-1000 GHz, though. That's the switching speed of the graphene transistors themselves, but the clock speed of a processor can't really feasibly be made to be equal to the switching speed of the transistors. There are silicon transistors with 50 GHz switching speeds, but the CPUs are still plateauing at around 4-5 GHz.
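A back-of-the-envelope sketch of the gap, with assumed (not measured) gate-delay and latch-overhead numbers: each pipeline stage must fit a chain of gate delays plus flip-flop overhead, so a clock cycle spans many transistor switching times.

```python
# Why a CPU clock sits far below raw transistor switching speed:
# one pipeline stage must fit a chain of gate delays plus latch overhead.
# The numbers below are illustrative assumptions, not measured values.

gate_delay_ps = 20.0      # a "50 GHz" transistor: ~20 ps per switch
gates_per_stage = 10      # logic depth of one pipeline stage
latch_overhead_ps = 30.0  # flip-flop setup/hold plus clock skew

cycle_ps = gates_per_stage * gate_delay_ps + latch_overhead_ps
clock_ghz = 1000.0 / cycle_ps

print(clock_ghz)  # ≈ 4.3 GHz, despite 50 GHz transistors
```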


IMHO the real reason is that CPUs and manufacturing technologies have become extremely complex. So although there are known, simple ways to address the heat dissipation, it takes a lot of time to implement them.


Most of the comments seem to focus on the supply side, but I think that demand for faster processors is waning—at least at the consumer level. I used to care what speed my processor was. Now I have to check my system info to remind me what I have in my MacBook Pro. Cloud computing has changed a lot for me as a user.

Building faster processors is extremely expensive, so demand has to be a key concern for manufacturers. I still think there's plenty of demand for faster processors, and I'm sure we'll continue to see lots of innovation, but the issue doesn't seem to be as pressing as it was 10 years ago.


How is that a result of cloud computing? Your browser is often the most resource intensive application you use nowadays.

CPUs have passed a curve where there just aren't any killer applications demanding more CPU power. For me the biggest "wish I had a faster CPU" need has been video encoding, however recently the encoder I use switched to using the GPU for encoding and the speed improved dramatically, so even that need has declined.


This wasn't a great article, but the topic fascinates me. I'm surprised the shift from faster clocks to multi-core went so smoothly. No one seems to really mind. I really like my quad core; four processors are a lot nicer than one.

But I wonder about hundreds or thousands of cores, whether we'll see that, and whether people will start to worry that single-threaded software uses ever smaller amounts of their shiny new hardware. Will there ever be some magic layer that can run single-threaded software on many cores?
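Amdahl's law is the standard way to quantify that worry: any serial fraction of a program caps the speedup, no matter how many cores you add. A quick sketch:

```python
# Amdahl's law: speedup on n cores when fraction p of the work
# can be parallelized. The serial remainder caps the gain.

def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

# Even a 95%-parallel program gets under 20x from 1000 cores.
for n in (4, 64, 1000):
    print(n, round(amdahl_speedup(0.95, n), 1))
```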

I wrote about the end of faster clocks and start of multi-core recently: http://www.kmeme.com/2010/09/clock-speed-wall.html


I have read that Linux will start to bottleneck at 48 cores. Damn, I wish I knew where that article was.



Linux is a lot easier to fix than physics.


As CPUs have become more capable, their energy consumption and heat production has grown rapidly. It's a problem so tenacious that chip manufacturers have been forced to create "systems on a chip"--conurbations of smaller, specialized processors.

I don't think that's a very good explanation of SoCs.


Agreed. SoCs are also typically built for specific applications, different from the general-purpose CPUs the article is mostly focusing on.

I would also say that forced is a bit strong. SoCs are a good solution when (1) you have a very specific task and (2) can justify/afford the cost for a specialized chip.


I believe the main reason is CPU micro-architecture. (Note: I have had way too much exposure in my life to the design of the CPU in your phone and in your laptop to have an unbiased opinion.)

What does that mean? Essentially that the race for deep pipelines has ended, with 20-40 stages being the optimal depth. After that, miss penalties just hurt too much. When you can't make the pipeline deeper, you can't make the frequency much faster; you are stuck following process progress (which is already pretty good). So it's more tempting to go after multi-core: same pipeline depth, more silicon, more efficient overall.
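A toy model of that trade-off (all numbers are illustrative assumptions, not data from any real design): deeper pipelines shorten the cycle, but a mispredicted branch flushes roughly the whole pipeline, so throughput peaks at a moderate depth and then falls.

```python
# Toy model of pipeline depth vs. branch-miss penalty. Assumptions
# (illustrative only): cycle time = base_logic/depth + latch overhead,
# and each mispredict costs `depth` cycles to refill the pipeline.

def effective_throughput(depth, mispredicts_per_insn):
    base_logic_ps = 300.0  # total logic delay split across the stages
    latch_ps = 30.0        # fixed per-stage flip-flop overhead
    cycle_ps = base_logic_ps / depth + latch_ps
    cycles_per_insn = 1.0 + mispredicts_per_insn * depth
    return 1.0 / (cycle_ps * cycles_per_insn)  # instructions per ps

# Throughput rises with depth, peaks in the tens of stages, then falls.
for depth in (5, 10, 20, 40, 80):
    print(depth, effective_throughput(depth, 0.01))
```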


I think that one approach that may yield domain-specific improvements would be to add certain numerical routines into the x86 instruction set.

When I was working in finance as a quant, I was shocked by the amount of time code spent executing the exponential function - it is used heavily in discount curves and similar which are the building blocks of much of financial mathematics. An efficient silicon implementation would have yielded a great improvement in speed.
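For anyone who hasn't seen why exp dominates this kind of code: a discount factor is exp(-r·t), and pricing a strip of cash flows calls exp once per payment date. A minimal sketch with made-up rates and dates:

```python
import math

# Discount factors are exp(-r * t); pricing a strip of cash flows calls
# exp once per payment date. Rates and dates below are made up.

def discount_factor(rate, t_years):
    return math.exp(-rate * t_years)

def present_value(cashflows, rate):
    """cashflows: list of (t_years, amount) pairs."""
    return sum(amt * discount_factor(rate, t) for t, amt in cashflows)

# A hypothetical 10-year bond: 5 per year, plus 100 principal at maturity.
bond = [(t, 5.0) for t in range(1, 10)] + [(10, 105.0)]
print(round(present_value(bond, 0.04), 2))  # ≈ 107.42
```

Multiply this by thousands of instruments and Monte Carlo paths and the exp calls dwarf everything else, which is why a dedicated silicon implementation looked so attractive.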


CRC32 instructions: http://www.strchr.com/crc32_popcnt

String processing instructions in SSE4.2: http://www.strchr.com/strcmp_and_strlen_using_sse_4.2

AES encryption instructions: http://en.wikipedia.org/wiki/AES_instruction_set

So, if you didn't know about those, give yourself a point, because you nailed it. (No sarcasm.) There's a definite trend there.


It can definitely help in certain domains, but adding special-case instructions in silicon can sometimes complicate a chip design enough that it slows it down overall. The trend for a while was in the other direction, towards not implementing in silicon things that were even already in the x86 instruction set, like the transcendental arithmetic functions, and doing them in microcode instead (the "RISCification" of x86 processors). It's possible that trend is now reversing, though.


I have a few reasons why chip speeds have stalled:

  - Intel has been at the top for too long.
  - x86 is to complex
  - nVidia and AMD are being blocked from making x86 chips
  - PCs pretty much require x86 to exist.
Moving to graphene might allow for an increase in chip temperature, but do you really want a processor running at a few hundred to a thousand degrees? Are you willing to pump 200-2k watts into a chip?

  Thank god we didn't make this mistake with mobile,
  and most platforms use an abstracted, higher-level
  language like C#, Java, or JavaScript.
I am personally hoping that data centers really are evaluating ARM chips. The instruction set is smaller, they are lower power, and there are more producers of ARM cores, so prices are much lower. A good Intel chip will cost $200-$500, while a chip with an ARM core is probably in the $25-$100 range.

  How much does a Tegra 2, A4, or Snapdragon cost?
I imagine the future of data centers will be arrays of system on chip ARM cores paired with high doses of flash memory. Running your web app off of 1,000 ARM cores might cost you a few thousand a month.


x86 is to[o] complex

If you're going to criticize CISC architectures I would have done it before Intel moved to optimized RISC microcode in implementation. Arguably the greatest advancement in CPU technologies in the past couple years was the addition of hardware virtualization extensions to fulfill the Goldberg requirements. That said, I am biased because I wrote the Intel hardware virtualization layer for an HPC VMM.


Interesting thought to revert to many inexpensive cores as opposed to virtualizing on "super-computers". However, I don't think the equation stacks up. ARM is still a long way from offering high-performance cores (e.g. no 64-bit). Their whole concept ties in perfectly with the mobile world, and it's no surprise they are so successful.


As far as I know CUDA is 64-bit, and you should be able to run CUDA on Tegra 2.

So, for ARM, the only reason to have 64-bit is to address large memory, and drive controllers for flash can address the rest. Heck, if someone really wanted to do an integrated server environment, they could put RAM behind the drive controller.

So having a 32-bit core and a 64-bit GPU on chip means you can do complex scientific math, but how many web apps really need 64-bit floats or ints? In a web world, if you needed 64-bit, just make a service on a platform that works and either proxy the request or redirect to it.

I guess MongoDB would suck on a 32-bit ARM, but that's just because they have a lazy memory manager model. I am sure if ARM got big, or if someone paid 10gen to fix it, they would.

I bet most of the web could work on a large array of high frequency ARM chips.

I know that Google is working on it, and I have seen articles on Facebook doing it.

It's definitely experimental right now, but I think in the next 18 months you will see more public experiments.

Mobile chips also have better power management than x86, and can go into deep sleep and maintain cache really well.

Watch this; I know it's going to POP.

There is also evidence that nVidia, which has an ARM license and bought Transmeta, is figuring out its legal strategy for this.


Back when I was in college for CE, one of my professors was very concerned that testing CPUs would eventually be the bottleneck: that verifying a chip was actually working correctly would become too great a burden once the number of transistors got high enough.

Of course, I never heard about this again. Ring a bell with anyone?


It's called functional verification. Some say it's 70% of a project. I think that number is exaggerated, but maybe 50% is more realistic.

My previous startup (http://eve-usa.com) sells million-dollar boxes that are essentially debuggers, just like gdb is for software, but for chips. Very cool.
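A toy software analogy of what functional verification does: check an implementation (here a gate-level model of a 4-bit ripple-carry adder) against its arithmetic spec. Four bits can be checked exhaustively; a real CPU has astronomically more states, which is where the cost comes from.

```python
# Toy functional verification: check an implementation (a gate-level
# ripple-carry adder model) against its spec. Four bits is exhaustively
# checkable; a real CPU is not, hence the verification burden.

def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_add4(x, y):
    carry, out = 0, 0
    for i in range(4):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        out |= s << i
    return out | (carry << 4)

# Spec check: every 4-bit input pair must match ordinary integer addition.
assert all(ripple_add4(x, y) == x + y for x in range(16) for y in range(16))
print("4-bit adder matches spec on all 256 cases")
```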


This is probably a stupid question, but if heat dissipation is a big problem, why can't we just build better cooling systems: bigger heatsinks, refrigeration, etc.? I'm not an electrical engineer, so I'm sure there's something I'm missing.


At this point it's getting heat out of the CPU to the heatsink that's the problem. How close the various transistors are makes getting heat around them and out of the CPU difficult.


Ah, ok. Thanks for the explanation.


For a given system cost, any marginal dollar spent on cooling is a dollar less spent on silicon. Also, cooling isn't scalable, so there are seriously diminishing returns above ~130W.


Heat is a huge energy loss; server rooms require AC, and at some point you can't add more AC.

I wonder if this heat could actually get high enough to recoup some energy.


CPUs are getting faster, even if they have the same clock speed:

http://www.cpubenchmark.net/high_end_cpus.html


The article seems to imply that optimizing for power means Intel isn't innovating in CPUs. Optimizations in power allow the chip to be clocked faster (or perform more in parallel) leading to overall performance improvements. These optimizations are improvements in CPUs.


Also, we're flirting with the limits of Moore's "law" here. I did a report back in high school on it and speculated you'd never really see processors over 4GHz. I guess I was right.

As you start to shrink transistors and the spacing between them, the chips get hotter, burn more power, and throw more errors. You also get electron "leakage", where electrons tunnel across barriers that are supposed to insulate them, so the processors become less efficient and you have to run extra fault tolerance to check for the errors.

Multi-core and bringing all the other components up to speed is the way to go for now until a newer technology comes, like quantum computing or light based processing.


You can overclock a processor above 4GHz with a bit of liquid nitrogen and some bios changes. ;-)



I want one of those.


"you'd never really see processors over 4GHz."

IBM released a 5GHz POWER processor 2.5 years ago: http://www.theregister.co.uk/2008/04/08/ibm_595_water/


And how'd that work out? Not very well. Standard top of the line is 3.2-3.4 GHz.


It obviously worked out fine. IBM isn't catering towards the consumer market with the POWER line.


>I did a report back in high school on it and speculated you'd never really see processors over 4GHz. I guess I was right.

I'm not sure whether you're being serious or if you're actually claiming some sort of advanced insight here.

There has been speculation on the peak speed of processors for many, many years. I don't think they were consulting your high school reports.



