- What is a CPU?
- Tracing an instruction
- L1/L2/L3 Cache
- Clock cycle speed
- Front side bus (FSB)
- The numbers game: Intel vs AMD
- Sockets and slots
- Dual-core and quad-core CPUs
- 64-bit processors
- Mobile Processors
Whenever your CPU has to fetch data, a bottleneck can arise if the data is in a relatively distant location -- say, for example, sitting in external memory that has to be first accessed, scanned and then read. From this problem arose the practice of adding memory directly to the processor itself, creating a small storage area for commonly used data. If the CPU doesn't have to travel outside of itself to get data, it can get on with the business of number crunching much more effectively. This memory is generally referred to as a cache.
Cache within a CPU is split into levels, which normally reflect both the size of the cache and its access speed. Level 1 (L1) cache is normally small but fast memory, and it's the first place the CPU looks for data. If the desired piece of data is found in the L1 cache, then you'll see a quick result; otherwise the CPU will move on to the Level 2 (L2) cache. L2 caches are normally much larger than L1 caches, and there's a solid advantage to this design. With a small L1 cache, checking for data is relatively quick; it's only for data outside the L1 cache that you need to look further. A larger L2 cache gives you a sizeable reservoir of data in a slower-to-scan environment, but because it's not the first place the CPU will check, you get a good trade-off between latency (the time taken to perform an operation) and the success rate of searching the cache itself. As an example, Intel's Core 2 Duo CPUs have 64KB of Level 1 cache per core and, on many models, a 4MB Level 2 cache.
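To make the "check L1 first, then L2" idea concrete, here is a minimal sketch in Python. It is a toy model, not how any real CPU is implemented: the `CacheLevel` class, the `fetch` function, the tiny capacities and the cycle counts are all made-up values chosen purely for illustration.

```python
from collections import OrderedDict


class CacheLevel:
    """A tiny least-recently-used cache standing in for one cache level."""

    def __init__(self, name, capacity, latency):
        self.name = name
        self.capacity = capacity   # number of entries (illustrative, not real sizes)
        self.latency = latency     # cost of checking this level, in cycles (assumed)
        self.entries = OrderedDict()

    def lookup(self, address):
        """Return True on a hit, and mark the entry as recently used."""
        if address in self.entries:
            self.entries.move_to_end(address)
            return True
        return False

    def store(self, address):
        """Insert an address, evicting the least recently used entry if full."""
        self.entries[address] = True
        self.entries.move_to_end(address)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def fetch(address, levels, memory_latency):
    """Check each level in order; return (where the data was found, cycles spent)."""
    cycles = 0
    for level in levels:
        cycles += level.latency
        if level.lookup(address):
            found = level.name
            break
    else:
        # Missed every cache level, so pay the cost of going to main memory.
        cycles += memory_latency
        found = "memory"
    # Keep a copy in every level so the next access is faster.
    for level in levels:
        level.store(address)
    return found, cycles


l1 = CacheLevel("L1", capacity=2, latency=3)
l2 = CacheLevel("L2", capacity=8, latency=14)
levels = [l1, l2]

first = fetch(0x10, levels, memory_latency=200)   # nothing cached yet
second = fetch(0x10, levels, memory_latency=200)  # same address again
fetch(0x20, levels, memory_latency=200)
fetch(0x30, levels, memory_latency=200)           # pushes 0x10 out of the 2-entry L1
third = fetch(0x10, levels, memory_latency=200)   # gone from L1, still in L2
```

In this toy run the first request goes all the way to memory, the second is answered by L1, and the third misses the small L1 but is caught by the larger L2, which is the trade-off described above.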
There's also a third level of cache. While L3 cache has historically been the province of server CPUs, it does appear in some consumer desktop processors -- Intel's P4 Extreme Edition processors (codenamed 'Gallatin'), for example, feature L3 cache. As L3 cache is the level you'd expect to be consulted least often, it can afford to be larger -- in the case of the P4 Extreme Edition it has 2MB of L3 cache, four times as much as it has L2 cache (512KB). As the entire cache is situated on the CPU die itself, it's still considerably quicker for the CPU to search the cache levels sequentially than to refer to external storage such as main memory or hard drives.
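A rough way to see why searching the levels one after another still pays off is to work out the average cost of a memory access. The sketch below does that in Python; the function name `average_access_time` is hypothetical, and the latencies and hit rates are invented for illustration rather than measured from any real processor.

```python
def average_access_time(levels, memory_latency):
    """Expected cycles per access when cache levels are searched in order.

    levels is a list of (latency_in_cycles, hit_rate) pairs, L1 first.
    All figures passed in below are assumptions, not real measurements.
    """
    expected = 0.0
    reach_probability = 1.0  # chance that a lookup gets as far as this level
    for latency, hit_rate in levels:
        expected += reach_probability * latency
        reach_probability *= (1.0 - hit_rate)
    # Whatever misses every level has to go out to main memory.
    expected += reach_probability * memory_latency
    return expected


two_level = average_access_time([(3, 0.90), (14, 0.80)], memory_latency=200)
three_level = average_access_time(
    [(3, 0.90), (14, 0.80), (40, 0.50)], memory_latency=200
)
```

With these made-up numbers the two-level design averages about 8.4 cycles per access and the three-level design about 7.2, even though the extra level adds a third place to look. The saving comes from fewer trips to main memory.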
A common measure of the complexity of a CPU is how many individual transistors it contains. Getting millions of transistors into a package that's affordable and works properly is a significant challenge, and manufacturers spend billions of dollars on improving the process of laying transistors down on silicon. In simple terms, the smaller the spacing at which you can successfully lay down transistors, the more you can pack into a given space, and the more chips you can cut out of a single silicon wafer. This has the effect of increasing capacity while lowering overall prices.
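As a back-of-the-envelope illustration of why smaller spacing matters, the Python sketch below assumes that transistor density scales with the inverse square of the feature size. That is an idealised rule of thumb: real chips rarely shrink this cleanly. The function name `ideal_density_gain` is made up for this example.

```python
def ideal_density_gain(old_nm, new_nm):
    """Idealised gain in transistors per unit area when the feature size shrinks.

    Assumes both dimensions of every feature shrink by the same ratio,
    which is a simplification of what happens in practice.
    """
    return (old_nm / new_nm) ** 2


gain_90_to_65 = ideal_density_gain(90, 65)
gain_65_to_45 = ideal_density_gain(65, 45)
```

Under that assumption, each of these steps roughly doubles the number of transistors that fit into the same area.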
Silicon chips are etched using ultraviolet light, and the wavelength of the light -- and the effective size of the transistors -- is measured in nanometres, or billionths of a metre. Current high-end CPUs are manufactured on a 65-nanometre process, and at the time of writing, Intel was just weeks away from shipping CPUs manufactured on a 45-nanometre process. AMD's Athlon 64 X2 CPUs are manufactured on a 90-nanometre process, while AMD's Phenom CPUs are manufactured on a 65-nanometre process. You may also see process measurements quoted in microns -- a micron is a millionth of a metre (where a nanometre is a billionth), so a 90nm process is also a 0.09-micron one.
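If you come across both units, the conversion is simply a division by 1,000. A small Python sketch, with a helper name (`nm_to_microns`) invented for this example:

```python
NANOMETRES_PER_MICRON = 1000


def nm_to_microns(nanometres):
    """Convert a process size from nanometres to microns."""
    return nanometres / NANOMETRES_PER_MICRON


# The process sizes mentioned in this article.
processes_nm = [90, 65, 45]
in_microns = {nm: nm_to_microns(nm) for nm in processes_nm}
```

So the 90nm, 65nm and 45nm processes can also be written as 0.09, 0.065 and 0.045 micron.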
On the subject of processes, manufacturers will often use the same basic architecture and manufacturing process across an entire family of CPU types, and this is commonly referred to as using the same "core". Vendors typically assign codenames to these cores during development, and while they don't tend to go to market with these codenames, you'll often see them used in discussions of processors from both vendors. Intel, for example, has a large number of processors built on its Prescott core, all with a similar architectural configuration. Prescott cores are built on a 90nm process and feature a 31-stage pipeline and the SSE3 instruction set; even some of the company's dual-core processors are built on a variant of the Prescott core.