Collectively dubbed the Intel NetBurst microarchitecture, the new elements are meant to push more low-level program instructions through the chip's pipelines. The goal is to avoid the bottlenecks and other constraints that will likely cap the clock speeds achievable with the PIII's older design at around 1.3 GHz.
Kevin Krewell, senior analyst at MicroDesign Resources, says NetBurst's design is meant to speed up apps that send data in bursts, such as streaming media, MP3 playback, and video compression. "The design is a change in emphasis from typical integer performance [as in standard business apps] to media performance," he says.
Pipeline doubled for speed
Intel uses what it calls Hyper Pipelined Technology in the P4, which doubles the depth of the pipeline, the sequence of stages that processes the electronic on/off bits making up the CPU's program instructions, so more instructions can be worked on at a time. Another feature, Advanced Dynamic Execution (ADE), keeps up to three times as many instructions in the pipeline as the PIII can and makes educated guesses about which instruction branch to process. ADE acts like a chef's helper who doesn't know the exact recipe but tries to assist by fetching items the chef is likely to need. But speculative execution can hinder the chip if the system guesses wrong, says Nathan Brookwood, principal analyst at Insight 64. A wrong guess means the work already done on that branch is discarded. Such branch mispredictions are more common with office apps, he says.
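A back-of-the-envelope model makes the cost of a wrong guess concrete. The sketch below is not Intel's data: the function name, the branch frequency, the misprediction rates, the 10- and 20-stage depths, and the assumption that each wrong guess costs roughly one pipeline's worth of cycles are all illustrative.

```python
def cycles_per_instruction(base_cpi, branch_fraction, mispredict_rate, flush_penalty):
    """Average cycles per instruction once misprediction stalls are included."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty

# Hypothetical workload: one instruction in five is a branch.
BRANCH_FRACTION = 0.2

# Assumption: a wrong guess costs about as many cycles as the pipeline has stages.
for label, depth in (("10-stage pipeline", 10), ("20-stage pipeline", 20)):
    for rate in (0.02, 0.10):
        cpi = cycles_per_instruction(1.0, BRANCH_FRACTION, rate, depth)
        print(f"{label}, {rate:.0%} mispredicted: {cpi:.2f} cycles per instruction")
```

Under these assumptions the deeper pipeline loses more ground as the guesses get worse, which is why branch-heavy code stands to gain less from the design than predictable media loops do.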
Intel increased the speed of the system bus to 400 MHz, compared with 133 MHz in the fastest PIIIs. To ensure that instructions don't back up while travelling from memory to CPU, the new Execution Trace Cache holds already decoded instructions while others finish. A so-called double-pumped arithmetic logic unit runs twice as fast as the rest of the CPU, quickly calculating integer (whole-number) math. Together with the higher clock speed, it ought to speed up all applications, including word processors and spreadsheets.
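The bus figures translate into peak bandwidth with simple arithmetic. The sketch below assumes a 64-bit (8-byte) data path on both buses, a detail not stated above, and treats the quoted MHz figures as millions of transfers per second. The results are theoretical peaks, not measured throughput, and the function name is made up for illustration.

```python
BUS_WIDTH_BYTES = 8  # assumption: 64-bit data path on both buses

def peak_bandwidth_gb_per_s(transfers_mhz, width_bytes=BUS_WIDTH_BYTES):
    """Theoretical peak in GB/s, counting 1 GB as 10**9 bytes."""
    return transfers_mhz * 1e6 * width_bytes / 1e9

p4 = peak_bandwidth_gb_per_s(400)
p3 = peak_bandwidth_gb_per_s(133)
print(f"P4 bus:   {p4:.2f} GB/s")
print(f"PIII bus: {p3:.2f} GB/s")
print(f"Ratio:    {p4 / p3:.1f}x")
```

On these assumptions the P4's bus can move roughly three times as much data per second as the fastest PIII's.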
Speedier web service
Finally, the P4 introduces Streaming SIMD Extensions 2 (SSE2) -- 144 new multimedia and graphics instructions designed to speed up various applications. For example, some SSE2 instructions speed up speech processing, video, image editing, and encryption -- all the rich Web content Intel says the P4 is born to run. However, until programmers write software that takes advantage of these instructions, you won't see performance improvements. Intel touts a short but impressive list of upcoming SSE2 programs, including Shiny Entertainment's Sacrifice game, Dragon Systems' Dragon NaturallySpeaking, and Macromedia's Dreamweaver. More important, Intel says Microsoft will add SSE2 support to DirectX 8, the programming interface that serves as the lingua franca of Windows graphics.
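SIMD stands for single instruction, multiple data: one instruction operates on several values packed side by side in a wide register. The sketch below only emulates that idea in plain Python and runs no SSE2 code. It assumes 128-bit registers, and the function names and pixel values are hypothetical.

```python
REGISTER_BITS = 128  # assumed width of the registers SSE2 instructions operate on

def lanes(element_bits):
    """How many elements of a given size fit in one register."""
    return REGISTER_BITS // element_bits

def packed_add_saturate_u8(a, b):
    """Emulate one packed add of sixteen unsigned 8-bit values, clamping at 255."""
    assert len(a) == len(b) == lanes(8)
    return [min(x + y, 255) for x, y in zip(a, b)]

# Brighten sixteen hypothetical pixel values in what would be a single instruction.
pixels = [0, 16, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240]
brighter = packed_add_saturate_u8(pixels, [40] * 16)

print(lanes(8), lanes(16), lanes(32), lanes(64))
print(brighter)
```

A conventional loop would issue sixteen separate additions here. Software has to be written or recompiled to use the packed form, which is why existing programs see no gain from the new instructions.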
Analysts are reserving judgment on P4's performance vis-à-vis its current rival, AMD's 1.2-GHz Athlon paired with DDR SDRAM.
"They've come up with an architecture that clearly allows them to scale the clock rate," says Brookwood. "The challenge for Intel is to demonstrate that the clock-rate superiority translates into performance superiority."