Closeted genius of x86

Media extensions to the x86 practically create a CPU within a CPU

The outlandish requirements of gaming and media applications have not only changed the way PCs are configured, they have also driven an expansion of the x86 instruction set and on-chip registers that practically creates a CPU within a CPU (or a core within a core).

The same technology that game developers use to make virtual basketballs bounce like the real thing also does a bang-up job of cryptography and compression. The media extensions to the x86 instruction set could help push business analytics toward real-time. Searches and transforms on massive in-memory data sets would get a major speed-up from SIMD (Single Instruction, Multiple Data). Despite the fact that every living PC has SIMD extensions by now, I am sure that those extensions aren't used where they could make the greatest difference.

I understand a few of the reasons why. At the root of it, I think, is a tendency on the part of developers to think that if they're not writing games or financial analysis software, they're "not doing math." Every application does math. Developers usually rely on their preferred language's libraries to handle math for them, but the dirty secret of default language libraries is that they often make poor use of the advanced capabilities of modern CPUs.

The origins of SIMD on the x86 tainted its image. SIMD definitely came into being to serve video games and DVD playback; AMD's variety goes by the name 3DNow!, which sounds more like laundry detergent than the potential brainpower behind compute-intensive business applications.

SIMD has also been segregated in Windows developer tools and documentation. Every Windows developer should have a couple of well-chosen volumes of video game programmers' references in their libraries.

Another factor interfering with a broader use of x86 SIMD is the widespread use of dynamic languages. SIMD is strictly native fare. You can build SIMD into math libraries, as Apple and Intel have done, and then link those to dynamic code. But the overhead of wrapping individual SIMD machine language operations as functions practically nullifies the performance benefit. The greater the number of SIMD instructions that can be run before surfacing to the high-level language layer or having the OS switch to another context (which can wipe out the contents of registers used for SIMD operations), the better.

I haven't yet encountered development tools that adequately analyze ordinary C or Fortran code for opportunities to optimize its performance with SIMD. Developers have to hunt down those opportunities themselves, but they're plentiful. A contrived example would involve iterating through a huge array of price data, applying a fixed markup or discount. But there are uses for SIMD that have no apparent relation to math. I imagine using it to accelerate memory management or forensic analysis of a partially erased disk.
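To make the contrived price-array example concrete, here is a minimal sketch in C. It assumes single-precision prices and uses the SSE intrinsics from `xmmintrin.h` when the compiler reports SSE support, falling back to a plain loop otherwise. The function name `apply_markup` and the `HAVE_SSE` macro are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Use SSE only where the compiler says it is available (assumption:
   GCC/Clang define __SSE__; MSVC defines _M_X64 or _M_IX86_FP). */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Multiply every price by `factor` in place, e.g. 1.10f for a 10% markup
   or 0.90f for a 10% discount. Hypothetical helper for illustration. */
void apply_markup(float *prices, size_t n, float factor)
{
    assert(prices != NULL || n == 0);
    size_t i = 0;
#if HAVE_SSE
    /* Copy the factor into all four lanes of a 128-bit register. */
    __m128 f = _mm_set1_ps(factor);
    /* One multiply instruction handles four prices per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 p = _mm_loadu_ps(prices + i);   /* unaligned load of 4 floats */
        _mm_storeu_ps(prices + i, _mm_mul_ps(p, f));
    }
#endif
    /* Scalar loop: handles the leftover tail, or everything without SSE. */
    for (; i < n; ++i)
        prices[i] *= factor;
}
```

The whole array is processed inside a single native call, so the cost of crossing from a higher-level language into native code is paid once rather than once per SIMD instruction.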

Problems outside the digital media, gaming, and visualization realms are rarely going to make a developer snap his fingers and say, "Now that's a job for SIMD!" I think that CPU-level SIMD will give way to GPUs (graphics processing units). But to extract the greatest value from the use of GPUs as compute co-processors, developers must first learn to identify patterns of code that lend themselves to SIMD and related streaming arithmetic functionality.


Tom Yager

InfoWorld