7. Many-core chips
The major processor vendors have hit the wall with clock speed. Each new increment in clock speed draws so much additional power that any true performance boost comes with the wattage and heat output of, say, an electric iron.
So the manufacturers have switched from a single-lane autobahn to multilane highways -- that is, from a single, fast core to multiple slower cores that execute code in parallel. Breakneck speed is no longer the holy grail of computing. Instead, it's total throughput.
Chips with multiple cores consume less power, generate less heat, and complete work very efficiently. On servers, they are exactly what IT likes. Today, for example, an Intel Nehalem processor has four cores, each of which can run two threads simultaneously, so on a quad-processor system -- an inexpensive box -- 32 threads can run simultaneously. Five years ago, only mainframes and very high-end servers could deliver that kind of scalability. Today, it's run-of-the-mill.
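To make the arithmetic concrete, here is a minimal Python sketch that multiplies out the figures quoted above (four processors, four cores each, two threads per core) and then splits a job into one chunk per hardware thread. The function names and the sum-of-squares workload are illustrative assumptions, not anything from a vendor toolkit. Note also that standard CPython threads do not run CPU-bound code in parallel, so the sketch shows the shape of the partitioning rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Figures from the text: a quad-processor box, four cores per chip,
# two hardware threads per core.
SOCKETS = 4
CORES_PER_SOCKET = 4
THREADS_PER_CORE = 2


def hardware_threads(sockets, cores_per_socket, threads_per_core):
    """Number of threads the box can run at the same time."""
    return sockets * cores_per_socket * threads_per_core


def split_range(n, parts):
    """Split range(n) into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def parallel_sum_of_squares(n, workers):
    """Sum i*i for i in range(n), handing one chunk to each worker."""
    chunks = split_range(n, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(i * i for i in chunk), chunks)
        return sum(partials)


workers = hardware_threads(SOCKETS, CORES_PER_SOCKET, THREADS_PER_CORE)
total = parallel_sum_of_squares(100_000, workers)
print(workers, total)
```

The point of the design is that throughput comes from dividing the work, not from any single worker running faster; software that cannot be divided this way gains little from the extra cores.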
[ The benefits of multicore hardware depend on multithreaded software. See Andrew Binstock's Test Center article, "Windows 7 on multicore: How much faster?" ]
Multicore chips have had less impact on desktop computing, due to the lack of applications that can make good use of the parallel resources -- not to mention the lack of programmers skilled in writing multithreaded desktop software. That's changing, however, especially in workstation applications and in graphics apps aimed at power users.
The next decade will see an explosion of cores in new chips. This era, dubbed "many core" -- a term that refers to more than eight cores -- is set to break out shortly. Intel, for example, has already shown working demos of a chip from its Tera-scale project that contains 80 cores and is capable of 1 teraflop using only 62 watts of power. (To put that in perspective, note that a system capable of 18 teraflops would qualify for the current list of the top 500 supercomputers.)
Non-x86 processor vendors are also deeply involved in this fray. For example, Tilera currently sells a 16-core chip and expects to ship a 100-core monster in 2010. What will IT do with so many cores? In the case of Tilera, the chips go into videoconferencing equipment enabling multiple simultaneous video streams at HD quality. In the case of Intel, the many cores enable the company to explore new forms of computing on a single processor, such as doing graphics from within the CPU. On servers, the many-core era will enable huge scalability and provide platforms that can easily run hundreds of virtual machines at full speed.
It's clear the many-core era -- which will surely evolve into the kilo- and megacore epoch -- will enable us to perform large-scale operations with ease and at low cost, while enabling true supercomputing on inexpensive PCs.
-- Andrew Binstock
6. Solid-state drives
SSDs (solid-state drives) have been around since the last century, but recently, we've seen an explosion of new products and a dramatic drop in SSD prices. In the past, SSDs have been used primarily for applications that demand the highest possible performance. Today we're seeing wider adoption, with SSDs being used as external caches to improve performance in a range of applications. Gigabyte for gigabyte, SSDs are still a lot more expensive than disk, but they are cheaper than piling on internal server memory.
Compared to hard drives, SSDs are not only faster for both reads and writes, they also support higher transfer rates and consume less power. On the downside, SSDs have limited life spans, because each cell in an SSD supports a limited number of writes.
[ Wondering where SSDs fit into your datacenter architecture? See "Four considerations for SSD deployment." ]
There are two types of SSDs: single-level cell (SLC) and multilevel cell (MLC). SLCs are faster than MLCs and last as much as 10 times longer (and, as you might imagine, cost a lot more). Write endurance has been a big barrier to SSDs, but increasing write specs and the smarter use of built-in DRAM caches are making the value proposition more attractive. Some manufacturers increase the longevity of drives by adding more actual capacity than the stated capacity, and they use wear-leveling algorithms to spread data over the extra cells.
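The over-provisioning and wear-leveling idea can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification, not how any real drive controller works: the `ToySSD` class, its block counts, and its "pick the least-worn free block" policy are all hypothetical. It shows only that spreading writes over spare cells lowers the wear on any single cell.

```python
class ToySSD:
    """Toy drive: more physical blocks than advertised, and every write
    goes to the least-worn block that is not holding other live data."""

    def __init__(self, stated_blocks, spare_blocks):
        self.stated_blocks = stated_blocks
        # One erase counter per physical block, spares included.
        self.erase_counts = [0] * (stated_blocks + spare_blocks)
        self.mapping = {}  # logical block -> physical block

    def write(self, logical_block):
        if not 0 <= logical_block < self.stated_blocks:
            raise IndexError("logical block out of range")
        in_use = set(self.mapping.values())
        old = self.mapping.get(logical_block)
        # Candidates: unused blocks, plus the block this data already occupies.
        candidates = [
            p for p in range(len(self.erase_counts))
            if p not in in_use or p == old
        ]
        target = min(candidates, key=lambda p: self.erase_counts[p])
        self.erase_counts[target] += 1
        self.mapping[logical_block] = target


# Rewrite the same logical block 800 times on a drive with
# 4 advertised blocks and 4 hidden spares.
ssd = ToySSD(stated_blocks=4, spare_blocks=4)
for _ in range(800):
    ssd.write(0)

print(ssd.erase_counts)
```

In this model the 800 writes are shared among all eight physical blocks instead of landing on one, which is the effect the extra capacity is meant to buy.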
But the most dramatic story is pricing. A 32GB SSD has gone from over $1,000 to under $100 in the last five years, though this is still about 46 times as expensive as a SATA drive in dollars per gigabyte. As new solutions to the wear problem emerge from the lab, we expect SSD adoption to accelerate even more, as the hunger for high performance in cloud computing and other widely shared applications increases.
-- Logan Harbaugh
5. NoSQL databases
Data is flowing everywhere like never before. And the days when "SQL" and "database" were interchangeable are fading fast, in part because old-fashioned relational databases can't handle the flood of data from Web 2.0 apps.
The hottest Web sites are spewing out terabytes of data that bear little resemblance to the rows and columns of numbers from the accounting department. Instead, the details of traffic are stored in flat files and analyzed by cron jobs running late at night. Diving into and browsing this data require a way to search for and collate information, which a relational database might be able to handle if it weren't so overloaded with mechanisms to keep the data consistent in even the worst possible cases.
[ In InfoWorld's "Slacker databases break all the old rules," Peter Wayner reviews four NoSQL databases: Amazon SimpleDB, CouchDB, Google App Engine, and Persevere. ]
Sure, you can make anything fit into a relational database with enough work, but that means you're paying for all of the sophisticated locking and rollback mechanisms developed for the accounting department to keep track of money. Unless the problem requires all of the sophistication and assurance of a top-of-the-line database, there's no need to invest in that overhead, or suffer its performance consequences.
The solution? Relax the strictures and come up with a new approach: NoSQL. Basic NoSQL databases are simple key/value pairs that bind together a key with a pile of attributes. There's no table filled with blank columns and no problem adding new ad hoc tags or values to each item. Transactions are optional.
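A rough sense of the key/value model can be had from a few lines of Python. This is a sketch using an in-memory dictionary, not the API of any particular NoSQL product; the `put` and `get` helpers and the sample keys are hypothetical.

```python
# In-memory stand-in for a key/value store: each key maps to a
# free-form bag of attributes, with no schema shared between items.
store = {}


def put(key, **attributes):
    """Merge attributes into the item under `key`, creating it if needed."""
    store.setdefault(key, {}).update(attributes)


def get(key, attribute, default=None):
    """Return one attribute of an item, or `default` if it is absent."""
    return store.get(key, {}).get(attribute, default)


put("user:1001", name="alice", city="London")
put("user:1002", name="bob")
# A new ad hoc attribute on one item; nothing else has to change.
put("user:1002", tags=["beta-tester", "newsletter"])

print(get("user:1002", "tags"))
print(get("user:1002", "city"))
```

The second item never gets a `city` entry at all, which is the contrast with a relational table, where the column would exist and sit empty.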
Simple key/value pairs are just the start. Neo4J, for instance, offers a graph database that uses queries that are really routines for wandering around a network. If you want the names of the dogs of all of the friends of a friend, the query takes only a few lines to code.
The real game is keeping the features that are necessary while avoiding the ones that aren't. Project Cassandra, for instance, promises to offer consistent answers "eventually," which may be several seconds in a heavily loaded system. Neo4J requires the addition of Lucene or some other indexing package if you want to look for particular nodes by name or content, because Neo4J will only help you search through the network itself.
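The "eventually" trade-off is easier to see in a toy model. The Python sketch below is not Cassandra's replication protocol; the `Replica` class, the version numbers, and the `sync` step are simplifying assumptions meant only to show a read returning a stale answer until the replicas catch up.

```python
class Replica:
    """One copy of the data; keeps the highest-versioned value per key."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Exchange entries so both replicas hold the newest version of each key."""
    for src, dst in ((a, b), (b, a)):
        for key, (version, value) in list(src.data.items()):
            dst.write(key, value, version)


r1, r2 = Replica(), Replica()
r1.write("cart", "1 item", version=1)
sync(r1, r2)

# A new write lands on r1 only; r2 has not heard about it yet.
r1.write("cart", "2 items", version=2)
stale = r2.read("cart")

sync(r1, r2)
fresh = r2.read("cart")

print(stale, "->", fresh)
```

Between the second write and the second `sync`, a client reading from `r2` gets the old value. Accepting that window is the price of not coordinating every write across every copy.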
All of these new projects are just the latest to rediscover the speed that might be found by relaxing requirements. Look for more adjustments that relax the rules while enhancing backward compatibility and ease of use. And expect a new era of data processing like nothing we've experienced before.
-- Peter Wayner