Cloud Computing: How big is big data? IDC's answer

The Digital Universe to grow by 1.2 zettabytes this year

I came across a link to a new report from IDC called the "2010 Digital Universe Study". The report echoes what we've been telling our clients for the past year: the projections of the past few years about the growth of data significantly underestimate how much data is going to be created.

Some highlights of the report:

  • In 2010, the Digital Universe (a fancy term for all the data created by consumers and businesses on earth, including video, audio, documents, etc.) will grow by 1.2 zettabytes, or 1.2 million petabytes.
  • By 2020, the Digital Universe will be 44 times as large as it was in 2009.
  • Surprisingly, the number of objects (i.e., files that contain digital data) will increase faster than the total amount of data, due to smaller file sizes - even though lots of large video and audio files are being created, so are massive amounts of small files created by devices, sensors, etc.
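
As a rough back-of-the-envelope check on the 44x figure, the sketch below converts it into an implied average annual growth rate. It assumes steady compound growth over the 11 years from 2009 to 2020, which is my simplification rather than something the report states, and the function names are purely illustrative.

```python
import math

def implied_annual_growth(multiple: float, years: int) -> float:
    """Average yearly growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1.0 / years) - 1.0

def doubling_time_years(annual_rate: float) -> float:
    """Years needed to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)

# Figure from the report: 44x growth between 2009 and 2020.
GROWTH_MULTIPLE = 44.0
YEARS = 2020 - 2009  # assumption: 11 full years of smooth compounding

rate = implied_annual_growth(GROWTH_MULTIPLE, YEARS)
doubling = doubling_time_years(rate)

print(f"Implied annual growth: {rate:.1%}")
print(f"Implied doubling time: {doubling:.1f} years")
```

Under that assumption, the Digital Universe grows by roughly 41% a year and doubles about every two years.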

The report goes on to highlight some of the biggest issues the future torrent of data will pose:

  • Searching: How do you find a digital needle in a gigantic data haystack? Most of the data will be unstructured, which implies that new kinds of search mechanisms will be required.
  • Data Tiers: If you thought Hierarchical Storage Management was important before, imagine how necessary it will be in the face of zettabytes of data. Organizations will need a layered approach to storage, based on historical use, immediacy of need, and cost of storage.
  • Privacy and Compliance: How can the increasing requirements of privacy and compliance be controlled with so much data under management?
  • Headcount Mismatch: While the amount of data will increase 44 times, and the number of files will increase 67 times, the number of employees will increase by only 1.4 times.
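
To put the headcount mismatch in per-person terms, here is a small sketch that uses only the three multiples quoted above. The variable names are mine, and every result is a ratio relative to 2009 levels, not an absolute figure.

```python
# Growth multiples between 2009 and 2020, as quoted from the report.
DATA_GROWTH = 44.0   # total amount of data
FILE_GROWTH = 67.0   # number of files (objects)
STAFF_GROWTH = 1.4   # number of employees

# How much more each employee has to look after, relative to 2009.
data_per_employee = DATA_GROWTH / STAFF_GROWTH
files_per_employee = FILE_GROWTH / STAFF_GROWTH

# Average file size relative to 2009: data grows more slowly than the file count.
relative_file_size = DATA_GROWTH / FILE_GROWTH

print(f"Data per employee:  {data_per_employee:.1f}x the 2009 level")
print(f"Files per employee: {files_per_employee:.1f}x the 2009 level")
print(f"Average file size:  {relative_file_size:.0%} of the 2009 average")
```

In other words, each employee would be responsible for roughly 31 times as much data and about 48 times as many files, while the average file shrinks to about two-thirds of its 2009 size.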

The report notes that by 2020, much of this data will be held in cloud environments or will be "touched by cloud," which means data that transits through a cloud service or is temporarily held in a cloud application. The report estimates that perhaps 15% of all data will be held in the cloud, and that around one-third will live in or pass through the cloud. Frankly, I think that underestimates what's going to be in the cloud, for this reason:

It's clear that the growth of data is accelerating, which is to say that much of it will be created later in the 2010-2020 decade. This means that the average corporation is going to experience an increasing deluge of data: no matter how much it has already invested in storage, the required investment will keep accelerating as the decade goes on. This will require ever-increasing amounts of storage and an ever-increasing capital budget for storage devices, not to mention more headcount. There's a truism in economics that something that can't go on, won't go on. I just don't see most companies funding an ever-increasing number of storage devices and employees to manage them. Most companies can't afford the projected growth of storage, so they won't go down the road of on-site storage. Long before they reach the logical conclusion of how much capital and headcount is required to manage the increased storage, they'll turn to specialized providers who have figured out how to manage enormous amounts of storage more cost-effectively.
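
To illustrate why accelerating growth pushes most of the burden toward the end of the decade, the sketch below assumes that the report's 44x multiple compounds at a constant rate from 2009 to 2020. That smooth curve is my assumption, not the report's, and the helper name `relative_size` is hypothetical.

```python
def relative_size(year: int, base_year: int = 2009,
                  end_year: int = 2020, multiple: float = 44.0) -> float:
    """Size of the Digital Universe relative to base_year, assuming
    constant compound growth that reaches `multiple` by end_year."""
    fraction_elapsed = (year - base_year) / (end_year - base_year)
    return multiple ** fraction_elapsed

# Net growth over the whole decade versus over its last five years.
decade_growth = relative_size(2020) - relative_size(2010)
late_growth = relative_size(2020) - relative_size(2015)
late_share = late_growth / decade_growth

print(f"Share of 2010-2020 growth arriving after 2015: {late_share:.0%}")
```

On those assumptions, roughly 85% of the decade's growth lands in its second half, which is why a storage budget that looks manageable in 2010 may not stay that way.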

Another reason the report underestimates how much data will be in the cloud is that much of the data will, increasingly, originate in the cloud, because of the use of SaaS applications and the hosting of custom applications in IaaS clouds. Just as the rate of change in storage amounts will increase through the decade, so too will the number of cloud applications - which means the data associated with those apps will be created in the cloud to begin with. Another way to look at this is: what proportion of applications do you think will reside in external cloud environments by 2020? I'm betting it's significantly more than 15% of all apps.

The report then turns to privacy and compliance issues and concludes that, despite the best efforts of IT groups, the proportion of data left inadequately protected will increase throughout the next decade, because the business units that fund central IT expenditure won't make enough investment available. Unless driven by specific legal requirements (e.g., Sarbanes-Oxley) or actual data breaches, data privacy and compliance run a poor second (and a long way back) to the functionality requirements of business units. And this is not to mention the use of external cloud providers by business units, which makes sidestepping IT data security requirements even easier.

The report concludes with some predictions:

  • The increased complexity of managing digital information will be an incentive to move to cloud services.
  • Within data centers, expect continued pressure for data center automation, consolidation, and virtualization.
  • Expect more end-user self-service.
  • Expect bottlenecks in key specialties such as security, information management, advanced content management, and real-time processing.

If you work in IT, you owe it to yourself to read this report and consider its implications. I may sound like a broken record, but the future of IT is going to look a lot different than the past -- even the recent past. This report offers information to guide your strategy. We've put together some specific advice about steps you should be taking now to prepare for the very different future heading toward us. If you're interested in getting a jump start on "The Digital Universe," take a look here.

Bernard Golden is CEO of consulting firm HyperStratus, which specializes in virtualization, cloud computing and related issues. He is also the author of "Virtualization for Dummies," the best-selling book on virtualization to date.
