As 'big data' grows, IT job roles, technology must change

IT managers say 'big data' movement continues to be a problem for many companies

LAS VEGAS -- As corporate data stores continue to grow, in some cases by more than 50% a year, the expanding task of managing them and mining them for information is forcing a change in how IT workers are trained.

That was the consensus of IT managers from 24 companies who gathered at the EMC World user conference here this week to discuss so-called "big data" issues.

The discussion was sponsored by EMC's Isilon division, which manufactures clustered network-attached storage (NAS) systems used to house massive data warehouses under a single name space.

Chris McNally, a storage architect with IT hosting company Sungard, said he is helping to cross-train employees to learn how various systems fit into a larger IT ecosystem.

For example, McNally said, AIX and backup administrators have volunteered to undergo storage area network (SAN) and cloud storage training at Sungard.

"So instead of me being the storage guy who has to argue with these other guys about how this works and what we do, they're now able to make intelligent requests with regard to storage," he said. "It creates a better product in the end."

James Lowey, director of network and computer systems at genome sequencing organization Translational Genomics Research Institute (TGen), said workers in his company's traditional IT shop are required to learn how networks, operating systems and storage interact.

Lowey said mapping human genomes currently creates 2TB worth of new data every week and he expects that to grow to 10TB per week by year's end. The genomic data is used to tailor drug compounds to treat diseases specific to a person's genetic profile.

He noted that coming up with the best way to mine that data for important information continues to be a sticking point.

Looking to solve that problem, EMC last year spent more than $3 billion to acquire companies such as Isilon and data warehousing and analytics company Greenplum.

"Do I keep [data] or do I not keep it has been an age-old question," said Paul Rutherford, CTO of Isilon.

For Lowey, keeping all the data produced is an issue his company is struggling with. On one hand, there's nothing more personal than genomic data. So keeping everything means keeping everything secure for as long as you have it. But the data store also continues to be a good source of information that can be mined for its scientific value in creating custom drug treatments.

"The reason we keep everything forever is that we're not sure about what it is we have," he said. "In life sciences there's so much to learn, and so much unknown."

EMC announced here that it has ramped up programs to train and certify "data scientists." A data scientist spends his or her time determining the value of a corporation's data.

Nick Mehta, CEO of cloud storage provider LiveOffice, said data persists whether it's properly stored or not.

"For us, the issue is: How do you enable a world where you can keep everything cost-effectively? We want a way to keep everything and then make it valuable. Having all that data helps us do our jobs better," he said.

LiveOffice currently stores some 4 petabytes of data on disk and adds another 5TB to that pool each day. LiveOffice encrypts all of the data for customer safety.

LiveOffice uses data analytics tools, such as MapReduce technology like Hadoop and distributed databases like Cassandra, to mine massive data stores on Isilon arrays. That lets it search data for legal discovery and regulatory compliance requests and gain insight into customers' habits.

Stephen Martino, director of production operations at Harvard Medical School, said the time is coming when there will be a demand from corporate users for mining services.

What IT managers need is a way to track who is using what, and that is still missing from the tools vendors provide, he said.

"A researcher has no boundaries on how much they can store, even 1TB to 2TB per day. I think the biggest struggle we have is you need to gather data that spells out who in the research lab is consuming data for chargeback," he said.

Paul English, director of IT at 3TIER, which provides extensive weather data to renewable energy companies, said his IT staff had been spending hours a day in meetings to figure out where data goes and who is responsible for managing it. "We've never not been dealing with big data," he said. "We want to keep 10 or 20 years of climatological data. We have growth potential of many petabytes."

To address the data deluge, his company installed 14 Isilon NAS arrays to create an expandable pool, accessible by anyone in his company.

"Now [capacity is] delivered more as a utility," he said.

One continuing issue, the IT managers said, is data movement - migrating it to the correctly priced storage tier and keeping it as close as possible to the people using it.

"You're talking terabytes per day that you can never keep up with on operations side," Martino said. "You can never get that data from one site to another."

The solution for Harvard Medical School was to use EMC's Isilon clustered NAS array, which provided a single name space to which any group could store and access data.

Lowey said TGen must constantly move data back and forth between gene sequencing computers in Phoenix and a supercomputer in Tempe that's used to process results.

"We had a one gigabit dedicated link. That didn't last. Now we have a 10Gbit [Ethernet] link, and we're actually playing with the idea of using InfiniBand," he said.

One current quandary the IT managers all agreed on was how big data is changing the way they think about storing information. Most said they want to store everything because they don't know what its value may be to the company at some later point in time.

"We're all in the same boat," Lowey said.

Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and health care IT for Computerworld. Follow Lucas on Twitter at @lucasmearian or subscribe to Lucas's RSS feed.
