German climate researchers prepare for rising seas of data

One upgrade is leading to another for IT staff at DKRZ, as a faster supercomputer threatens to flood an existing storage system with data

It's nice to have the latest kit, but a supercomputer upgrade is about to bring the German Climate Computing Center, DKRZ, a big problem: a shortage of space.

Not space for the computer itself, but for the data it generates.

DKRZ runs climate models on its supercomputer, projecting how our planet's weather will evolve over decades or even, in some cases, hundreds of millennia, from the last ice age and into the future.

All those models generate huge volumes of data -- 40 petabytes so far -- that DKRZ archives for future reference, allowing researchers to analyze the models' output in different ways. The center also offers to store the output from climate models run by other supercomputing centers, forming a world climate studies archive drawn on by researchers around the world.

The center's supercomputer upgrade, switching from one using IBM's Power chips to an x86-based machine made by Bull, means that it will accumulate as much data every six months as it has over the whole of the last five years.

That's because it will be able to simulate atmospheric changes on a more detailed grid. "Every successful model produces four to 10 times more data," said DKRZ's Ulf Garternicht.

Garternicht leads the center's IT systems team, which is now commissioning a storage system capable of holding 500 PB or more.
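
To put those figures side by side, here is a rough back-of-envelope sketch. It rests on two assumptions that the article does not state outright: that the 40 PB archived so far is roughly what accumulated over the last five years, and that this existing data carries over into the new 500 PB system. The function and constant names are made up for illustration.

```python
# Back-of-envelope estimate using the figures quoted in the article.
# Assumption: the 40 PB archived "so far" is roughly the five-year total,
# and it is carried over into the new storage system.

ARCHIVE_SO_FAR_PB = 40.0   # data archived to date
OLD_PERIOD_YEARS = 5.0     # time taken to accumulate it (assumed)
NEW_PERIOD_YEARS = 0.5     # new machine produces as much in six months
NEW_CAPACITY_PB = 500.0    # capacity of the system being commissioned


def growth_rates(archive_pb, old_years, new_years):
    """Return (old_rate, new_rate) in PB per year."""
    return archive_pb / old_years, archive_pb / new_years


def years_until_full(capacity_pb, already_stored_pb, rate_pb_per_year):
    """Years until capacity is reached at a constant growth rate."""
    return (capacity_pb - already_stored_pb) / rate_pb_per_year


old_rate, new_rate = growth_rates(
    ARCHIVE_SO_FAR_PB, OLD_PERIOD_YEARS, NEW_PERIOD_YEARS
)
speedup = new_rate / old_rate
years_left = years_until_full(NEW_CAPACITY_PB, ARCHIVE_SO_FAR_PB, new_rate)

print(f"old rate: {old_rate:.0f} PB/year")
print(f"new rate: {new_rate:.0f} PB/year ({speedup:.0f}x)")
print(f"500 PB reached after about {years_left:.2f} years")
```

Under those assumptions the archive would grow about ten times faster than before, and a 500 PB system would fill in a little under six years, which is in the same range as the five-year contract mentioned later in the article.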

It's using a new version of the High Performance Storage System (HPSS), initially developed by IBM and the U.S. Department of Energy back in 1992. The military applications directorate of France's Atomic Energy Commission was also an early development partner, and the first applications of HPSS were for storing data from atomic weapons simulations, although it is now also used for storing data for academic research and weather forecasting.

With HPSS, "The most important thing is the scalability," said Garternicht. "We could double it in size without having to change the architecture." That means DKRZ could ultimately use it to archive an exabyte of data.

Reliability is key too. The existing data archive is also built on HPSS and, said Garternicht, "We haven't lost any data over the last five years."

With such large volumes of data, being able to shift it quickly into and out of the archive is important. As in the existing system, most of the data will be stored on tape, but to speed things up there is also a disk cache.

As with any caching system, the goal is to keep the "hot" data in cache for as long as it's needed -- no mean task with the size of the datasets used in climate modelling.
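
The idea of keeping "hot" data in cache can be sketched with a least-recently-used (LRU) policy. This is only an illustration: the article does not say which eviction policy HPSS or DKRZ actually uses, and the class, method and dataset names below are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache keyed by dataset name (illustrative)."""

    def __init__(self, capacity_pb):
        self.capacity_pb = capacity_pb
        self.used_pb = 0.0
        self._items = OrderedDict()  # name -> size in PB, oldest first

    def access(self, name, size_pb):
        """Return True on a cache hit, False if the data came from tape."""
        if name in self._items:
            self._items.move_to_end(name)  # mark as most recently used
            return True
        if size_pb > self.capacity_pb:
            return False  # too large to cache; served straight from tape
        # Evict the least recently used datasets until the new one fits.
        while self.used_pb + size_pb > self.capacity_pb:
            _, evicted_size = self._items.popitem(last=False)
            self.used_pb -= evicted_size
        self._items[name] = size_pb
        self.used_pb += size_pb
        return False

    def cached(self):
        """Names currently in cache, least recently used first."""
        return list(self._items)


cache = LRUCache(capacity_pb=5)  # 5 PB, the size quoted for the current cache
cache.access("run_a", 2)         # miss: loaded from tape
cache.access("run_b", 2)         # miss: loaded from tape
cache.access("run_a", 2)         # hit: run_a becomes most recently used
cache.access("run_c", 2)         # miss: run_b is evicted to make room
print(cache.cached())            # ['run_a', 'run_c']
```

With datasets measured in petabytes, a handful of large model runs can displace everything else in the cache, which is one reason the task is harder at this scale than in a typical caching system.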

The current system's cache holds 5 PB and can shift data in and out at a sustained rate of 3 Gbps, or 5 Gbps at peak. The new one will hold 50 PB, and will run even faster.

The first phase of that system, holding 20 PB, is ready, and when the second one is complete, "We will see at least 15 Gbps of sustained performance, reading and writing concurrently," said Garternicht. Burst speeds could reach 18 Gbps.

IBM won the five-year contract to build and service the new system after having already worked on DKRZ's first HPSS installation for five years. The center is clearly happy with the company's work, but the system could still do with some improvements, Garternicht said.

"We want IBM to keep on providing the interfaces that our users are used to," he said. The venerable FTP (File Transfer Protocol) is a given for transferring such large datasets between institutions, "but we also want more modern interfaces like S3 or Swift," he said.

S3 is Amazon Web Services' cloud-based Simple Storage Service, while Swift is the distributed object store used in the OpenStack cloud operating system.

The outlook for our future climate, it would seem, is cloudy.

Peter Sayer covers general technology breaking news for IDG News Service, with a special interest in open source software and related European intellectual property legislation. Send comments and news tips to Peter at peter_sayer@idg.com.
