Microsoft, university researchers break DNA data storage record

University of Washington and Microsoft researchers believe they have broken the record for storing and retrieving data in DNA molecules.

Researchers said the impressive part about reaching the 200MB milestone is not just how much data they were able to encode onto synthetic DNA and then decode, but also how little space that data occupied.

Once encoded, the data occupied a spot in a test tube "much smaller than the tip of a pencil," Douglas Carmean, the partner architect at Microsoft overseeing the project, said.

The DNA storage also has a half-life of 500 years, even in harsh conditions. The half-life of DNA, as with radioactive material, is the length of time it takes for half of the bonds in its strands to break, and so sets its rate of decay.
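To make the half-life figure concrete, here is a minimal sketch that assumes simple exponential decay, the same model used for radioactive material. The function name and the sample time spans are illustrative and do not come from the researchers.

```python
def fraction_intact(years: float, half_life_years: float = 500.0) -> float:
    """Fraction of strand bonds still intact after `years`.

    Assumes plain exponential decay: every half-life, half of the
    remaining bonds break.
    """
    return 0.5 ** (years / half_life_years)


for years in (100, 500, 1000):
    print(f"{years:>5} years: {fraction_intact(years):.1%} of bonds intact")
```

Under this model half of the bonds remain after 500 years and a quarter after 1,000 years.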

Overall, this is a huge step forward. "Think of the amount of data in a big data center compressed into a few sugar cubes. Or all the publicly accessible data on the Internet slipped into a shoebox. That is the promise of DNA storage -- once scientists are able to scale the technology and overcome a series of technical hurdles," Microsoft stated in a blog post.

The data stored in the DNA included digital versions of works of art (including a high-definition music video by the band OK Go!), the Universal Declaration of Human Rights in more than 100 languages, the top 100 books of Project Gutenberg and the nonprofit Crop Trust's seed database.

Researchers are turning to DNA as a storage medium because the world's data is growing exponentially and molecular-level storage is vastly more dense than hard drives, solid state drives (SSDs) or even up-and-coming technologies such as phase-change memory.

"Those systems also degrade after a few years or decades, while DNA can reliably preserve information for centuries," the University of Washington (UW) researchers stated in a news release. "DNA is best suited for archival applications, rather than instances where files need to be accessed immediately."

UW Associate Professor Luis Henrique Ceze, in blue, and research scientist Lee Organick prepare DNA containing digital data for sequencing, which allows them to read and retrieve the original files. (Photo: Tara Brown Photography/University of Washington)

The UW and Microsoft researchers are one of two teams nationwide that have also demonstrated the ability to perform random access of data from a pool of molecules, which they described as a task similar to reassembling one chapter of a story from a library of torn books.

The researchers said they developed "a novel approach" to convert the long strings of ones and zeroes in digital data into the four basic building blocks of DNA sequences -- adenine, guanine, cytosine and thymine -- represented as As, Gs, Cs and Ts.
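The article does not describe the team's "novel approach," so the following is only a sketch of the simplest possible mapping: two bits per base. The table (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for illustration.

```python
BASES = "ACGT"  # hypothetical mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases)


def decode(strand: str) -> bytes:
    """Reverse of encode(): every four bases become one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


print(encode(b"Hi"))            # CAGACGGC
print(decode(encode(b"DNA")))   # b'DNA'
```

A real encoding would also have to avoid sequences that are hard to synthesize or read, which this fixed mapping makes no attempt to do.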

The digital data is broken down into pieces and stored by synthesizing it as a massive number of tiny DNA molecules, which can be dehydrated and preserved for long-term storage.

While advances in DNA storage rely on techniques pioneered by the biotechnology industry, they also require lessons learned from information technology. For example, the Microsoft and UW team's encoding approach uses error correction schemes commonly used in computer memory.

"This is an example where we're borrowing something from nature -- DNA -- to store information. But we're using something we know from computers -- how to correct memory errors -- and applying that back to nature," said Luis Henrique Ceze, a UW associate professor of computer science and engineering and the university's principal researcher on the project.
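The article does not say which error correction scheme the team used. As an illustration of the general idea, the sketch below implements Hamming(7,4), a classic code that adds three parity bits to four data bits and can repair any single flipped bit. It stands in for the technique and is not the researchers' method; the function names are illustrative.

```python
def hamming_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits as a 7-bit codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_decode(code: list[int]) -> list[int]:
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(code)  # work on a copy so the caller's list is untouched
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
damaged = hamming_encode(word)
damaged[5] ^= 1  # simulate one bit being misread
print(hamming_decode(damaged) == word)  # True
```

The same principle, adding redundancy so that a damaged piece can be rebuilt from the rest, applies whether the errors come from a memory chip or from DNA synthesis and sequencing.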

To access the stored data, the researchers encode the equivalent of zip codes and street addresses into the DNA sequences. Polymerase chain reaction (PCR) techniques, commonly used in molecular biology, help them more easily identify the zip codes they are looking for.

Using DNA sequencing techniques, the researchers can then read the data and convert it back to a video, image or document file by using the street addresses to reorder the data.
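The zip code and street address scheme can be mimicked in software. In this sketch a list stands in for the pool of molecules, a filter stands in for PCR selection (which is a chemical process, not a lookup), and sorting stands in for reordering by street address. All names, the chunk size and the sample data are assumptions for illustration.

```python
import random
from typing import NamedTuple


class Strand(NamedTuple):
    zip_code: str   # identifies which file the piece belongs to
    address: int    # position of the piece within that file
    payload: bytes  # the piece of data itself


def store(pool: list[Strand], zip_code: str, data: bytes, chunk_size: int = 4) -> None:
    """Break data into small pieces and add each one to the pool."""
    for address, start in enumerate(range(0, len(data), chunk_size)):
        pool.append(Strand(zip_code, address, data[start:start + chunk_size]))


def retrieve(pool: list[Strand], zip_code: str) -> bytes:
    """Select one file's pieces, then reorder them by address."""
    selected = [strand for strand in pool if strand.zip_code == zip_code]
    selected.sort(key=lambda strand: strand.address)
    return b"".join(strand.payload for strand in selected)


pool: list[Strand] = []
store(pool, "video", b"music video bytes")
store(pool, "text", b"declaration text")
random.shuffle(pool)  # molecules in a test tube have no inherent order

print(retrieve(pool, "text"))  # b'declaration text'
```

Because every piece carries its own address, the file can be reassembled no matter how thoroughly the pool is mixed.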

Most of the world's data today is stored on magnetic and optical media. Tape technology has recently seen significant density improvements, with cartridges as large as 185TB, and is the densest form of storage available commercially today, at about 10GB per cubic millimeter (mm³). Recent research reported the feasibility of optical discs capable of storing 1PB, yielding a density of about 100GB/mm³. Despite this improvement, storing zettabytes of data would still take millions of units and use significant physical space.

A depiction of a DNA double helix. (Image: National Human Genome Research Institute)

DNA has a theoretical limit above one exabyte per cubic millimeter, which is eight orders of magnitude denser than tape. DNA-based storage also has the benefit of eternal relevance: As long as there is DNA-based life, there will be strong reasons to read and manipulate DNA, the researchers stated in an April research paper.
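The "eight orders of magnitude" figure can be checked with a few lines of arithmetic. This sketch assumes decimal units (1GB = 10^9 bytes, 1EB = 10^18 bytes) and reads each quoted density as bytes per cubic millimeter; the dictionary and variable names are illustrative.

```python
import math

GB = 10**9
EB = 10**18

# Densities quoted in the article, in bytes per cubic millimeter.
density = {
    "tape": 10 * GB,
    "optical": 100 * GB,
    "dna": 1 * EB,  # theoretical limit, stated as a lower bound
}

for medium in ("tape", "optical"):
    orders = math.log10(density["dna"] / density[medium])
    print(f"DNA vs. {medium}: about {orders:.0f} orders of magnitude denser")
```

Against tape the ratio is 10^18 / 10^10 = 10^8, which matches the eight orders of magnitude cited; against the proposed optical discs it is seven.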

According to the ongoing "Digital Universe" study by IDC and EMC, the amount of data is forecast to grow to over 16 zettabytes (ZB) in 2017. The Internet of Things, in large part, will be responsible for doubling digital data every two years, resulting in 44 trillion gigabytes (44ZB) by 2020.

"A significant fraction of this data is in archival form; for example, Facebook recently built an entire data center dedicated to 1 exabyte of cold storage," the scientists stated in their research paper.

Researchers have been experimenting with DNA as a data-storage medium for more than a dozen years, and the technology has progressed quickly. In 1999, DNA-based storage involved encoding and recovering just a 23-character message.

In April, Microsoft and UW researchers were able to store these three image files, which were synthesized and sequenced in DNA. (Image: Microsoft)

By 2013, scientists from U.K.-based EMBL-European Bioinformatics Institute claimed they'd encoded an MP3 version of Martin Luther King's "I Have a Dream" speech in DNA.

In April, Microsoft and UW researchers released their paper detailing how synthetic DNA could be used as a form of archival storage.

"DNA is an amazing information storage molecule that encodes data about how a living system works. We're repurposing that capacity to store digital data -- pictures, videos, documents," Ceze said. "This is one important example of the potential of borrowing from nature to build better computer systems."

Lucas Mearian

Computerworld (US)