We’re creating more data than we know what to do with. Millions of emails, texts, tweets, and YouTube videos are quickly piling up in data storage centers. So far, the answer has been to build more and more data storage warehouses across the globe, but these data centers consume immense amounts of power, take up miles of space, and are definitely not helping the climate crisis. Since data has become such a crucial part of the world’s digital economy, we can’t ignore the dilemma we’ve created for ourselves: we’re completely overwhelming our data storage systems.
The International Data Corporation (IDC) estimates that by 2025, around 175 zettabytes of data will be created every year. To give some context, the IDC estimates that to this point, all of humanity has produced a grand total of 20 zettabytes. That means that in 2025 we could be producing nearly 9 times as much data per year (175 ÷ 20 ≈ 8.75) as the whole earth has accumulated up to this point. Obviously, that amount of information isn’t easily stored on thumb drives, and it won’t simply fit in the cloud as it exists today. The next step in the data storage timeline will need to be something entirely new to adapt to an entirely new data environment.
One solution that sounds more at home in a sci-fi novel is using DNA to store digital data.
How DNA Data Storage Would Work
Every piece of digital information can be distilled into a string of ones and zeros. That’s how computers store the data before translating it back into readable information. DNA data storage would work in a similar way to that binary coding system. Just like binary data, DNA is built from strings of more basic units. Instead of ones and zeros, DNA uses four nucleotide bases: A, G, T, and C. Because each position in this quaternary code has four options rather than two, a single base can carry two bits, so you could store more information in a smaller amount of space. A much smaller space.
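To make the binary-to-quaternary idea concrete, here is a minimal Python sketch. The specific mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration. Real encoding schemes are more involved and typically add error correction, so treat this as a toy model rather than how any lab actually does it.

```python
# Hypothetical mapping of two bits to one base, chosen only for illustration.
# Real DNA storage schemes add error correction and other safeguards.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of A/C/G/T, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): turn a string of bases back into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Hi"
strand = encode(message)
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Two bytes of text become eight bases, since each base stands in for two bits.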
A single gram of DNA can store around 215 petabytes of data. To put that number into perspective, one petabyte is one million gigabytes. One teaspoon (about 4.2 grams) of DNA could store 903 petabytes, which means a single teaspoon of DNA falls just short of holding as much data as the new 62,000-square-foot Facebook data center, which holds 1,000 petabytes. So yes, the potential of DNA data storage drastically eclipses any and all current alternatives.
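The arithmetic behind those figures can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the constants come straight from the numbers quoted above, the variable names are made up for illustration, and decimal units (1 petabyte = 1,000,000 gigabytes) are assumed.

```python
# Figures quoted in the article; names are illustrative only.
PB_PER_GRAM = 215          # petabytes stored per gram of DNA
GB_PER_PB = 1_000_000      # decimal units assumed
TEASPOON_PB = 903          # petabytes in one teaspoon of DNA
DATA_CENTER_PB = 1000      # capacity of the Facebook data center

# How many gigabytes fit in one gram.
gb_per_gram = PB_PER_GRAM * GB_PER_PB

# Grams of DNA implied by the teaspoon figure.
grams_per_teaspoon = TEASPOON_PB / PB_PER_GRAM

# Grams of DNA needed to match the whole data center.
grams_for_data_center = DATA_CENTER_PB / PB_PER_GRAM

print(f"{gb_per_gram:,} GB per gram")
print(f"{grams_per_teaspoon:.2f} g per teaspoon")
print(f"{grams_for_data_center:.2f} g to match the data center")
```

At these rates, the 903-petabyte teaspoon works out to roughly 4.2 grams of DNA, and a little under 5 grams would match the data center.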
DNA Data Storage Roadblocks
Even though it seems like the perfect solution to our data storage problem, research and development is still in its infancy. One of the biggest challenges scientists face is cost. DNA itself can be replicated fairly cheaply, but you need expensive technology to encode data into it. Chemically synthesizing the DNA costs upwards of $3,500 for just one megabyte of information.
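To get a feel for how that price scales, here is a rough Python sketch. It assumes the cost stays flat at $3,500 per megabyte, which is a simplification: the article gives that figure as a lower bound ("upwards of"), and real prices would likely shift with volume. The function name is hypothetical.

```python
COST_PER_MB_USD = 3_500   # lower-bound figure quoted in the article
MB_PER_GB = 1_000         # decimal units assumed


def synthesis_cost_usd(megabytes: float) -> float:
    """Estimate synthesis cost, assuming a flat per-megabyte price."""
    return megabytes * COST_PER_MB_USD


print(f"1 MB: ${synthesis_cost_usd(1):,.0f}")
print(f"1 GB: ${synthesis_cost_usd(MB_PER_GB):,.0f}")
```

Under that assumption, a single gigabyte would cost at least $3.5 million to write, which shows why cost is the first roadblock on the list.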
That’s of course a hurdle for scientists, but it can also be a positive when it comes to data security. Hacking private data stored in DNA would be next to impossible without the proper machinery, especially since the DNA would most likely be stored in a liquid or powder format.
Another roadblock to making DNA storage a large-scale solution is that the encoding and decoding process still depends heavily on people. To encode data into DNA’s quaternary language, and then convert it back into data that computers can read, humans currently have to be involved in every step. So far, scientists have only been able to encode small pieces of data, but to use this storage system on a global scale, we’ll have to figure out how to automate the entire process.
Research Labs Are Still Making Progress
Even though some large issues still need to be solved before DNA becomes a viable data storage solution, research labs are hard at work on them. Microsoft, along with the University of Washington, has successfully encoded a wide variety of media onto DNA. The Universal Declaration of Human Rights in more than 100 different languages, an OK Go music video, and Project Gutenberg’s top 100 books have all been translated into DNA language.
It’s important to note that current research has only been done on cold data, that is, archival data that isn’t accessed very often. Eventually, scientists will need to figure out how to do the same thing with hot data, the real-time and frequently accessed kind. That will be the real test in deciding whether DNA data storage could be a practical global storage option.
Data Storage’s Future
Big Data is baked into every industry and impacts how businesses and people learn, communicate, and live their daily lives. As technology advances and more data is created, we need to find innovative ways to store that information before it is lost. There are still cost and scalability problems that need to be resolved before DNA data storage becomes usable outside of a lab setting, but scientists and engineers are hopeful that it could be the solution the world is looking for.