In the search for the perfect data-storage solution, scientists have been exploring creative ways to make use of nature’s original information-recording technology: DNA.
By translating the binary code of computers into the four-letter chemical code of genomics, they have discovered that entire libraries of digital information — everything from books to movies to Facebook posts — can be uploaded onto minuscule drops of synthetic DNA and later retrieved. Many experts predict that this approach could one day offer a cost-effective, environmentally friendly alternative to traditional silicon-based storage technology, which has led to the proliferation of enormous electricity-guzzling data centers around the world.
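To make the idea of translating binary into DNA's four-letter code concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, in which each pair of bits maps to one base (A, C, G, or T). The mapping table and the function names `encode` and `decode` are hypothetical and chosen for illustration only. This is not the encoding used by the Columbia team, and real schemes typically add error correction and avoid problematic sequences such as long runs of a single base.

```python
# Hypothetical 2-bits-per-base mapping, for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn raw bytes into a string of DNA bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(dna: str) -> bytes:
    """Reverse the mapping: read bases back into bits, then into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # the DNA "spelling" of the two-byte message
    print(decode(strand))  # the original bytes, recovered
```

Under this assumed scheme, every byte becomes four bases, so a one-megabyte file would correspond to roughly four million bases before any redundancy is added.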
One problem that researchers working in this area have confronted, however, is that synthetic DNA tends to deteriorate over time. But now a team of Columbia scientists believes it has found a solution: encoding digital information straight into the genomes of living E. coli cells, which, the researchers say, preserves the data in a surprisingly stable, robust manner.
“That a living cell could provide a more stable environment for storing data may seem counterintuitive, but a cell actually possesses sophisticated mechanisms for maintaining the integrity of its DNA and quickly correcting any genetic errors that may occur as a result of radiation, toxins, or other exposures,” says Harris H. Wang, an assistant professor of systems biology at Columbia University Irving Medical Center, who led the research. “Our approach exploits a cell’s natural quality-assurance measures.”
And by spreading individual bits of data over vast stretches of the E. coli genome, Wang’s team has demonstrated that the information gets safely passed down through successive generations of cells, even if mutations occur during cellular reproduction. “We’ve conducted experiments to show that the data is well preserved across hundreds of generations,” he says. “For all intents and purposes, this appears to be a reliable means of saving data permanently.”
Wang and his colleagues, who used the popular gene-editing technology CRISPR-Cas to encode data into non-pathogenic strains of E. coli, are now working to improve the speed at which they can upload data to the bacteria and retrieve it, since the process is currently far too slow for commercial use. But eventually, Wang says, it may be possible to replace towering stacks of computer hard drives with a population of E. coli cells that could fit in a test tube.
“The most important advantage in storing digital information in DNA is that it’s a medium that’s never going to go obsolete,” he says. “The double helix is always going to be the ideal storage technology, and our capacity to manipulate and read DNA is only going to improve.”