An easier-to-use technique for storing data in DNA is inspired by our cells
The new method, published in Nature last week, is more efficient, storing 350 bits at a time by encoding strands in parallel. Rather than hand-threading each DNA strand, the team assembles strands from pre-built DNA bricks about 20 nucleotides long, encoding information by altering some and not others along the way. Peking University’s Long Qian and team got the idea for such templates from the way cells share the same basic set of genes but behave differently in response to chemical changes in DNA strands. “Every cell in our bodies has the same genome sequence, but genetic programming comes from modifications to DNA. If life can do this, we can do this,” she says.
Qian and her colleagues encoded data through methylation, a chemical reaction that switches genes on and off by attaching a methyl group, a small chemical unit related to methane. Once the bricks are locked into their assigned spots on the strand, researchers select which bricks to methylate, with the presence or absence of the modification standing in for binary values of 0 or 1. The information can then be deciphered using nanopore sequencers to detect whether a brick has been methylated. In theory, the new method is simple enough to be carried out without detailed knowledge of how to manipulate DNA.
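The brick-by-brick encoding described above can be pictured with a minimal Python sketch. It is an illustration only, not the team's software: the function names are hypothetical, and the choice that a methylated brick means 1 and an unmethylated brick means 0 is an assumption, since the article does not say which way the mapping runs.

```python
def bits_to_methylation(bits: str) -> list[bool]:
    """Map each bit to a brick state (assumed: '1' = methylated, '0' = unmethylated)."""
    if set(bits) - {"0", "1"}:
        raise ValueError("bits must contain only '0' and '1'")
    return [bit == "1" for bit in bits]


def methylation_to_bits(pattern: list[bool]) -> str:
    """Recover the bit string from the per-brick states reported by a sequencer."""
    return "".join("1" if methylated else "0" for methylated in pattern)


# Four bricks in fixed positions; only the middle two are methylated.
pattern = bits_to_methylation("0110")
recovered = methylation_to_bits(pattern)
```

Because every brick sits in an assigned spot, the position of a brick on the strand tells the reader which bit it carries, and only its methylation state has to be detected.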
The storage capacity of each DNA strand tops out at roughly 70 bits. For larger files, researchers split the data across multiple strands, each identified by a unique barcode encoded in the bricks. The strands were then read simultaneously and put back in order according to their barcodes. With this technique, researchers encoded the image of a tiger rubbing from the Han dynasty, troubleshooting the encoding process until the image came back with no errors. The same process worked for more complex images, like a photorealistic print of a panda.
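As a rough illustration of the barcode idea, the sketch below splits a bit string into chunks of 70 bits, tags each chunk with a number, and rebuilds the original even when the chunks arrive out of order. The integer barcodes, the function names, and the exact 70-bit chunk size are assumptions made for the example; the article says only that capacity is roughly 70 bits per strand and that barcodes are encoded in the bricks.

```python
import random

STRAND_BITS = 70  # approximate per-strand capacity reported in the article


def split_into_strands(bits: str, strand_bits: int = STRAND_BITS) -> list[tuple[int, str]]:
    """Cut the payload into chunks and pair each chunk with an integer barcode."""
    return [
        (start // strand_bits, bits[start:start + strand_bits])
        for start in range(0, len(bits), strand_bits)
    ]


def reassemble(strands: list[tuple[int, str]]) -> str:
    """Sort chunks by barcode and join them back into one bit string."""
    return "".join(chunk for _, chunk in sorted(strands))


payload = "10" * 100                 # a 200-bit example payload
strands = split_into_strands(payload)
random.shuffle(strands)              # strands are read together, in no particular order
restored = reassemble(strands)
```

A 200-bit payload ends up on three strands here, two full ones and one holding the remaining 60 bits, and the barcodes are what let the reader restore the order after a pooled read.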
To gauge the real-world applicability of their approach, the team enlisted 60 students from diverse academic backgrounds, not just the sciences, to encode any writing of their choice. The volunteers transcribed their writing into binary code through a web server. Then, with a kit sent by the team, they pipetted an enzyme into a 96-well plate of the DNA bricks, marking which would be methylated. The team then ran the samples through a sequencer to read the DNA strands. Once the computer received the sequence, researchers ran a decoding algorithm and sent the restored message back to a web server for students to retrieve with a password. The writing came back with a 1.4% error rate in letters, and the errors were eventually corrected through language-learning models.
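To make the text-to-binary step and the letter error rate concrete, here is a small Python sketch. It assumes plain UTF-8 with eight bits per byte, which the article does not specify, and the function names are hypothetical; the team's web server may well use a different scheme.

```python
def text_to_bits(text: str) -> str:
    """Encode text as a bit string, eight bits per UTF-8 byte (assumed scheme)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Decode a bit string back to text, replacing any bytes that no longer decode."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8", errors="replace")


def letter_error_rate(original: str, restored: str) -> float:
    """Fraction of character positions at which the restored text differs."""
    if len(original) != len(restored):
        raise ValueError("texts must have the same length")
    if not original:
        return 0.0
    wrong = sum(a != b for a, b in zip(original, restored))
    return wrong / len(original)


bits = text_to_bits("DNA")
roundtrip = bits_to_text(bits)
rate = letter_error_rate("tiger", "tiler")  # one wrong letter out of five
```

By this measure, a 1.4% error rate corresponds to about 14 wrong letters in every 1,000.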
Once it’s more thoroughly developed, Qian sees the technology becoming useful as long-term storage for archival information that isn’t accessed every day, like medical records, financial reports, or scientific data.
The success nonscientists achieved using the technique in coding trials suggests that DNA storage could eventually become a practical technology. “Everyone is storing data every day, and so to compete with traditional data storage technologies, DNA methods need to be usable by the everyday person,” says Jeff Nivala, co-director of the University of Washington’s Molecular Information Systems Lab. “This is still an early demonstration of going toward nonexperts, but I think it’s pretty unique that they’re able to do that.”
DNA storage still has many strides left to make before it can compete with traditional data storage. The new system is more expensive than either traditional data storage techniques or previous DNA-synthesis methods, Nivala says, though the encoding process could become more efficient with automation on a larger scale. With future development, template-based DNA storage might become a more secure method of tackling ever-climbing data demands.