#134 DNA: Heralding a New Era of Data Storage!

Emra | November 25, 2024 | 5 min read

Azizul Haque is an expert scientist working in the fields of protein research and botany. He obtained his PhD from the University of North Carolina at Chapel Hill, USA, where he researched genetic factors related to plants’ ability to withstand environmental stress. He is currently conducting research at Howard University, focusing on the impact of the RACK1 protein in plant and cancer studies. His research has resulted in many successful innovations and patents. Read below for a detailed discussion.

The Dawn of a New Era in Data Storage:

The importance of data storage is immense in the history of human civilization. From ancient clay pots, stone tablets, and papyrus to modern cloud storage, the methods of storing information have evolved over the ages. The journey that began with the invention of the punch card in 1890 moved forward with various technological advancements—magnetic drum in 1932, Williams-Kilburn tube in 1947, magnetic tape drive and magnetic core in 1951, hard disk drive (HDD) in 1956, floppy disk in 1967, compact disk (CD) in 1982, zip drive in 1994, and the digital video disk (DVD) in 1995. These devices were large in size and could store relatively little data.

Today's hard drives and memory cards keep getting physically smaller, but the amount of information is growing so rapidly that available storage capacity will soon be exceeded. Managing the volume of digital data produced worldwide each day may become a major challenge. Data has grown so quickly since 2020 that by 2040 it is projected to reach nearly 160 zettabytes (1 zettabyte = 1 billion terabytes = 1 trillion gigabytes). Storing such a massive amount of information will require new solutions beyond current technologies.

Seeking answers to this challenge, scientists have turned to our own genetic material: DNA. DNA (deoxyribonucleic acid) preserves the biological characteristics of our bodies and has a tremendous capacity to store information. As a storage medium, DNA is both ancient and robust, capable of withstanding nature’s toughest tests. Just 1 gram of DNA can store 215 petabytes (1 petabyte = 1,000 terabytes) of data. In comparison, a modern hard disk stores only a few terabytes; DNA occupying the same physical space could store many times more. Additionally, the technical limitations of conventional media can cause data loss, whereas DNA can preserve information accurately for thousands of years. By analyzing fossils recovered from ancient ice, scientists have shown that DNA can remain intact for millennia.
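To put the density figure in perspective, the short calculation below estimates how much DNA would be needed to hold the 160 zettabytes projected for 2040. It is a rough sketch that takes the quoted 215 petabytes per gram at face value, uses decimal units, and ignores any overhead for redundancy or indexing. The function name `grams_of_dna_needed` is only illustrative.

```python
# Decimal (SI) storage units, in bytes.
PETABYTE = 10**15
ZETTABYTE = 10**21

# Density figure quoted in the article: 215 petabytes per gram of DNA.
BYTES_PER_GRAM = 215 * PETABYTE


def grams_of_dna_needed(total_bytes: int) -> float:
    """Return the mass of DNA (in grams) needed to hold total_bytes.

    Assumes the quoted density is reached in practice and ignores
    overhead for redundancy, indexing, or error correction.
    """
    return total_bytes / BYTES_PER_GRAM


if __name__ == "__main__":
    projected = 160 * ZETTABYTE  # projection for 2040 quoted in the article
    grams = grams_of_dna_needed(projected)
    print(f"{grams / 1000:.0f} kg of DNA")
```

Under these assumptions the answer comes out to roughly 744 kilograms of DNA for the entire projected volume of data.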

To understand DNA’s storage capacity, we need to know its fundamental components. DNA is the carrier and transmitter of hereditary information. A DNA molecule consists of two long strands intertwined with each other. When this thread-like, coiled DNA is wrapped around proteins called histones and tightly packed, it forms structures known as chromosomes, which reside inside the cell nucleus. DNA contains all of life’s characteristics and passes them down through generations.

DNA is fundamentally composed of four nitrogenous bases (A, G, C, T), which are linked together in a chain to encode genetic information. A single complete human cell contains more than three billion nucleotides, which together form its DNA. The human body contains approximately 37.2 trillion cells in total. Because these bases are arranged in a specific sequence, binary data (0s and 1s) can be mapped onto them, which in principle allows digital data to be stored in DNA.
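The link between four bases and binary data can be made concrete: with four possible symbols, each nucleotide can carry at most log2(4) = 2 bits. The sketch below applies that upper bound to the roughly three billion nucleotides mentioned above. It is an idealized figure and the function names are illustrative; practical schemes store less per base because of error correction and sequence constraints.

```python
import math

BASES = "AGCT"  # the four DNA bases named in the text


def max_bits_per_base(alphabet: str = BASES) -> float:
    """Upper bound on information per symbol: log2 of the alphabet size."""
    return math.log2(len(alphabet))


def raw_capacity_bytes(num_bases: int) -> float:
    """Idealized capacity, in bytes, of a strand with num_bases nucleotides."""
    return num_bases * max_bits_per_base() / 8


if __name__ == "__main__":
    genome_length = 3_000_000_000  # "more than three billion nucleotides"
    print(f"{raw_capacity_bytes(genome_length) / 10**6:.0f} MB")
```

By this estimate, a three-billion-nucleotide sequence corresponds to about 750 megabytes of raw data.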

To store data in DNA, digital information must first be converted into binary (0 and 1). Then, these bits are translated, two at a time, into DNA’s bases A, G, C, and T. For example, if A = 00, G = 01, C = 10, and T = 11, then the string 0111101011001101 would be converted to GTCCTATG. Using this process, any digital information can be stored in DNA.
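The mapping in this example can be sketched in a few lines of Python. The code below uses exactly the assignment given above (A = 00, G = 01, C = 10, T = 11); the function names and the UTF-8 text helpers are illustrative additions, not part of any specific published scheme.

```python
# Mapping taken from the example in the text: A = 00, G = 01, C = 10, T = 11.
BITS_TO_BASE = {"00": "A", "01": "G", "10": "C", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bits_to_dna(bits: str) -> str:
    """Translate a string of 0s and 1s into DNA bases, two bits per base."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    if set(bits) - {"0", "1"}:
        raise ValueError("bit string may contain only '0' and '1'")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bits(dna: str) -> str:
    """Translate a DNA sequence back into its bit string."""
    return "".join(BASE_TO_BITS[base] for base in dna)


def text_to_dna(text: str) -> str:
    """Encode text as UTF-8 bytes, then as DNA bases (4 bases per byte)."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    return bits_to_dna(bits)


def dna_to_text(dna: str) -> str:
    """Decode a DNA sequence produced by text_to_dna back into text."""
    bits = dna_to_bits(dna)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    print(bits_to_dna("0111101011001101"))  # the example string from the text
    print(dna_to_text(text_to_dna("DNA")))  # round trip through the encoding
```

Running it on the example string reproduces GTCCTATG. Practical encoding schemes are more involved than this direct mapping: they typically add redundancy for error correction and avoid sequences that are hard to synthesize or read, such as long runs of the same base.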

Although this process sounds simple in theory, it is extremely complex in practice. First, DNA sequencing is still a very expensive procedure. In addition, to retrieve specific data, the entire DNA sequence must be processed, which is time-consuming and difficult. Moreover, the algorithms and decoding processes needed for DNA data storage are still under development.

The idea of storing data in DNA may seem new, but research in this field has been ongoing for decades. The first proposal to use DNA as a data storage medium came in 1960. Because of DNA’s high data density and long-term durability, scientists began considering it as an information storage medium. However, due to technological limitations of the time, it took a while to bring this concept to life.

Significant advancements in DNA sequencing and synthesis technologies occurred in the 1990s. In 1999, Richard Hammersley tested the idea of storing small pieces of information in DNA. Around this time, scientists began researching how to store information by mapping binary code onto DNA’s four bases (A, T, C, G). That same year, scientists in New York were the first to store a short letter in DNA and successfully decode it again. In 2009, Canadian researchers stored an image, text, and audio file of a children’s rhyme (totaling 200 bytes) in DNA.

In 2012, Harvard University scientist George Church and his team became the first to store an entire book in DNA. They encoded a 5.27-megabit book (roughly 0.66 megabytes) in DNA and retrieved it. This was a significant milestone for data storage in DNA. In the years that followed, a new method called DNA Fountain was developed, which enhanced efficiency in data storage.

In 2016, researchers from Microsoft and the University of Washington converted 200 megabytes of digital data into DNA, setting a new record. In 2017, scientists from Columbia University and the New York Genome Center successfully stored an entire operating system and several other files in DNA using the DNA Fountain technique. At this time, the cost of storing data in DNA also began to decrease significantly, expanding its practical possibilities.

Today, more advanced coding and decoding methods, automated DNA synthesis and sequencing techniques, and improved error-correction technologies are being used in DNA data storage. As a result, DNA information storage is now much easier and more accurate than ever before. Major technology companies like Microsoft, Illumina, and Twist Bioscience are working on various projects to advance DNA data storage.

Recently, young California researchers Nathaniel Roque and Hyunjoon Park have been working on a new method for storing data in DNA. They have founded a startup aiming to make DNA synthesis and sequencing even more affordable. In their process, pre-fabricated DNA is created, which is then encoded with messages using enzymes. This method allows the process to be completed at a much lower cost. They have already successfully accomplished the first kilobyte encoding.

Currently, DNA storage technology remains much more expensive and slower compared to conventional data storage systems. While storing data in DNA is still in its early stages, in the near future it could become a vital medium for storing all our digital information. If this DNA-based data storage technology can be effectively implemented, it could become a revolutionary solution for storing massive volumes of data, ushering in a safer and more advanced digital future!

On behalf of Scientist Org, we extend our best wishes to Azizul Haque. He is a talented researcher and plant scientist, pioneering new horizons in modern technology. His outstanding work and contributions have brought him much acclaim. We hope his research will lead to new technological advancements and bring positive change to society. Thank you, Azizul Haque, and best wishes for even greater success in the future!