On Models and Capacity Bounds for DNA-based Storage Channels

2021 ◽  
Author(s):  
Zihui Yan ◽  
Cong Liang

In recent years, DNA-based systems have become a promising medium for long-term data storage. There are two layers of errors in DNA-based storage systems. The first is the dropout of DNA strands, which has been characterized by the shuffling-sampling channel. The second is insertions, deletions, and substitutions of nucleotides in individual DNA molecules. In this paper, we describe a DNA noisy synchronization error channel to characterize the errors in individual DNA molecules. We derive non-trivial lower and upper capacity bounds for this channel based on information theory. By cascading the two channels, we provide theoretical capacity limits for the DNA storage system. These results reaffirm that DNA is a reliable storage medium with high storage density potential.


2021 ◽  
Author(s):  
Min Li ◽  
Junbiao Dai ◽  
Qingshan Jiang ◽  
Yang Wang

Current research on DNA storage usually focuses on improving storage density and reducing gene synthesis cost by developing effective encoding and decoding schemes, while neglecting the uncertainty in ultra-long-term data storage and retention. Consequently, current DNA storage systems are often not self-contained, implying that they have to resort to external tools to restore the stored gene data. This may result in a high risk of data loss, since the required tools might not be available given the high uncertainty of the far future. To address this issue, we propose in this paper a self-contained DNA storage system that makes its stored data self-explanatory without relying on any external tools. To this end, we design a specific DNA file format whereby a separate storage scheme is developed to reduce data redundancy, while an effective indexing scheme is designed for random read operations on the stored data file. We verified through experimental data that the proposed self-contained and self-explanatory method can not only eliminate the reliance on external tools for data restoration but also minimize the data redundancy introduced once the amount of data to be stored reaches a certain scale.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Min Li ◽  
Jiashu Wu ◽  
Junbiao Dai ◽  
Qingshan Jiang ◽  
Qiang Qu ◽  
...  

Current research on DNA storage usually focuses on improving storage density by developing effective encoding and decoding schemes, while neglecting the uncertainty in ultra-long-term data storage and retention. Consequently, current DNA storage systems are often not self-contained, implying that they have to resort to external tools to restore the stored DNA data. This may result in a high risk of data loss, since the required tools might not be available given the high uncertainty of the far future. To address this issue, we propose in this paper a self-contained DNA storage system that makes its stored data self-explanatory without relying on any external tool. To this end, we design a specific DNA file format whereby a separate storage scheme is developed to reduce data redundancy, while an effective indexing scheme is designed for random read operations on the stored data file. We verified through experimental data that the proposed self-contained and self-explanatory method can not only eliminate the reliance on external tools for data restoration but also minimise the data redundancy introduced once the amount of data to be stored reaches a certain scale.


2020 ◽  
Vol 56 (25) ◽  
pp. 3613-3616 ◽  
Author(s):  
A. Xavier Kohll ◽  
Philipp L. Antkowiak ◽  
Weida D. Chen ◽  
Bichlien H. Nguyen ◽  
Wendelin J. Stark ◽  
...  

Mimicking fossil bone, a storage system involving earth alkali salts enables the preservation of digital data in DNA.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Kyle J. Tomek ◽  
Kevin Volkel ◽  
Elaine W. Indermaur ◽  
James M. Tuck ◽  
Albert J. Keung

DNA holds significant promise as a data storage medium due to its density, longevity, and resource and energy conservation. These advantages arise from the inherent biomolecular structure of DNA, which differentiates it from conventional storage media. The unique molecular architecture of DNA storage also prompts important discussions on how data should be organized, accessed, and manipulated, and what practical functionalities may be possible. Here we leverage thermodynamic tuning of biomolecular interactions to implement useful data access and organizational features. Specific sets of environmental conditions, including distinct DNA concentrations and temperatures, were screened for their ability to switchably access either all DNA strands encoding full image files from a GB-sized background database or subsets of those strands encoding low-resolution "File Preview" versions. We demonstrate File Preview with four JPEG images and provide an argument for the substantial and practical economic benefit of this generalizable strategy to organize data.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Philipp L. Antkowiak ◽  
Jory Lietard ◽  
Mohammad Zalbagi Darestani ◽  
Mark M. Somoza ◽  
Wendelin J. Stark ◽  
...  

Due to its longevity and enormous information density, DNA is an attractive medium for archival storage. The current bottleneck of DNA data storage systems, in both cost and speed, is synthesis. The key idea for breaking this bottleneck pursued in this work is to move beyond the low-error and expensive synthesis employed almost exclusively in today’s systems, towards cheaper, potentially faster, but high-error synthesis technologies. Here, we demonstrate a DNA storage system that relies on massively parallel light-directed synthesis, which is considerably cheaper than conventional solid-phase synthesis. However, this technology has a high sequence error rate when optimized for speed. We demonstrate that even in this high-error regime, reliable storage of information is possible, by developing a pipeline of algorithms for encoding and reconstruction of the information. In our experiments, we store a file containing sheet music of Mozart, and show perfect data recovery from low synthesis fidelity DNA.


2001 ◽  
Vol 19 (7) ◽  
pp. 247-250 ◽  
Author(s):  
Jonathan P.L Cox