Comparative Study of a DNA Sequence Storage Technique: An Improved Coding Method for Information Storage of DNA from Characters to Numbers

Author(s):  
Crystal Cob ◽  
Soo-Yeon Ji
2016 ◽  
Author(s):  
Ian Holmes

Abstract: We describe a strategy for constructing codes for DNA-based information storage by serial composition of weighted finite-state transducers. The resulting state machines can integrate correction of substitution errors; synchronization by interleaving watermark and periodic marker signals; conversion from binary to ternary, quaternary or mixed-radix sequences via an efficient block code; encoding into a DNA sequence that avoids homopolymer, dinucleotide, or trinucleotide runs and other short local repeats; and detection/correction of errors (including local duplications, burst deletions, and substitutions) that are characteristic of DNA sequencing technologies. We present software implementing these codes, available at https://github.com/ihh/dnastore, with simulation results demonstrating that the generated DNA is free of short repeats and can be accurately decoded even in the presence of substitutions, short duplications and deletions.
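To make the binary-to-ternary and repeat-avoidance steps concrete, here is a minimal sketch of a much simpler scheme than the one in the abstract: a rotating ternary code, in which each base-3 digit selects one of the three bases that differ from the previous base. It is not the transducer composition described above and does not use the dnastore software; every function name and the fixed six-trits-per-byte layout are assumptions made for illustration. It rules out homopolymer runs only, not dinucleotide or trinucleotide repeats, and it has no error correction.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # assumption: fixed width, since 3**6 = 729 >= 256


def bytes_to_trits(data):
    """Convert each byte to six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits):
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for t in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)


def trits_to_dna(trits, prev="A"):
    """Each trit picks one of the three bases different from the previous one."""
    out = []
    for t in trits:
        choices = [b for b in BASES if b != prev]
        prev = choices[t]
        out.append(prev)
    return "".join(out)


def dna_to_trits(dna, prev="A"):
    """Inverse of trits_to_dna; raises ValueError on a repeated base."""
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits


def encode(data):
    return trits_to_dna(bytes_to_trits(data))


def decode(dna):
    return trits_to_bytes(dna_to_trits(dna))
```

Encoding a zero byte with this sketch yields `CACACA`, which contains no homopolymer but is itself a dinucleotide repeat. That is the kind of short local repeat the codes in the abstract are designed to exclude and this simple scheme is not.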


2019 ◽  
Vol 26 (7) ◽  
pp. 2159-2172
Author(s):  
Syed Mahamud Hossein ◽  
Debashis De ◽  
Pradeep Kumar Das Mohapatra

2018 ◽  
Author(s):  
Yeongjae Choi ◽  
Taehoon Ryu ◽  
Amos C. Lee ◽  
Hansol Choi ◽  
Hansaem Lee ◽  
...  

Introductory paragraph: DNA-based data storage has emerged as a promising method to satisfy the exponentially increasing demand for information storage. However, practical implementation of DNA-based data storage remains a challenge because of the high cost of DNA per unit data. Here, we propose the use of eleven degenerate bases as encoding characters in addition to A, C, G, and T, which increases the information capacity (the amount of data that can be stored per length of DNA sequence designed) and reduces the cost of DNA per unit data. Using the proposed method, we experimentally achieved an information capacity of 3.37 bits/character, which is more than twice the highest information capacity previously achieved. Finally, the platform was projected to reduce the cost of DNA-based data storage by 50%.
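The capacity figures above can be put in context with a short calculation. An alphabet of n equally usable characters carries at most log2(n) bits per character, so the sketch below compares the four standard bases with the fifteen-character alphabet from the abstract. The helper name is hypothetical, and the calculation is only a theoretical ceiling: it ignores error-correction overhead and any sequence constraints, which is one reason an experimental figure such as 3.37 bits/character sits below it.

```python
import math


def max_bits_per_character(alphabet_size):
    """Upper bound on information per character for a given alphabet size."""
    return math.log2(alphabet_size)


standard = max_bits_per_character(4)         # A, C, G, T
extended = max_bits_per_character(4 + 11)    # plus eleven degenerate bases

print(f"4 characters:  {standard:.2f} bits/character")
print(f"15 characters: {extended:.2f} bits/character")
```

Under these assumptions the ceiling rises from 2 bits/character to roughly 3.91 bits/character, so the reported 3.37 bits/character is above what any four-letter code could reach.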

