Combinatorial constraint coding based on the EORS algorithm in DNA storage

The development of information technology has produced massive amounts of data, posing severe challenges for information storage. Traditional electronic storage media cannot keep up with the ever-increasing demand for data storage, and DNA has emerged as a feasible alternative with high density, large storage capacity, and strong durability. In DNA data storage, many different approaches can be used to encode data into codewords. DNA coding is a key step in DNA storage and directly affects storage performance and data integrity. However, because DNA synthesis and sequencing are error-prone and non-specific hybridization readily occurs in solution, encoding DNA effectively has become an urgent problem. In this article, we propose a DNA storage coding method based on the equilibrium optimization random search (EORS) algorithm that satisfies the Hamming distance, GC content, and no-runlength constraints and can reduce the error rate in storage. Simulation experiments show that the DNA storage code sets constructed by the EORS algorithm under these combined constraints are on average 11% larger than those reported in previous work. A larger code set means that shorter DNA chains can be used to store more data.
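
The three constraints named in this abstract can be illustrated with a short Python sketch. This is not the EORS algorithm itself, only a check of whether a candidate codeword may join a code set. The function names, the exact 50% GC target, and the reading of "no-runlength" as "no two adjacent bases are identical" are assumptions made for illustration.

```python
def gc_count(seq: str) -> int:
    """Number of G or C bases in seq."""
    return sum(base in "GC" for base in seq)


def has_no_runlength(seq: str) -> bool:
    """True if no two adjacent bases are identical (assumed definition)."""
    return all(a != b for a, b in zip(seq, seq[1:]))


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def can_join(candidate: str, code_set: list[str], d: int) -> bool:
    """Check the three combined constraints for one candidate codeword."""
    return (
        2 * gc_count(candidate) == len(candidate)  # assumed 50% GC target
        and has_no_runlength(candidate)
        and all(hamming_distance(candidate, word) >= d for word in code_set)
    )
```

For example, `can_join("ACGT", ["TGCA"], 3)` tests whether `ACGT` has balanced GC content, no repeated adjacent base, and differs from `TGCA` in at least three positions.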

Driving the scalability of DNA-based information storage systems

10.1101/591594 ◽

2019 ◽

Author(s):

Kyle J. Tomek ◽

Kevin Volkel ◽

Alexander Simpson ◽

Austin G. Hass ◽

Elaine W. Indermaur ◽

...

Keyword(s):

Data Storage ◽

Storage Systems ◽

Information Storage ◽

Maximum Capacity ◽

Theoretical Maximum ◽

New Approaches ◽

File Access ◽

Storage Media ◽

DNA Storage ◽

Address System

Abstract: The extreme density of DNA presents a compelling advantage over current storage media; however, in order to reach practical capacities, new approaches for organizing and accessing information are needed. Here we use chemical handles to selectively extract unique files from a complex database of DNA mimicking 5 TB of data and design and implement a nested file address system that increases the theoretical maximum capacity of DNA storage systems by five orders of magnitude. These advancements enable the development and future scaling of DNA-based data storage systems with reasonable modern capacities and file access capabilities.

Promiscuous molecules for smarter file operations in DNA-based data storage

Nature Communications ◽

10.1038/s41467-021-23669-w ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Kyle J. Tomek ◽

Kevin Volkel ◽

Elaine W. Indermaur ◽

James M. Tuck ◽

Albert J. Keung

Keyword(s):

Data Storage ◽

Data Access ◽

Molecular Architecture ◽

Biomolecular Structure ◽

JPEG Images ◽

Storage Media ◽

DNA Storage ◽

DNA Strands ◽

Background Database ◽

Organizational Features

Abstract: DNA holds significant promise as a data storage medium due to its density, longevity, and resource and energy conservation. These advantages arise from the inherent biomolecular structure of DNA which differentiates it from conventional storage media. The unique molecular architecture of DNA storage also prompts important discussions on how data should be organized, accessed, and manipulated and what practical functionalities may be possible. Here we leverage thermodynamic tuning of biomolecular interactions to implement useful data access and organizational features. Specific sets of environmental conditions including distinct DNA concentrations and temperatures were screened for their ability to switchably access either all DNA strands encoding full image files from a GB-sized background database or subsets of those strands encoding low resolution, File Preview, versions. We demonstrate File Preview with four JPEG images and provide an argument for the substantial and practical economic benefit of this generalizable strategy to organize data.

An Intelligent Optimization Algorithm for Constructing a DNA Storage Code: NOL-HHO

International Journal of Molecular Sciences ◽

10.3390/ijms21062191 ◽

2020 ◽

Vol 21 (6) ◽

pp. 2191 ◽

Cited By ~ 4

Author(s):

Qiang Yin ◽

Ben Cao ◽

Xue Li ◽

Bin Wang ◽

Qiang Zhang ◽

...

Keyword(s):

Optimization Algorithm ◽

DNA Sequences ◽

Learning Strategy ◽

Hamming Distance ◽

Experimental Testing ◽

GC Content ◽

Smooth Transition ◽

Local Optima ◽

DNA Storage

The high density, large capacity, and long-term stability of DNA molecules make them an emerging storage medium that is especially suitable for the long-term storage of large datasets. The DNA sequences used in storage must satisfy relevant constraints, such as the no-runlength constraint, the GC-content constraint, and the Hamming distance constraint, to avoid nonspecific hybridization reactions. In this work, a new nonlinear control parameter strategy and a random opposition-based learning strategy were used to improve the Harris hawks optimization algorithm (yielding the improved algorithm, NOL-HHO) and prevent it from falling into local optima. Experimental testing was performed on 23 widely used benchmark functions, and the proposed algorithm was used to obtain better coding lower bounds for DNA storage. The results show that our algorithm better maintains a smooth transition between exploration and exploitation and has stronger global exploration capabilities than other algorithms. At the same time, the improvement of the lower bound directly affects the storage capacity and code rate, which promotes the further development of DNA storage technology.
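
To make the idea of a coding lower bound concrete: the size of any code set that satisfies all the constraints is a lower bound on the largest possible set for that length and distance. The Python sketch below builds such a set with a plain greedy scan. It is a simple baseline and is not NOL-HHO or any other optimization algorithm from the literature; the function names, the exact 50% GC requirement, and the "no identical adjacent bases" rule are assumptions.

```python
from itertools import product


def satisfies_word_constraints(word: str) -> bool:
    """Per-word checks: assumed 50% GC content and no identical adjacent bases."""
    gc = sum(base in "GC" for base in word)
    no_run = all(a != b for a, b in zip(word, word[1:]))
    return 2 * gc == len(word) and no_run


def greedy_code_set(n: int, d: int) -> list[str]:
    """Scan all length-n words in lexicographic order and keep each word
    that is at Hamming distance >= d from every word kept so far.

    The exact 50% GC rule means n must be even to get a non-empty set.
    """
    code_set: list[str] = []
    for bases in product("ACGT", repeat=n):
        word = "".join(bases)
        if not satisfies_word_constraints(word):
            continue
        if all(sum(x != y for x, y in zip(word, kept)) >= d for kept in code_set):
            code_set.append(word)
    return code_set


if __name__ == "__main__":
    codes = greedy_code_set(n=6, d=3)
    print(len(codes), "codewords found; this is a lower bound for n=6, d=3")
```

The scan enumerates all 4^n words, so it is only practical for short lengths. Search heuristics such as the one described above aim to find larger sets than a naive scan like this one.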

DNA stability: a central design consideration for DNA data storage systems

Nature Communications ◽

10.1038/s41467-021-21587-5 ◽

2021 ◽

Vol 12 (1) ◽

Cited By ~ 2

Author(s):

Karishma Matange ◽

James M. Tuck ◽

Albert J. Keung

Keyword(s):

Data Storage ◽

Molecular Mechanisms ◽

Storage Conditions ◽

Information Storage ◽

Processing Conditions ◽

Energy Materials ◽

Specific Design ◽

DNA Stability ◽

Design Considerations ◽

DNA Storage

Abstract: Data storage in DNA is a rapidly evolving technology that could be a transformative solution for the rising energy, materials, and space needs of modern information storage. Given that the information medium is DNA itself, its stability under different storage and processing conditions will fundamentally impact and constrain design considerations and data system capabilities. Here we analyze the storage conditions, molecular mechanisms, and stabilization strategies influencing DNA stability and pose specific design configurations and scenarios for future systems that best leverage the considerable advantages of DNA storage.

Addition of Degenerate Bases to DNA-based Data Storage for Increased Information Capacity

10.1101/367052 ◽

2018 ◽

Cited By ~ 1

Author(s):

Yeongjae Choi ◽

Taehoon Ryu ◽

Amos C. Lee ◽

Hansol Choi ◽

Hansaem Lee ◽

...

Keyword(s):

DNA Sequence ◽

Data Storage ◽

Information Storage ◽

Promising Method ◽

Information Capacity ◽

Practical Implementation ◽

Increasing Demand ◽

The Cost

Introductory paragraph: DNA-based data storage has emerged as a promising method to satisfy the exponentially increasing demand for information storage. However, practical implementation of DNA-based data storage remains a challenge because of the high cost of DNA per unit data. Here, we propose the use of eleven degenerate bases as encoding characters in addition to A, C, G, and T, which increases the information capacity (the amount of data that can be stored per length of DNA sequence designed) and reduces the cost of DNA per unit data. Using the proposed method, we experimentally achieved an information capacity of 3.37 bits/character, which is more than twice the highest information capacity previously achieved. Finally, the platform was projected to reduce the cost of DNA-based data storage by 50%.
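
The capacity gain from a larger alphabet follows from a standard information-theoretic bound: an alphabet of k equally usable characters carries at most log2(k) bits per character. The Python sketch below applies that bound to the figures in this abstract. The variable and function names are illustrative, and the comparison with the reported 3.37 bits/character is plain arithmetic, not a result taken from the paper.

```python
import math

STANDARD_BASES = 4     # A, C, G, T
DEGENERATE_BASES = 11  # additional encoding characters, as stated in the abstract
REPORTED_CAPACITY = 3.37  # bits/character, as stated in the abstract


def bits_per_character(alphabet_size: int) -> float:
    """Upper bound on information per character for a given alphabet size."""
    return math.log2(alphabet_size)


standard_limit = bits_per_character(STANDARD_BASES)                      # 2.0
extended_limit = bits_per_character(STANDARD_BASES + DEGENERATE_BASES)   # about 3.91
fraction_of_limit = REPORTED_CAPACITY / extended_limit

if __name__ == "__main__":
    print(f"4-character limit:  {standard_limit:.2f} bits/character")
    print(f"15-character limit: {extended_limit:.2f} bits/character")
    print(f"reported / limit:   {fraction_of_limit:.2%}")
```

Under this bound, four bases alone cannot exceed 2 bits/character, so any reported value above 2 requires the extended alphabet.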

Deoxyribonucleic Acid as a Tool for Digital Information Storage: An Overview

The Indian Journal of Veterinary Sciences and Biotechnology ◽

10.21887/ijvsbt.15.1.1 ◽

2019 ◽

Vol 15 (01) ◽

pp. 1-8

Author(s):

Ashish C Patel ◽

C G Joshi

Keyword(s):

Data Storage ◽

DNA Sequences ◽

Consensus Sequence ◽

Random Access ◽

Information Storage ◽

Digital Data ◽

Digital Information ◽

Multiple Sequence ◽

Digital World ◽

Digital File

Current data storage technologies can no longer keep pace with the exponentially growing amounts of data generated by, for example, the extensive use of social networking photos and media. The "digital world", which held 4.4 zettabytes in 2013, was predicted to reach 44 zettabytes by 2020. Over the past 30 years, scientists and researchers have tried to develop a robust way of storing data on a medium that is dense and everlasting, and have found DNA to be the most promising storage medium. Unlike existing storage devices, DNA requires no maintenance other than storage in a cool, dark place. DNA is small and dense; just 1 gram of dry DNA can store about 455 exabytes of data. DNA stores information using four bases, viz., A, T, G, and C, whereas CDs, hard disks, and other devices store information as 0s and 1s on spiral tracks. In a DNA-based storage system, after the digital file is converted into binary code, encoding and decoding are the important steps. Once the digital file is encoded, the next step is to synthesize arbitrary single-stranded DNA sequences, which can be stored in a deep freeze until use. When the information needs to be recovered, this can be done using DNA sequencing. Next-generation sequencing (NGS) can produce sequences with very high throughput at a much lower cost (less than 0.1 USD per MB of data) than the first sequencing technologies. Post-sequencing processing includes alignment of all reads using multiple sequence alignment (MSA) algorithms to obtain consensus sequences. The consensus sequence is then decoded by reversing the encoding process. Most prior DNA data storage efforts sequenced and decoded the entire body of stored digital information with no random access, but it has now become possible to extract selected files (e.g., retrieving only a required image from a collection) from a DNA pool using PCR-based random access. Various scientists have successfully stored up to 110 zettabytes of data in one gram of DNA. In the future, with efficient encoding, error correction, and cheaper DNA synthesis and sequencing, DNA-based storage will become a practical solution for storing exponentially growing digital data.
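
The encode, sequence, and decode pipeline described in this overview can be sketched in Python as follows. The two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T) is an assumption chosen for illustration and is not taken from the article. The consensus step is a per-position majority vote over reads that are already aligned and of equal length, a simplified stand-in for a real multiple sequence alignment.

```python
from collections import Counter

# Assumed mapping of bit pairs to bases; real schemes often add constraints
# such as balanced GC content or limits on repeated bases.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Convert bytes to a DNA string, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(seq: str) -> bytes:
    """Reverse of encode: convert a DNA string back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in seq)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def consensus(reads: list[str]) -> str:
    """Per-position majority vote over aligned, equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


if __name__ == "__main__":
    strand = encode(b"Hi")
    # Three simulated reads, two of them with a single substitution error.
    reads = [strand, strand[:3] + "T" + strand[4:], strand[:1] + "T" + strand[2:]]
    print(decode(consensus(reads)))
```

Majority voting recovers the original strand here because each error occurs at a different position and only in one read. Insertions and deletions would shift positions and require a true alignment step first.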

The development and application of new media technology in news communication industry

International Journal of Electrical Engineering Education ◽

10.1177/0020720921996593 ◽

2021 ◽

pp. 002072092199659

Author(s):

Hong Guo

Keyword(s):

New Media ◽

Data Storage ◽

News Media ◽

Modern Society ◽

Information Storage ◽

Media Technology ◽

Communication Industry ◽

Media Technologies ◽

TV News ◽

New Media Technologies

Many new media technologies have emerged in modern society. Their application has affected traditional TV news media, which faces great challenges but can also draw lessons for its own development. New media technology relies on powerful information processing and data storage technology to develop and grow continuously. Compared with traditional news, new media technology has greater information storage and dissemination capacity. This paper first briefly introduces the concept of new media technology and summarizes its typical characteristics, then analyzes the existing problems in applying new media technology in the news communication industry, based on the necessity of applying it. Finally, some suggestions are put forward, in the hope of providing a reference for the development of the news communication industry.
