Secure data storage on DNA hard drives

2019 ◽  
Author(s):  
Kaikai Chen ◽  
Jinbo Zhu ◽  
Filip Boskovic ◽  
Ulrich F. Keyser

Abstract: DNA is emerging as a novel material for digital data storage. The two main challenges are efficient encoding and data security. Here, we develop an approach that allows for writing and erasing data by relying solely on Watson-Crick base pairing of short oligonucleotides to single-stranded DNA overhangs located along a long double-stranded DNA hard drive (DNA-HD). Our enzyme-free system enables fast synthesis-free data writing with predetermined building blocks. The use of DNA base pairing allows for secure encryption on DNA-HDs that requires a physical key and nanopore sensing for decoding. The system is suitable for miniature integration for an end-to-end DNA storage device. Our study opens a novel pathway for rewritable and secure data storage with DNA. One Sentence Summary: Storing digital information on molecules along DNA hard drives for rewritable and secure data storage.


2019 ◽  
Vol 15 (01) ◽  
pp. 1-8
Author(s):  
Ashish C Patel ◽  
C G Joshi

Current data storage technologies can no longer keep pace with the exponentially growing amounts of data generated through the extensive use of social networking, photos, and other media. The “digital world,” which held 4.4 zettabytes in 2013, was predicted to reach 44 zettabytes by 2020. For the past 30 years, scientists and researchers have been trying to develop a robust way of storing data on a dense and long-lasting medium, and they have found DNA to be the most promising candidate. Unlike existing storage devices, DNA requires no maintenance beyond being kept in a cool, dark place. DNA is small and very dense; just 1 gram of dry DNA can store about 455 exabytes of data. DNA stores information using four bases, A, T, G, and C, while CDs, hard disks, and other devices store information as 0s and 1s on spiral tracks. In a DNA-based storage system, a digital file is first converted into binary code; encoding and decoding are then the key steps. Once the digital file is encoded, the next step is to synthesize arbitrary single-stranded DNA sequences, which can be stored in a deep freeze until use. When the information needs to be recovered, this can be done using DNA sequencing. Next-generation sequencing (NGS) can produce sequences with very high throughput at a much lower cost (less than 0.1 USD per MB of data) than the first sequencing technologies. Post-sequencing processing includes aligning all reads using multiple sequence alignment (MSA) algorithms to obtain consensus sequences. Each consensus sequence is then decoded by reversing the encoding process. Most prior DNA data storage efforts sequenced and decoded the entire body of stored digital information with no random access, but it has now become possible to extract selected files (e.g., retrieving only a required image from a collection) from a DNA pool using PCR-based random access.
Various scientists have successfully stored up to 110 zettabytes of data in one gram of DNA. In the future, with efficient encoding, error correction, and cheaper DNA synthesis and sequencing, DNA-based storage will become a practical solution for storing exponentially growing digital data.
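The binarize, encode, and decode steps described above can be sketched in a few lines of Python. This is only an illustration: the two-bits-per-base mapping below is an assumption chosen for simplicity and is not taken from the abstract, and practical schemes typically add sequence constraints and error correction on top of it.

```python
# Hypothetical mapping of bit pairs to bases (an assumption, not the authors' scheme).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Binarize the bytes, then map each pair of bits to one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse the encoding: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

With this mapping, every byte becomes four bases, and decoding a consensus sequence is simply the encoding run in reverse.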



2020 ◽  
Author(s):  
Min Hao ◽  
Hongyan Qiao ◽  
Yanmin Gao ◽  
Zhaoguan Wang ◽  
Xin Qiao ◽  
...  

Abstract: DNA has emerged as a novel material for mass data storage, a serious problem facing human society. Taking advantage of current synthesis capacity, massive oligo pools have demonstrated high potential for data storage in a test tube. Herein, a mixed culture of bacterial cells carrying a massive oligo pool assembled in a high-copy plasmid is presented as a stable material for large-scale data storage. Living-cell data storage was fabricated by a multistep process of assembly, transformation, and mixed culture. The underlying principle was explored by deep bioinformatic analysis. Although homology assembly showed sequence-context-dependent bias, the massive pool of digital-information oligos in the mixed culture remained constant over multiple successive passages. Pushing the limit, over ten thousand distinct oligos, totaling 2304 Kbps and encoding 445 KB of digital data including texts and images, were stored in bacterial cells, the largest archival data storage in living cells reported so far. The mixed culture of living-cell data storage opens up a new approach to simply bridge in vitro and in vivo storage systems, combining the advantages of storage capability and economical information propagation.



2021 ◽  
Author(s):  
Inbal Preuss ◽  
Zohar Yakhini ◽  
Leon Anavy

Storage needs represent a significant burden on the economy and the environment. Some of this can potentially be offset by higher-density molecular storage. The potential of using DNA for storing data is immense. DNA can be harnessed as a high-density, durable archiving medium for compressing and storing the exponentially growing quantities of digital data that mankind generates. Several studies have demonstrated the potential of DNA-based data storage systems. These include exploration of different encoding and error correction schemes and the use of different technologies for DNA synthesis and sequencing. Recently, the use of composite DNA letters has been demonstrated to leverage the inherent redundancy in DNA-based storage systems to achieve higher logical density, offering a more cost-effective approach. However, the suggested composite DNA approach is still limited due to its sensitivity to the stochastic nature of the process. Combinatorial assembly methods were also suggested to encode information on DNA in high density, while avoiding the challenges of the stochastic system. These are based on enzymatic assembly processes for producing the synthetic DNA. In this paper, we propose a novel method to encode information into DNA molecules using combinatorial encoding and shortmer DNA synthesis, compatible with current chemical DNA synthesis technologies. Our approach is based on a set of easily distinguishable DNA shortmers serving as building blocks and allowing for near-zero error rates. We construct an extended combinatorial alphabet in which every letter is a subset of the set of building blocks. We suggest different combinatorial encoding schemes and explore their theoretical properties and practical implications in terms of error probabilities and required sequencing depth. To demonstrate the feasibility of our approach, we implemented an end-to-end computer simulation of a DNA-based storage system, using our suggested combinatorial encodings.
We use simulations to assess the performance of the system and the effect of different parameters. Our simulations suggest that our combinatorial approach can potentially achieve up to a 6.5-fold increase in logical density over standard DNA-based storage systems, with near-zero reconstruction error. Implementing our approach at scale to perform actual synthesis requires minimal alterations to current technologies. Our work thus suggests that the combination of combinatorial encoding with standard DNA chemical synthesis technologies can potentially improve current solutions, achieving scalable, efficient and cost-effective DNA-based storage.
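To make the idea of a combinatorial alphabet concrete, the sketch below enumerates every letter as a k-element subset of n building blocks and computes how many whole bits one such letter can carry. The function names and the values n = 16, k = 5 in the usage note are illustrative assumptions; the abstract does not state the parameters or the exact encoding schemes used by the authors.

```python
from itertools import combinations
from math import comb, floor, log2


def combinatorial_alphabet(n_blocks: int, k: int) -> list[tuple[int, ...]]:
    """Every letter is a k-element subset of the n_blocks building blocks."""
    return list(combinations(range(n_blocks), k))


def bits_per_letter(n_blocks: int, k: int) -> int:
    """Whole bits one letter can carry: floor(log2(C(n_blocks, k)))."""
    return floor(log2(comb(n_blocks, k)))
```

For example, `bits_per_letter(16, 5)` counts the 4368 possible subsets and returns 12, whereas a single standard A/C/G/T position carries log2(4) = 2 bits. This per-letter figure is not directly comparable to the reported fold increase, which comes from the authors' own simulations.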



2005 ◽  
Vol 34 (4) ◽  
Author(s):  
Robert Breslawski

With the rapid changes in technology for information creation, capture, display, distribution, storage and preservation, questions abound about the current state of microfilm and its place in the modern information management industry. Clearly there is a place for microfilm in the modern preservation vision. When it comes to information having permanent value, micrographic media remains a stalwart companion of those not willing to risk their data to the perils of digital data storage only. Quoting Jim Harvey of Altek Systems, “Now the word on the street is that without migration, degradation occurs in as little as seven years depending on storage conditions. This is an anathema to archival collections of information … Some are getting ‘that old time religion’ and backing up digital information collections with a permanent micrographic copy.”



2020 ◽  
Vol 6 (50) ◽  
pp. eabc2661
Author(s):  
Chan Cao ◽  
Lucien F. Krapp ◽  
Abdelaziz Al Ouahabi ◽  
Niklas F. König ◽  
Nuria Cirauqui ◽  
...  

Digital data storage is a growing need for our society, and finding alternatives to solutions based on silicon or magnetic tape is a challenge in the era of “big data.” The recent development of polymers that can store information at the molecular level has opened up new opportunities for ultrahigh density data storage, long-term archival, anticounterfeiting systems, and molecular cryptography. However, synthetic informational polymers have so far been deciphered only by tandem mass spectrometry. In comparison, nanopore technology can be faster, cheaper, and nondestructive, and can provide detection at the single-molecule level; moreover, it can be massively parallelized and miniaturized in portable devices. Here, we demonstrate the ability of engineered aerolysin nanopores to accurately read, with single-bit resolution, the digital information encoded in tailored informational polymers alone and in mixed samples, without compromising information density. These findings open promising possibilities to develop writing-reading technologies to process digital data using a biologically inspired platform.



2021 ◽  
Vol 11 (13) ◽  
pp. 6070
Author(s):  
Veronika Szücs ◽  
Gábor Arányi ◽  
Ákos Dávid

We live in a world of digital information communication and digital data storage. As technology develops, demands from the user side also pose serious challenges for developers in both hardware and software development. However, the increasing penetration of the Internet, IoT and digital solutions, which have become available in almost every segment of life, carries risks as well as benefits. In this study, the authors present the phenomenon of ransomware attacks, which appear on a daily basis and endanger the operation and security of the digital sphere of both small and large enterprises and individuals. An overview of ransomware attacks and of the tendency and characteristics of the attacks, which have caused serious financial loss and other damage to the victims, is presented. This manuscript also provides a brief overview of protection against ransomware attacks and of the software and hardware options that enhance general user security, along with their effectiveness as standalone applications. The authors present the results of the study, which aimed to explore how the available software and hardware devices can implement digital user security. Based on the results of the research, the authors propose a complex system that can be used to increase the efficiency of the network protection and OS protection tools already available, to improve network security, and to detect ransomware attacks early. As a result, the model of the proposed protection system is presented, and it can be stated that the complex system should be able to detect ransomware attacks from either the Internet or the internal network at an early stage, mitigate malicious processes and maintain data in a recoverable state.



Author(s):  
Musa Mikailovich Lyanov

This article is dedicated to defining the term “virtual traces” in forensics. The author identifies the peculiarities of virtual traces, which makes it possible to define the traces left on digital data storage devices. It is noted that forensic science lacks a universal terminology for traces left as a result of cybercrimes. Solving this problem remains extremely important, as it would advance the theory and practice of forensics in this field. The author analyzed case law to determine how terminology is used to describe traces detected in the course of examining digital data storage devices. The conclusion is that the most common and suitable term is “virtual traces”. A definition of the concept is proposed based on an analysis of the peculiarities of virtual traces and the mechanism of their emergence on digital data storage devices. Thus, a virtual trace is defined as a special type of material trace, varied in structure, that exists within the limits of a digital data storage device, is directly tied to it, and can be decoded only using special software and technical means.



2019 ◽  
Author(s):  
Yuan-Jyue Chen ◽  
Christopher N. Takahashi ◽  
Lee Organick ◽  
Kendall Stewart ◽  
Siena Dumas Ang ◽  
...  

DNA has recently emerged as an attractive medium for future digital data storage because of its extremely high information density and potential longevity. Recent work has shown promising results in developing proof-of-principle prototype systems. However, very uneven (biased) sequencing coverage distributions have been reported, which indicates inefficiencies in the storage process and points to optimization opportunities. These deviations from the average coverage in oligonucleotide copy distribution result in sequence drop-out and make error-free data retrieval from DNA more challenging. The uneven copy distribution was believed to stem from the underlying molecular processes, but the interplay between these molecular processes and the copy number distribution has been poorly understood until now. In this paper, we use millions of unique sequences from a DNA-based digital data archival system to study the oligonucleotide copy unevenness problem and show that two important sources of bias are the synthesis process and the Polymerase Chain Reaction (PCR) process. By mapping the sequencing coverage of a large complex oligonucleotide pool back to its spatial distribution on the synthesis chip, we find that significant bias comes from array-based oligonucleotide synthesis. We also find that PCR stochasticity is another main driver of oligonucleotide copy variation. Based on these findings, we develop a statistical model for each molecular process as well as the overall process and compare the predicted bias with our experimental data. We further use our model to explore the trade-offs between synthesis bias, storage physical density and sequencing redundancy, providing insights for engineering efficient, robust DNA data storage systems.
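The effect of PCR stochasticity on copy numbers can be illustrated with a toy simulation. This is not the statistical model developed in the paper, whose details the abstract does not give. It is a minimal sketch that assumes each molecule is duplicated independently with a fixed probability per cycle; the names `pcr_cycle`, `simulate_pool`, and `efficiency` are hypothetical.

```python
import random


def pcr_cycle(copies: int, efficiency: float, rng: random.Random) -> int:
    """Assumed model: each molecule is duplicated with probability `efficiency`."""
    duplicated = sum(1 for _ in range(copies) if rng.random() < efficiency)
    return copies + duplicated


def simulate_pool(initial_copies: list[int], cycles: int,
                  efficiency: float, seed: int = 0) -> list[int]:
    """Amplify every sequence in the pool for a number of PCR cycles."""
    rng = random.Random(seed)
    pool = list(initial_copies)
    for _ in range(cycles):
        pool = [pcr_cycle(copies, efficiency, rng) for copies in pool]
    return pool
```

Starting from uneven `initial_copies` (standing in for synthesis bias) and an `efficiency` below 1, the resulting copy numbers spread further apart, and a sequence that starts with zero copies stays at zero, which corresponds to the sequence drop-out described above.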



Nano Letters ◽  
2020 ◽  
Vol 20 (5) ◽  
pp. 3754-3760 ◽  
Author(s):  
Kaikai Chen ◽  
Jinbo Zhu ◽  
Filip Bošković ◽  
Ulrich F. Keyser


2021 ◽  
Vol 3 (Special Issue ICOST 2S) ◽  
pp. 27-30
Author(s):  
Kavia V

