HDFS Pipeline Reformation to Minimize the Data Loss

Abstract This paper discusses the physics, definitions, and nanoprobing flow of a flash memory bit. It also presents a case study showing the effectiveness of nanoprobing in detecting Single Bit Fail (SBF) Data Gain (DG) and Data Loss (DL) in flash memory. The paper includes cases where no passive voltage contrast (PVC) was observed at the SEM and no leakage was observed at the AFM, yet units failing SBF DG, SBF DL, and depletion were detected by nanoprobing the single bit. The major finding is a nanoprobing procedure that resolves data gain, data loss, and depletion failures of flash memory even when no PVC is seen at the SEM and no leakage is seen at the AFM.

Liability for Data Loss

SSRN Electronic Journal ◽ 10.2139/ssrn.3237407 ◽ 2018 ◽ Cited By ~ 1

Author(s): Vincenzo Zeno-Zencovich

Keyword(s): Data Loss

Increased yields of duplex sequencing data by a series of quality control tools

NAR Genomics and Bioinformatics ◽ 10.1093/nargab/lqab002 ◽ 2021 ◽ Vol 3 (1)

Author(s): Gundula Povysil ◽ Monika Heinzl ◽ Renato Salazar ◽ Nicholas Stoler ◽ Anton Nekrutenko ◽ ...

Keyword(s): Low Frequency ◽ Variant Calling ◽ Data Loss ◽ Sequencing Data ◽ Bioinformatics Pipeline ◽ Consensus Sequences ◽ Sequencing Errors ◽ Data Output ◽ Reverse Strand ◽ Duplex Sequencing

Abstract Duplex sequencing is currently the most reliable method to identify ultra-low frequency DNA variants by grouping sequence reads derived from the same DNA molecule into families with information on the forward and reverse strand. However, only a small proportion of reads are assembled into duplex consensus sequences (DCS), and reads with potentially valuable information are discarded at different steps of the bioinformatics pipeline, especially reads without a family. We developed a bioinformatics toolset that analyses the tag and family composition in order to understand data loss and implement modifications that maximize the data output for variant calling. Specifically, our tools show that tags contain polymerase chain reaction and sequencing errors that contribute to data loss and lower DCS yields. Our tools also identified chimeras, which likely reflect barcode collisions. Finally, we developed a tool that re-examines variant calls from raw reads and provides summary data that categorizes the confidence level of a variant call using a tier-based system. With this tool, we can include reads without a family and check the reliability of the call, which substantially increases the sequencing depth for variant calling, a particularly important advantage for low-input samples or low-coverage regions.
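The family grouping described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' toolset: the read tuples, the `fwd`/`rev` strand labels, and the function names `group_families` and `duplex_tags` are assumptions made for illustration, and the sketch assumes both strands of a molecule carry the same tag, which simplifies real duplex tagging.

```python
from collections import defaultdict

def group_families(reads):
    """Group (tag, strand, sequence) reads into families keyed by tag."""
    families = defaultdict(lambda: {"fwd": [], "rev": []})
    for tag, strand, seq in reads:
        families[tag][strand].append(seq)
    return dict(families)

def duplex_tags(families):
    """Return tags whose family has reads on both strands."""
    return sorted(t for t, f in families.items() if f["fwd"] and f["rev"])

# Hypothetical reads: one complete family and one single-strand family.
reads = [
    ("AAGT", "fwd", "ACGT"),
    ("AAGT", "rev", "ACGT"),
    ("CCTA", "fwd", "ACGA"),
]
families = group_families(reads)
usable = duplex_tags(families)                    # candidates for a duplex consensus
lost = sorted(set(families) - set(usable))        # families without a partner strand
```

In this toy input, the family without a reverse-strand partner is the kind of read a strict duplex-only pipeline would discard.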

Robust Secret Image Sharing Resistant to Noise in Shares

ACM Transactions on Multimedia Computing Communications and Applications ◽ 10.1145/3419750 ◽ 2021 ◽ Vol 17 (1) ◽ pp. 1-22

Author(s): Xuehu Yan ◽ Lintao Liu ◽ Longlong Li ◽ Yuliang Lu

Keyword(s): Chinese Remainder Theorem ◽ Recovery Phase ◽ Error Correcting Codes ◽ Secret Image Sharing ◽ Data Loss ◽ Least Significant Bit ◽ Recovery Method ◽ Secret Image ◽ Image Sharing ◽ Salt And Pepper

A secret image is split into n shares in the generation phase of secret image sharing (SIS) for a (k, n) threshold. In the recovery phase, the secret image is recovered when any k or more shares are collected, and conventional SIS generally assumes that each collected share remains lossless during storage and transmission. However, noise arises during real-world storage and transmission, so shares experience data loss, which in turn leads to data loss in the recovered secret image. Secret image recovery from lossy shares is an important practical issue and the overall subject of this article, which proposes an SIS scheme that can recover the secret image from lossy shares. First, robust SIS and its definition are introduced. Next, a robust SIS scheme for a (k, n) threshold without pixel expansion is proposed based on the Chinese remainder theorem (CRT) and error-correcting codes (ECC). By screening the random numbers, the share generation phase of the proposed robust SIS is designed to implement the error correction capability without increasing the share size. In particular, when noisy shares are collected, our recovery method is to some degree robust to certain noise types, such as least significant bit (LSB) noise, JPEG compression, and salt-and-pepper noise. A theoretical proof is presented, and experimental results are examined to evaluate the effectiveness of our proposed method.
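As background for the CRT-based sharing mentioned in this abstract, the following is a minimal Asmuth-Bloom-style sketch of threshold sharing for a single small integer. It is not the article's robust scheme (there is no ECC and no screening of random numbers); the parameters `p = 5`, the moduli `[11, 13, 17]`, the threshold `k = 2`, and all function names are assumptions chosen for illustration. It requires Python 3.8 or later for `pow(x, -1, m)`.

```python
import random
from itertools import combinations
from math import prod

def make_shares(secret, p, moduli, k, rng=random):
    """Hide secret (0 <= secret < p) as y = secret + a*p and share y mod m_i."""
    M = prod(sorted(moduli)[:k])                  # product of the k smallest moduli
    a = rng.randrange(0, (M - 1 - secret) // p + 1)
    y = secret + a * p                            # guaranteed y < M
    return [(m, y % m) for m in moduli]

def crt(shares):
    """Solve x = r_i (mod m_i) for pairwise coprime moduli."""
    N = prod(m for m, _ in shares)
    total = 0
    for m, r in shares:
        n_i = N // m
        total += r * n_i * pow(n_i, -1, m)        # modular inverse (Python 3.8+)
    return total % N

def recover(shares, p):
    """Recover the secret from any k shares."""
    return crt(shares) % p

# Assumed toy parameters: p * 17 < 11 * 13, so any 2 of 3 shares suffice.
p, moduli, k = 5, [11, 13, 17], 2
shares = make_shares(3, p, moduli, k)
recovered = [recover(list(pair), p) for pair in combinations(shares, k)]
```

Every pair of shares reconstructs the same value because y is smaller than the product of any two moduli, so the CRT solution is unique. A single corrupted remainder would change the result, which is the problem robust SIS sets out to address.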

Semi‐automated background removal limits data loss and normalises imaging mass cytometry data

Cytometry Part A ◽ 10.1002/cyto.a.24480 ◽ 2021

Author(s): Marieke E. Ijsselsteijn ◽ Antonios Somarakis ◽ Boudewijn P. F. Lelieveldt ◽ Thomas Höllt ◽ Noel F. C. C. de Miranda

Keyword(s): Data Loss ◽ Mass Cytometry ◽ Background Removal

MobiGyges: A mobile hidden volume for preventing data loss, improving storage utilization, and avoiding device reboot

Future Generation Computer Systems ◽ 10.1016/j.future.2020.03.048 ◽ 2020 ◽ Vol 109 ◽ pp. 158-171

Author(s): Wendi Feng ◽ Chuanchang Liu ◽ Zehua Guo ◽ Thar Baker ◽ Gang Wang ◽ ...

Keyword(s): Data Loss

US pays dear for data loss

Computer Fraud & Security Bulletin ◽ 10.1016/s0142-0496(09)90060-1 ◽ 1993 ◽ Vol 1993 (2) ◽ pp. 2

Keyword(s): Data Loss

Tree-Structured Parallel Regeneration Based on Regenerating Codes for Multiple Data Losses in Distributed Storage Systems

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.918.295 ◽ 2014 ◽ Vol 918 ◽ pp. 295-300

Author(s): Peng Fei You ◽ Yu Xing Peng ◽ Zhen Huang ◽ Chang Jian Wang

Keyword(s): Storage Systems ◽ Distributed Storage ◽ Data Loss ◽ Data Reliability ◽ Data Redundancy ◽ Regeneration Time ◽ Multiple Data ◽ Distributed Storage Systems ◽ Regenerating Codes ◽ Reliability And Availability

In distributed storage systems, erasure codes are an attractive data redundancy solution that can provide the same reliability as replication while requiring much less storage space. Multiple data losses occur frequently, and the lost data should be regenerated to maintain data redundancy. Regeneration for multiple data losses should finish as quickly as possible, because the regeneration time influences the data reliability and availability of distributed storage systems. However, multiple data losses are usually regenerated one at a time, which results in a high overall regeneration time and severely reduces data reliability and availability. In this paper, we propose a tree-structured parallel regeneration scheme based on regenerating codes (TPRORC) for multiple data losses in distributed storage systems. In our scheme, multiple regeneration trees based on regenerating codes are constructed. First, these trees are created independently; each is responsible for one data loss and does not share any edges with the others. Second, every regeneration tree uses the least network traffic and bandwidth-optimized paths for regenerating its data loss. The scheme can thus perform parallel regeneration for multiple data losses using multiple optimized topology trees, in which network bandwidth is utilized efficiently and the individual regenerations overlap in time. Our simulation results show that the tree-structured parallel regeneration scheme reduces the regeneration time significantly compared to other regular regeneration schemes.
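The idea of regenerating a lost block from surviving blocks can be illustrated with the simplest erasure code, a single XOR parity block. This is only a sketch under stated assumptions: it is not a regenerating code and not the TPRORC scheme, it tolerates one loss only, and the block contents and the name `xor_blocks` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks plus one parity block, each imagined on a separate node.
data = [b"node", b"lost", b"data"]
parity = xor_blocks(data)
stored = data + [parity]

# Simulate the loss of one node and regenerate its block from the survivors.
lost_index = 1
survivors = [blk for i, blk in enumerate(stored) if i != lost_index]
regenerated = xor_blocks(survivors)
```

Here four stored blocks protect three data blocks against one loss, whereas replication would need six blocks for the same tolerance. Every survivor must send its block to the repairing node, which is why regeneration traffic and topology matter for regeneration time.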