Damming the genomic data flood using a comprehensive analysis and storage data structure

Database ◽  
2010 ◽  
Vol 2010 ◽  
Author(s):  
Marc Bouffard ◽  
Michael S. Phillips ◽  
Andrew M.K. Brown ◽  
Sharon Marsh ◽  
Jean-Claude Tardif ◽  
...  

2020 ◽  
Vol 23 (4) ◽  
pp. 627-638 ◽  
Author(s):  
Daniel A. Hescheler ◽  
Patrick S. Plum ◽  
Thomas Zander ◽  
Alexander Quaas ◽  
Michael Korenkov ◽  
...  

2020 ◽  
Vol 39 (4) ◽  
pp. 5027-5036 ◽  
Author(s):  
You Lu ◽  
Qiming Fu ◽  
Xuefeng Xi ◽  
Zhenping Chen

Data outsourcing has gradually become a mainstream solution, but once data is outsourced, data owners lose control of the hardware on which their data resides, and the integrity of that data may be compromised. Many current studies achieve low-network-overhead verification of cloud data sets by designing algorithmic structures (e.g., hashing, Merkle verification trees); however, cloud service providers may refuse to acknowledge the incompleteness of cloud data in order to avoid liability or for business reasons. There is therefore a need to build a secure, reliable, tamper-proof, and unforgeable verification system that supports accountability. Blockchain is a chain-like data structure constructed using data signatures, timestamps, hash functions, and proof-of-work mechanisms; using blockchain technology to build an integrity verification system can achieve fault accountability. This paper uses the Hadoop framework to implement data collection and storage on an HBase system within a big data architecture. In summary, building on research into blockchain-based cloud data collection and storage and on existing big data storage middleware, a high-throughput, highly concurrent, and highly available data collection and processing system has been realized.
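To make the chain-like structure concrete, here is a minimal Python sketch, not the system described in the paper, showing how hashes, timestamps, a Merkle root over the stored data segments, and a toy proof-of-work combine so that any later tampering with a segment invalidates every subsequent block (all names and the difficulty parameter are illustrative):

```python
import hashlib
import json
import time

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def merkle_root(leaves):
    """Reduce a list of data segments to a single Merkle root hash."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                      # duplicate the last node on odd levels
            level.append(level[-1])
        level = [sha256((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]

def make_block(prev_hash, segments, difficulty=2):
    """Build a block header committing to the segments, then brute-force
    a nonce until the header hash meets the (toy) difficulty target."""
    header = {
        "prev_hash": prev_hash,
        "merkle_root": merkle_root(segments),
        "timestamp": time.time(),
        "nonce": 0,
    }
    while True:                                 # toy proof-of-work loop
        digest = sha256(json.dumps(header, sort_keys=True).encode())
        if digest.startswith("0" * difficulty):
            return header, digest
        header["nonce"] += 1

# Chain two blocks over outsourced data segments; changing any segment
# later would change the Merkle root and break every following header hash.
b1, h1 = make_block("0" * 64, [b"segment-1", b"segment-2"])
b2, h2 = make_block(h1, [b"segment-3"])
print(h1, h2)
```

Because each header commits to the previous header's hash, a provider that has published such a chain cannot quietly substitute or drop data without verification failing, which is the accountability property the paper relies on.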


2019 ◽  
Vol 35 (23) ◽  
pp. 4907-4911 ◽  
Author(s):  
Jianglin Feng ◽  
Aakrosh Ratan ◽  
Nathan C Sheffield

Abstract
Motivation: Genomic data is frequently stored as segments or intervals. Because this data type is so common, interval-based comparisons are fundamental to genomic analysis. As the volume of available genomic data grows, developing efficient and scalable methods for searching interval data is necessary.
Results: We present a new data structure, the Augmented Interval List (AIList), to enumerate intersections between a query interval q and an interval set R. An AIList is constructed by first sorting R as a list by the interval start coordinate, then decomposing it into a few approximately flattened components (sublists), and then augmenting each sublist with the running maximum interval end. The query time for AIList is O(log₂ N + n + m), where N is the number of intervals in the set R, n is the number of overlaps between R and q, and m is the average number of extra comparisons required to find those n overlaps. Tested on real genomic interval datasets, the AIList code runs 5–18 times faster than standard high-performance code based on augmented interval trees, nested containment lists, or R-trees (BEDTools). For large datasets, the memory usage of AIList is 4–60% that of the other methods. The AIList data structure therefore provides a significantly improved fundamental operation for highly scalable genomic data analysis.
Availability and implementation: An implementation of the AIList data structure with both construction and search algorithms is available at http://ailist.databio.org.
Supplementary information: Supplementary data are available at Bioinformatics online.
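As a rough sketch of the core query idea, assuming half-open intervals and omitting the decomposition into sublists that the full AIList uses to handle long containing intervals (so this is a single-component simplification in Python, not the authors' implementation):

```python
import bisect

class AIListSimple:
    """Single-component sketch of an Augmented Interval List: intervals
    sorted by start, each position augmented with the running maximum end."""

    def __init__(self, intervals):
        self.ivs = sorted(intervals)               # sort by start coordinate
        self.starts = [s for s, _ in self.ivs]
        self.max_end = []                          # running maximum of ends
        m = float("-inf")
        for _, e in self.ivs:
            m = max(m, e)
            self.max_end.append(m)

    def query(self, qs, qe):
        """Enumerate intervals overlapping the half-open query [qs, qe)."""
        hits = []
        i = bisect.bisect_left(self.starts, qe) - 1    # rightmost start < qe
        # Scan backwards; the running maximum end tells us when no earlier
        # interval can possibly reach back to the query start.
        while i >= 0 and self.max_end[i] > qs:
            if self.ivs[i][1] > qs:                    # actual overlap test
                hits.append(self.ivs[i])
            i -= 1
        return hits

r = AIListSimple([(1, 5), (3, 8), (10, 15), (2, 20)])
print(r.query(4, 11))    # all four intervals overlap [4, 11)
```

The running maximum end is what bounds the backward scan: once it drops to or below the query start, no earlier interval can overlap. Intervals inspected but rejected during that scan are the source of the m extra comparisons in the stated query time, and the sublist decomposition exists precisely to keep m small when a few very long intervals would otherwise inflate the running maximum.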


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Mohd Aliff Faiz Jeffry ◽  
Hazinah Kutty Mammi

Digital watermarking is a technique for protecting digital images from malicious attacks. Compression is one of the most common attacks on images uploaded to social media. Social media platforms such as Facebook and Twitter compress all types of media before they are stored on their servers, in order to reduce the network bandwidth and storage each item requires. However, this compression tends to degrade the very image properties that can be used to identify an image, which raises ownership and copyright issues. Digital watermarking has been proposed in numerous studies, this one among them, as a way to prevent this problem; the chosen watermarking techniques must be able to withstand the compression applied by social media. A comprehensive analysis of the watermarking algorithms and the watermarked images was carried out through a series of designed experiments. The results show that neither of the chosen watermarking techniques could withstand the compression applied by JPEG encoding or by social media, indicating that watermarking is not a suitable method for preserving the ownership and copyright of images shared through social media.
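The excerpt does not name the two watermarking techniques that were tested. As a hypothetical illustration of why fragile spatial-domain watermarks fail this kind of test, the following Python sketch (using Pillow and NumPy) embeds a least-significant-bit watermark and measures how much of it survives a single JPEG re-encode of the sort a social media pipeline applies:

```python
import io

import numpy as np
from PIL import Image

def embed_lsb(cover: Image.Image, bits: np.ndarray) -> Image.Image:
    """Hide a bit array in the least significant bit of the red channel."""
    px = np.array(cover.convert("RGB"))
    flat = px[..., 0].flatten()
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
    px[..., 0] = flat.reshape(px[..., 0].shape)
    return Image.fromarray(px)

def extract_lsb(img: Image.Image, n: int) -> np.ndarray:
    px = np.array(img.convert("RGB"))
    return px[..., 0].flatten()[:n] & 1

rng = np.random.default_rng(0)
watermark = rng.integers(0, 2, 1024, dtype=np.uint8)
marked = embed_lsb(Image.new("RGB", (64, 64), "gray"), watermark)

# Simulate a social-media upload: one lossy JPEG re-encode.
buf = io.BytesIO()
marked.save(buf, format="JPEG", quality=85)
buf.seek(0)
recompressed = Image.open(buf)

before = (extract_lsb(marked, 1024) == watermark).mean()
after = (extract_lsb(recompressed, 1024) == watermark).mean()
print(f"bit accuracy before JPEG: {before:.2f}, after: {after:.2f}")
```

JPEG quantizes DCT coefficients and thereby discards exactly the low-amplitude pixel detail an LSB watermark lives in, so the recovered bits collapse toward chance; compression-robust schemes instead embed the mark in transform-domain coefficients that survive quantization.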


1969 ◽  
Vol 9 (3) ◽  
pp. 270-282 ◽  
Author(s):  
P. L. Wodon

2017 ◽  
Vol 60 (2) ◽  
Author(s):  
Izumi C. Mori ◽  
Yoko Ikeda ◽  
Takakazu Matsuura ◽  
Takashi Hirayama ◽  
Koji Mikami

Abstract Emerging studies suggest that seaweeds contain phytohormones; however, their chemical entities, biosynthetic pathways, signal transduction mechanisms, and physiological roles are poorly understood. Until recently, it was difficult to conduct comprehensive analysis of phytohormones in seaweeds because of the interfering effects of cellular constituents on fine quantification. In this review, we discuss the details of the latest method allowing simultaneous profiling of multiple phytohormones in red seaweeds, while avoiding the effects of cellular factors. Recent studies have confirmed the presence of indole-3-acetic acid (IAA),


2018 ◽  
Vol 16 (05) ◽  
pp. 1850018 ◽  
Author(s):  
Sanjeev Kumar ◽  
Suneeta Agarwal ◽  
Ranvijay

Genomic data nowadays plays a vital role in a number of fields such as personalized medicine, forensics, drug discovery, sequence alignment, and agriculture. With the advancements in and falling cost of next-generation sequencing (NGS) technology, these data are growing exponentially; NGS data are being generated more rapidly than they can be meaningfully analyzed. Thus, there is much scope for developing novel data compression algorithms that facilitate data analysis as well as data transfer and storage. An innovative compression technique is proposed here to address the problem of transmitting and storing large NGS data. This paper presents a lossless, non-reference-based FastQ file compression approach that segregates the data into three different streams and then applies an appropriate and efficient compression algorithm to each. Experiments show that the proposed approach (WBFQC) outperforms other state-of-the-art approaches for compressing NGS data in terms of compression ratio (CR) and compression and decompression time, and it also offers random access over the compressed genomic data. An open-source FastQ compression tool is provided at http://www.algorithm-skg.com/wbfqc/home.html.
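The abstract specifies the three-stream segregation but not the per-stream codecs. A minimal Python sketch of the stream-splitting idea, with zlib as a stand-in for whichever algorithm WBFQC actually applies to each stream, might look like this:

```python
import zlib

def split_fastq(text: str):
    """Split FastQ records into the three streams the paper describes:
    read identifiers, nucleotide sequences, and quality strings."""
    lines = text.strip().split("\n")
    ids, seqs, quals = [], [], []
    for i in range(0, len(lines), 4):     # each FastQ record spans 4 lines
        ids.append(lines[i])
        seqs.append(lines[i + 1])
        quals.append(lines[i + 3])        # line i + 2 is the '+' separator
    return "\n".join(ids), "\n".join(seqs), "\n".join(quals)

record = (
    "@read1\nACGTACGTAC\n+\nIIIIIHHHHG\n"
    "@read2\nTTGGCCAATT\n+\nHHHHHIIIII\n"
)
streams = split_fastq(record)

# Compress each stream independently; on realistic inputs, grouping
# similar data together compresses better than the interleaved file.
compressed = [zlib.compress(s.encode(), level=9) for s in streams]
whole = zlib.compress(record.encode(), level=9)
print(sum(len(c) for c in compressed), "vs", len(whole))
```

Keeping per-stream offsets alongside the compressed blocks is also what makes record-level random access feasible, since a reader can decompress one stream without touching the others.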

