disk storage
Recently Published Documents


TOTAL DOCUMENTS: 252 (FIVE YEARS: 18)
H-INDEX: 18 (FIVE YEARS: 1)

Author(s):  
Pardeep Mehta

Cloud computing—delivering infrastructure, services, and software on demand via the network—offers attractive advantages to the public sector. It has the potential to reduce information and communications technology (ICT) costs by virtualizing capital assets like disk storage and processing cycles into a readily available, affordable operating expense. Unfortunately, architects often assume that simply adding another server into the mix can fix any performance or security issue. The cloud is a platform shift that has sparked a fierce and contentious debate about security and performance: how to secure information and establish trust in an increasingly open and assumed-hostile web operating environment. Adding new hardware to, or updating existing hardware in, a web cloud increases complexity, which affects performance and hence security. Here we define algorithms that keep both performance and data secure while remaining flexible enough to allow for expandability. In this paper, we highlight critical issues for cloud computing, including the characteristics of cloud computing, threats, and performance analysis, and discuss each of them in detail.


Author(s):  
Hui Li ◽  
Jianwei Liao ◽  
Xiaoyan Liu

I/O merging optimization at the block I/O layer of disk storage is widely adopted to reduce I/O response time. However, when a large number of concurrent I/O requests access the disk, the merging judgment itself incurs overhead and can delay the response of small requests. This paper proposes a divide-and-conquer scheduling scheme at the block layer of the I/O stack to serve a large number of concurrent I/O requests with lower I/O response time and to ensure fairness across requests by decreasing the average I/O latency. First, we propose a horizontal-visibility-graph-based approach to cluster related block requests according to their offsets (i.e., logical block numbers). Next, the scheme merges consecutive block I/O requests within each cluster, since requests in the same cluster are most likely to have been issued by the same application. Then, a merging-judgment step is applied during merging optimization to keep the average I/O response time bounded. After that, the merged requests in the queue are reordered by priority to further cut down the average I/O response time. Finally, the prioritized requests are delivered to the disk storage to be serviced. A series of experiments shows that, compared to the baseline, the proposed scheme not only cuts I/O response time by more than 18.2% but also decreases the average I/O response time by up to 71.7%.
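The clustering-then-merging idea above can be illustrated with a short sketch. This is not the paper's implementation: the horizontal-visibility-graph clustering is replaced by a simple offset-gap grouping, and the request fields, gap, and size thresholds are hypothetical.

```python
# Illustrative sketch: cluster block I/O requests by offset proximity, merge
# contiguous requests inside each cluster (with a bound acting as the
# "merging judgment"), then reorder the merged requests by priority.
from dataclasses import dataclass


@dataclass
class Request:
    offset: int      # starting logical block number
    length: int      # number of blocks
    priority: int = 0


def cluster_by_offset(requests, gap=64):
    """Group requests whose offsets lie within `gap` blocks of each other."""
    clusters, current = [], []
    for req in sorted(requests, key=lambda r: r.offset):
        if current and req.offset - (current[-1].offset + current[-1].length) > gap:
            clusters.append(current)
            current = []
        current.append(req)
    if current:
        clusters.append(current)
    return clusters


def merge_cluster(cluster, max_merged=512):
    """Merge strictly contiguous requests, bounding the merged size."""
    merged = [cluster[0]]
    for req in cluster[1:]:
        last = merged[-1]
        contiguous = req.offset == last.offset + last.length
        if contiguous and last.length + req.length <= max_merged:
            last.length += req.length
            last.priority = max(last.priority, req.priority)
        else:
            merged.append(req)
    return merged


if __name__ == "__main__":
    reqs = [Request(0, 8), Request(8, 8), Request(1024, 16, priority=3), Request(16, 8)]
    dispatch = []
    for cl in cluster_by_offset(reqs):
        dispatch.extend(merge_cluster(cl))
    # Reorder merged requests by priority before issuing them to the device.
    dispatch.sort(key=lambda r: -r.priority)
    for r in dispatch:
        print(r)
```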


Author(s):  
Anubhav Pandit

Data deduplication is necessary for corporations to minimize the hidden costs of backing up their data on a public cloud platform. Inefficient data storage is wasteful on its own, and the problem grows in the public cloud, where distributed storage structures create multiple copies of the same content for collation or other purposes. Deduplication helps reduce cost by extending the useful capacity of a given volume of storage. Unfortunately, deduplication raises several security concerns, so additional encryption is needed to protect the data. This paper describes a scheme for dynamic information locking and encoding with convergent encryption: the data is encrypted first, and the resulting ciphertext is then encoded once more. Chunk-level deduplication is used to reduce disk capacity, and identical chunks always encrypt to the same ciphertext. The key cannot be derived from the encrypted chunk data by an attacker, and the content is also protected from the cloud server. The focus of this paper is reducing disk storage while providing protection for online cloud deduplication.
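The convergent-encryption step described above (encrypt each chunk under a key derived from its own content, so identical chunks deduplicate) can be sketched as follows. This is a minimal illustration, not the paper's scheme: it assumes the third-party pyca/cryptography package and omits the additional information-locking/encoding layer.

```python
# Minimal sketch of chunk-level deduplication with convergent encryption.
import hashlib

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

CHUNK_SIZE = 4096  # illustrative fixed-size chunking


def convergent_encrypt(chunk: bytes) -> tuple[bytes, bytes]:
    """Derive the key from the chunk itself, so identical chunks yield
    identical ciphertexts and can be deduplicated."""
    key = hashlib.sha256(chunk).digest()       # convergent key
    nonce = hashlib.sha256(key).digest()[:16]  # deterministic nonce
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    return key, enc.update(chunk) + enc.finalize()


def dedup_store(data: bytes, store: dict) -> list:
    """Store each unique ciphertext chunk once, indexed by its fingerprint."""
    recipe = []  # per-file list of (fingerprint, key) needed to restore it
    for i in range(0, len(data), CHUNK_SIZE):
        key, ct = convergent_encrypt(data[i:i + CHUNK_SIZE])
        fp = hashlib.sha256(ct).hexdigest()
        store.setdefault(fp, ct)  # duplicate chunks are stored only once
        recipe.append((fp, key))
    return recipe


if __name__ == "__main__":
    cloud = {}
    dedup_store(b"A" * 8192 + b"unique tail", cloud)
    dedup_store(b"A" * 8192, cloud)  # fully deduplicated against the first file
    print(len(cloud), "ciphertext chunks stored")
```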


2021 ◽  
Vol 251 ◽  
pp. 02016
Author(s):  
Lea Morschel ◽  
Krishnaveni Chitrapu ◽  
Vincent Garonne ◽  
Dmitry Litvintsev ◽  
Svenja Meyer ◽  
...  

Given the anticipated increase in the amount of scientific data, it is widely accepted that primarily disk-based storage will become prohibitively expensive. Tape-based storage, on the other hand, provides a viable and affordable solution to the ever-increasing demand for storage space. Coupled with a disk caching layer that temporarily holds a small fraction of the total data volume to allow for low-latency access, it turns tape-based systems into active archival storage (write once, read many), which imposes additional demands on data-flow optimization compared to traditional backup setups (write once, read never). In order to preserve the lifetime of tapes and minimize their inherently higher access latency, different tape usage strategies are being evaluated. As an important disk storage system for scientific data that transparently handles tape access, dCache is evaluating its recall-optimization potential and is introducing a proof-of-concept, high-level stage-request scheduling component within its SRM implementation.
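One common recall-optimization strategy of the kind alluded to above is to batch pending stage requests per tape and read each cartridge sequentially. The sketch below is purely illustrative and is not dCache's scheduling component; the request fields and the notion of an on-tape position are assumptions.

```python
# Hedged sketch: group stage (recall) requests by tape, mount the busiest tape
# first, and read each tape in on-tape order to avoid costly seeks and remounts.
from collections import defaultdict


def schedule_recalls(stage_requests):
    """stage_requests: iterable of dicts with 'path', 'tape', 'position'."""
    by_tape = defaultdict(list)
    for req in stage_requests:
        by_tape[req["tape"]].append(req)

    schedule = []
    for tape, reqs in sorted(by_tape.items(), key=lambda kv: -len(kv[1])):
        schedule.append((tape, sorted(reqs, key=lambda r: r["position"])))
    return schedule


if __name__ == "__main__":
    pending = [
        {"path": "/data/run1/a.root", "tape": "VR0001", "position": 120},
        {"path": "/data/run2/b.root", "tape": "VR0002", "position": 10},
        {"path": "/data/run1/c.root", "tape": "VR0001", "position": 5},
    ]
    for tape, batch in schedule_recalls(pending):
        print(tape, [r["path"] for r in batch])
```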


BMC Genomics ◽  
2020 ◽  
Vol 21 (S10) ◽  
Author(s):  
Tanveer Ahmad ◽  
Nauman Ahmed ◽  
Zaid Al-Ars ◽  
H. Peter Hofstee

Background: Immense improvements in sequencing technologies make it possible to produce large amounts of high-throughput, cost-effective next-generation sequencing (NGS) data, which needs to be processed efficiently for further downstream analyses. Computing systems need these large amounts of data close to the processor (with low latency) for fast and efficient processing, but existing workflows depend heavily on disk storage and access, and processing this data incurs huge disk I/O overheads. Previously, due to the cost, volatility, and other physical constraints of DRAM, it was not feasible to place large working data sets in memory. Recent developments in storage-class memory and non-volatile memory technologies, however, enable computing systems to hold huge data sets in memory and process them directly, avoiding disk I/O bottlenecks. To exploit such memory systems efficiently, data must be placed in memory in a suitable format and accessed at high throughput while avoiding (de)serialization and copy overheads between processes. For this purpose, we use the newly developed Apache Arrow, a cross-language development framework that provides a language-independent columnar in-memory data format for efficient in-memory big-data analytics. This allows genomics applications developed in different programming languages to communicate in memory without having to access disk storage, avoiding (de)serialization and copy overheads.

Implementation: We integrate an Apache Arrow in-memory Sequence Alignment/Map (SAM) format and its shared-memory object-store library into widely used genomics high-throughput data processing applications such as BWA-MEM, Picard, and GATK to allow in-memory communication between these applications. This also lets us exploit the cache locality of tabular data and parallel processing through shared-memory objects.

Results: Our implementation shows that adopting the in-memory SAM representation in genomics high-throughput data processing applications results in better system resource utilization, fewer memory accesses due to high cache-locality exploitation, and parallel scalability due to shared-memory objects. Our implementation focuses on the GATK best-practices workflows for germline analysis on whole-genome sequencing (WGS) and whole-exome sequencing (WES) data sets. We compare a number of existing in-memory data placement and sharing techniques, such as ramDisk and Unix pipes, and show how the columnar in-memory data representation outperforms both. We achieve speedups of 4.85x and 4.76x for WGS and WES data, respectively, in the overall execution time of variant-calling workflows, and speedups of 1.45x and 1.27x for these data sets, respectively, compared to the second-fastest workflow. In some individual tools, particularly sorting, duplicate removal, and base quality score recalibration, the speedup is even more pronounced.

Availability: The code and scripts used in our experiments are available in both container and repository form at: https://github.com/abs-tudelft/ArrowSAM.
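As a rough illustration of the in-memory columnar idea (assuming the pyarrow package; this is not the ArrowSAM code), a few SAM fields can be held as an Arrow table and shared between processes through a memory-mapped Arrow IPC file, giving readers zero-copy access without (de)serialization.

```python
# Illustrative sketch: columnar SAM-like fields as an Arrow table, shared via
# a memory-mapped Arrow IPC file. Field subset and file name are hypothetical.
import pyarrow as pa

sam = pa.table({
    "QNAME": ["read1", "read2"],
    "FLAG":  pa.array([99, 147], type=pa.int32()),
    "RNAME": ["chr1", "chr1"],
    "POS":   pa.array([10468, 10500], type=pa.int64()),
    "MAPQ":  pa.array([60, 60], type=pa.int8()),
    "CIGAR": ["100M", "100M"],
})

# Writer process: dump the table in Arrow IPC format.
with pa.OSFile("sam.arrow", "wb") as sink:
    with pa.ipc.new_file(sink, sam.schema) as writer:
        writer.write_table(sam)

# Reader process: memory-map the file and read it back without copying buffers.
with pa.memory_map("sam.arrow", "r") as source:
    shared = pa.ipc.open_file(source).read_all()

print(shared.column("POS"))
```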


Geophysics ◽  
2020 ◽  
Vol 85 (3) ◽  
pp. S151-S167
Author(s):  
Zabihollah Khaksar ◽  
George A. McMechan

A 2D algorithm for angle-domain common-image gather (CIG) calculation is extended and modified to produce 3D elastic angle and azimuth CIGs. The elastic seismic data are propagated with the elastic particle displacement wave equation, and then the PP-reflected and PS-converted waves are separated by divergence and curl calculations during application of the excitation-time imaging condition. The incident angles and azimuths are calculated using source propagation directions and the reflector normals. The source propagation direction vector is computed as the spatial gradient of the incident 3C P-wavefield. The vector normal to the reflector is calculated using the Hilbert transform. Ordering the migrated images with respect to incident angles for a fixed azimuth bin, or with respect to azimuths for a fixed incident angle bin, creates angle- or azimuth-domain CIGs, respectively. Sorting the azimuth gathers by the incident angle bins causes a shift to a greater depth for too-high migration velocity and to a smaller depth for too-low migration velocity. For the sorted incident angle gathers, the velocity-dependent depth moveout is within the angle gathers and across the azimuth gathers. This method is compared with three other 3D CIG algorithms with respect to the number of calculations and their disk storage and RAM requirements; it is three to six orders of magnitude faster and requires two to three orders of magnitude less disk space. The method is successfully tested with data for a modified part of the SEG/EAGE overthrust model.
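The geometric step of turning a source propagation direction and a reflector normal into angle/azimuth bin indices can be sketched as below. The conventions, bin widths, and function names are illustrative assumptions, not the paper's implementation.

```python
# Hedged numpy sketch: incident angle from the propagation direction and the
# reflector normal, azimuth from the horizontal projection of the direction,
# then binning to index an angle/azimuth-domain CIG.
import numpy as np


def angle_and_azimuth(direction, normal):
    d = np.asarray(direction, float)
    n = np.asarray(normal, float)
    d /= np.linalg.norm(d)
    n /= np.linalg.norm(n)
    # Incident angle: angle between the propagation direction and the normal.
    theta = np.degrees(np.arccos(np.clip(abs(np.dot(d, n)), -1.0, 1.0)))
    # Azimuth: orientation of the horizontal (x, y) projection of the direction.
    phi = np.degrees(np.arctan2(d[1], d[0])) % 360.0
    return theta, phi


def bin_indices(theta, phi, dtheta=5.0, dphi=15.0):
    return int(theta // dtheta), int(phi // dphi)


if __name__ == "__main__":
    theta, phi = angle_and_azimuth(direction=[0.3, 0.4, 0.87], normal=[0.0, 0.0, 1.0])
    print(f"incidence {theta:.1f} deg, azimuth {phi:.1f} deg ->", bin_indices(theta, phi))
```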


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2159 ◽  
Author(s):  
Sung Hoon Baek ◽  
Ki-Woong Park

Flash-based storage is considered the de facto storage medium for sustainable Internet of Things (IoT) platforms in harsh environments because of its relatively fast speed and operational stability compared to disk storage. Although flash is considerably faster than disk-based mechanical storage devices, its read and write latency still cannot catch up with that of random-access memory (RAM). RAM could therefore be used as a storage device or system for time-critical IoT applications. Despite these advantages, a RAM-based storage system is limited in its use for sustainable IoT devices because of its volatility. As a remedy, this paper presents a durable hybrid RAM disk enhanced with a new read interface. The proposed durable hybrid RAM disk is designed for sustainable IoT devices that require not only high read/write performance but also data durability. It includes two performance-improvement schemes: rapid resilience with fast initialization and direct byte read (DBR). Rapid resilience with fast initialization shortens the long booting time required to initialize the durable hybrid RAM disk. The new read interface, DBR, enables the durable hybrid RAM disk to bypass the disk cache, which is an overhead for RAM-based storage. DBR performs byte-range I/O, whereas direct I/O requires block-range I/O; it therefore provides a more efficient interface than direct I/O. The presented schemes and device were implemented in the Linux kernel. Experimental evaluations were performed using various benchmarks from the block level to the file level. In workloads where reads and writes were mixed, the durable hybrid RAM disk showed 15 times better performance than a solid-state drive (SSD).
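The difference between DBR-style byte-range reads and block-range direct I/O can be illustrated with a small sketch: a direct read of an arbitrary byte span must be widened to block boundaries and trimmed afterwards, while a byte-range interface returns exactly the requested bytes. The constants and helper below are illustrative only, not the paper's kernel interface.

```python
# Illustrative sketch: emulate block-range (direct I/O style) reads and show
# how much extra data they transfer compared to an exact byte-range read.
BLOCK = 4096


def block_aligned_read(read_block, offset, length):
    """Read whole blocks covering [offset, offset + length), then trim."""
    start = (offset // BLOCK) * BLOCK             # round down to a block boundary
    end = -(-(offset + length) // BLOCK) * BLOCK  # round up to a block boundary
    data = b"".join(read_block(b) for b in range(start, end, BLOCK))
    wasted = (end - start) - length               # bytes read but not requested
    return data[offset - start:offset - start + length], wasted


if __name__ == "__main__":
    backing = bytes(range(256)) * 1024            # fake 256 KiB device image

    def read_block(block_offset):
        return backing[block_offset:block_offset + BLOCK]

    data, wasted = block_aligned_read(read_block, offset=5000, length=100)
    print(len(data), "bytes returned,", wasted, "bytes read but discarded")
```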

