Influence of different factors on the RAID 0 paired magnetic disk arrays

Author(s):  
Nikola Davidović ◽  
Slobodan Obradović ◽  
Borislav Đorđević ◽  
Valentina Timčenko ◽  
Bojan Škorić

Rapid technological progress has led to a growing need for data storage space. The emergence of big data requires larger storage space, faster access and exchange of data, as well as data security. RAID (Redundant Array of Independent Disks) technology is one of the most cost-effective ways to satisfy the needs for larger storage space, data access, and protection. Combining multiple secondary memory devices in RAID 0 improves the secondary memory system by providing greater storage capacity and increasing both read and write speeds; however, it is neither fault-tolerant nor error-free. This paper analyzes a system for storing data on paired arrays of magnetic disks in a RAID 0 formation with different numbers of queue entries for overlapped I/O, where the queue depth parameter takes the values 1 and 4. The paper presents a range of test results and analyses for the RAID 0 series under defined workload characteristics. The tests were carried out on the Microsoft Windows Server 2008 R2 Standard operating system, using 2, 3, 4, and 6 paired magnetic disks controlled by a Dell PERC 6/i hardware RAID controller. ATTO Disk Benchmark was used to obtain the measurement results. The obtained results were analyzed and compared to the expected behavior.
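
The striping that gives RAID 0 its throughput gains can be illustrated with a short address-mapping sketch in Python; the stripe size and disk count below are illustrative assumptions, not values taken from the paper.

    # Minimal sketch of RAID 0 striping: a logical block address is mapped to a
    # (disk, block-within-disk) pair by round-robin distribution across the array.
    # The stripe size and disk count are assumed, not taken from the paper.

    STRIPE_BLOCKS = 128        # blocks per stripe unit (assumed)
    NUM_DISKS = 4              # e.g. one of the tested array sizes (2, 3, 4, 6)

    def map_logical_block(lba: int, num_disks: int = NUM_DISKS,
                          stripe_blocks: int = STRIPE_BLOCKS) -> tuple[int, int]:
        """Return (disk_index, block_on_disk) for a logical block address."""
        stripe_index = lba // stripe_blocks          # which stripe unit the block falls in
        offset_in_stripe = lba % stripe_blocks       # position inside that stripe unit
        disk_index = stripe_index % num_disks        # round-robin across member disks
        stripe_on_disk = stripe_index // num_disks   # stripe units preceding it on that disk
        return disk_index, stripe_on_disk * stripe_blocks + offset_in_stripe

    if __name__ == "__main__":
        for lba in (0, 127, 128, 512, 1000):
            print(lba, "->", map_logical_block(lba))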

2021 ◽  
pp. 1063293X2199201
Author(s):  
Anto Praveena M.D. ◽  
Bharathi B

Duplication of data in an application becomes an expensive factor. Replicated data needs to be checked and, where necessary, removed from the dataset, as it occupies a huge volume of storage space. The cloud is the main destination for data storage, and many organizations have already started to move their datasets into the cloud because of its cost effectiveness, storage space, data security, and data privacy. In the healthcare sector, storing duplicated records leads to wrong predictions, and when many users upload the same files, storage demand grows. To address these issues, this paper proposes an Optimal Removal of Deduplication (ORD) scheme for heart disease data using a hybrid trust-based neural network algorithm. In the ORD scheme, the Chaotic Whale Optimization (CWO) algorithm is used for trust computation of data using multiple decision metrics. The computed trust values and the nature of the data are sequentially applied to the training process of the Mimic Deep Neural Network (MDNN), which classifies whether the data is a duplicate or not. Hence, duplicate files are identified and removed from data storage. Finally, simulations evaluate the proposed MDNN-based model, and the results show the effectiveness of the ORD scheme in terms of data duplication removal. The simulation results indicate that the model's accuracy, sensitivity, and specificity were good.
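
The overall ORD flow, trust features computed per record and fed to a classifier that flags duplicates, can be sketched as follows; the features, weights, and threshold are illustrative assumptions and do not reproduce the paper's CWO or MDNN designs.

    # Schematic sketch of a duplicate-detection flow in the spirit of ORD:
    # (1) compute simple "trust" features per record, (2) feed them to a tiny
    # classifier, (3) drop records flagged as duplicates. The features, weights,
    # and threshold are illustrative, not the paper's CWO/MDNN design.
    import hashlib
    import math

    def record_fingerprint(record: dict) -> str:
        """Content hash used as one duplicate signal."""
        return hashlib.sha256(repr(sorted(record.items())).encode()).hexdigest()

    def trust_features(record: dict, seen: set) -> list[float]:
        fp = record_fingerprint(record)
        exact_match = 1.0 if fp in seen else 0.0
        completeness = sum(v is not None for v in record.values()) / len(record)
        return [exact_match, completeness]

    def is_duplicate(features: list[float],
                     weights=(4.0, -1.0), bias=-1.0, threshold=0.5) -> bool:
        """Toy logistic classifier standing in for the trained neural network."""
        z = bias + sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z)) > threshold

    def deduplicate(records: list[dict]) -> list[dict]:
        seen, kept = set(), []
        for rec in records:
            if not is_duplicate(trust_features(rec, seen)):
                kept.append(rec)
            seen.add(record_fingerprint(rec))
        return kept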


Author(s):  
Bharat Tidke ◽  
Rupa Mehta ◽  
Dipti Rana ◽  
Hullash Jangir

Social media data (SMD) is driven by statistical and analytical technologies to obtain information for various decisions. SMD is vast and evolutionary in nature, which makes traditional data warehouses ill-suited. This research aims to propose and implement a novel framework that analyzes tweet data from an online social networking site (OSN; i.e., Twitter). The authors fetch streaming tweets from the Twitter API using Apache Flume to detect clusters of users having similar sentiment. The proposed approach utilizes a scalable and fault-tolerant system (i.e., Hadoop) that typically harnesses HDFS for data storage and the map-reduce paradigm for data processing. Apache Hive is used on top of Hadoop for querying the data. Experiments are performed to test the scalability of the proposed framework by examining various sizes of data. The authors' goal is to handle big social data effectively using cost-effective tools for fetching and querying unstructured data and algorithms for analyzing scalable, uninterrupted data streams with finite memory and resources.
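
The kind of aggregation the framework runs on Hadoop/Hive can be mimicked with a small in-memory, map-reduce-style pass in Python; the word lists and scoring rule are assumptions, not the authors' sentiment algorithm.

    # Illustrative, in-memory map-reduce-style pass that groups tweets by coarse
    # sentiment, mimicking the kind of aggregation run on Hadoop/Hive at scale.
    # The word lists and scoring rule are assumptions, not the authors' algorithm.
    from collections import defaultdict

    POSITIVE = {"good", "great", "love", "happy"}
    NEGATIVE = {"bad", "hate", "sad", "awful"}

    def map_tweet(tweet: str) -> tuple[str, str]:
        """Map phase: emit (sentiment_label, tweet)."""
        words = set(tweet.lower().split())
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        return label, tweet

    def reduce_by_sentiment(tweets: list[str]) -> dict[str, list[str]]:
        """Reduce phase: collect tweets under each sentiment key."""
        clusters = defaultdict(list)
        for tweet in tweets:
            label, text = map_tweet(tweet)
            clusters[label].append(text)
        return dict(clusters)

    if __name__ == "__main__":
        sample = ["I love this!", "awful service, very sad", "just a tweet"]
        for label, group in reduce_by_sentiment(sample).items():
            print(label, len(group))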


Author(s):  
Jan Stender ◽  
Michael Berlin ◽  
Alexander Reinefeld

Cloud computing poses new challenges to data storage. While cloud providers use shared distributed hardware, which is inherently unreliable and insecure, cloud users expect their data to be safely and securely stored, available at any time, and accessible in the same way as their locally stored data. In this chapter, the authors present XtreemFS, a file system for the cloud. XtreemFS reconciles the need of cloud providers for cheap scale-out storage solutions with that of cloud users for reliable, secure, and easy data access. The main contributions of the chapter are: a description of the internal architecture of XtreemFS, which presents an approach to building large-scale distributed POSIX-compliant file systems on top of cheap, off-the-shelf hardware; a description of the XtreemFS security infrastructure, which guarantees isolation of individual users despite shared and insecure storage and network resources; a comprehensive overview of replication mechanisms in XtreemFS, which guarantee consistency, availability, and durability of data in the face of component failures; and an overview of the snapshot infrastructure of XtreemFS, which makes it possible to capture and freeze momentary states of the file system in a scalable and fault-tolerant fashion. The authors also compare XtreemFS with existing solutions and argue for its practicability and potential in the cloud storage market.
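
As a conceptual illustration of how replication tolerates component failures, the following sketch accepts a write only when a majority of replicas acknowledge it; this is a generic quorum example, not XtreemFS's actual lease-based replication protocol.

    # Generic majority-quorum write sketch to illustrate how replication can keep
    # data durable despite individual replica failures. Conceptual illustration
    # only; it is not XtreemFS's actual replication protocol.
    import random

    class Replica:
        def __init__(self, name: str, failure_rate: float = 0.2):
            self.name, self.failure_rate, self.data = name, failure_rate, {}

        def write(self, key: str, value: bytes) -> bool:
            if random.random() < self.failure_rate:   # simulated component failure
                return False
            self.data[key] = value
            return True

    def replicated_write(replicas: list[Replica], key: str, value: bytes) -> bool:
        """Succeed only if a majority of replicas acknowledge the write."""
        acks = sum(r.write(key, value) for r in replicas)
        return acks > len(replicas) // 2

    if __name__ == "__main__":
        cluster = [Replica(f"osd{i}") for i in range(3)]
        print("write accepted:", replicated_write(cluster, "/volume/file", b"payload"))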


2020 ◽  
Vol 245 ◽  
pp. 04008
Author(s):  
Andreas-Joachim Peters ◽  
Michal Kamil Simon ◽  
Elvin Alin Sindrilaru

The storage group of CERN IT operates more than 20 individual EOS [1] storage services with a raw data storage volume of more than 340 PB. Storage space is a major cost factor in HEP computing, and the planned future LHC Runs 3 and 4 will increase storage space demands by at least an order of magnitude. A cost-effective storage model providing durability is Erasure Coding (EC) [2]. The decommissioning of CERN's remote computer center (Wigner/Budapest) allows a reconsideration of the currently configured dual-replica strategy, where EOS provides one replica in each computer center. EOS allows one to configure EC on a per-file basis and exposes four different redundancy levels with single, dual, triple, and fourfold parity to select different qualities of service at variable cost. This paper highlights tests that have been performed to migrate files on a production instance from dual-replica to various EC profiles. It discusses performance and operational impact, and highlights various policy scenarios to select the best file layout with respect to IO patterns, file age, and file size. We conclude with the current status and future optimizations, an evaluation of cost savings, and a discussion of an erasure-coded EOS setup as a possible tape storage replacement.
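
The cost argument for erasure coding can be seen from a back-of-the-envelope comparison of raw-space overhead for dual replication versus EC(k, p) layouts; the (k, p) choices and the logical data volume below are illustrative assumptions, not the instance's actual configuration.

    # Back-of-the-envelope comparison of raw-space overhead for dual replication
    # versus erasure coding with k data and p parity stripes. The (k, p) choices
    # and logical volume below are illustrative; EOS's concrete layouts may differ.
    def overhead_factor(k_data: int, p_parity: int) -> float:
        """Raw bytes stored per logical byte for an EC(k, p) layout."""
        return (k_data + p_parity) / k_data

    def raw_petabytes(logical_pb: float, factor: float) -> float:
        return logical_pb * factor

    if __name__ == "__main__":
        logical_pb = 170.0                      # assumed logical data volume
        layouts = {
            "dual replica": 2.0,
            "EC(4,2)": overhead_factor(4, 2),   # 1.500x
            "EC(8,3)": overhead_factor(8, 3),   # 1.375x
            "EC(10,4)": overhead_factor(10, 4), # 1.400x
        }
        for name, f in layouts.items():
            print(f"{name:13s} overhead {f:.3f}x -> {raw_petabytes(logical_pb, f):7.1f} PB raw")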


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Kyle J. Tomek ◽  
Kevin Volkel ◽  
Elaine W. Indermaur ◽  
James M. Tuck ◽  
Albert J. Keung

DNA holds significant promise as a data storage medium due to its density, longevity, and resource and energy conservation. These advantages arise from the inherent biomolecular structure of DNA, which differentiates it from conventional storage media. The unique molecular architecture of DNA storage also prompts important discussions on how data should be organized, accessed, and manipulated and what practical functionalities may be possible. Here we leverage thermodynamic tuning of biomolecular interactions to implement useful data access and organizational features. Specific sets of environmental conditions including distinct DNA concentrations and temperatures were screened for their ability to switchably access either all DNA strands encoding full image files from a GB-sized background database or subsets of those strands encoding low-resolution, File Preview, versions. We demonstrate File Preview with four JPEG images and provide an argument for the substantial and practical economic benefit of this generalizable strategy to organize data.
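
A toy model of the condition-dependent access described above: each strand carries a notional binding stability, and the retrieval stringency decides whether only the preview subset or the full file set is recovered; all numbers are invented for illustration and do not reflect the experimental conditions.

    # Toy model of condition-dependent retrieval: each strand carries a notional
    # hybridization stability; lower stringency recovers all strands of a file,
    # while higher stringency recovers only the strongly binding "preview" subset.
    # All numbers are invented for illustration, not experimental values.
    from dataclasses import dataclass

    @dataclass
    class Strand:
        file_id: str
        payload_index: int
        stability: float      # notional binding stability (arbitrary units)
        is_preview: bool

    def retrieve(pool: list[Strand], file_id: str, stringency: float) -> list[Strand]:
        """Return strands of a file whose stability exceeds the access stringency."""
        return [s for s in pool if s.file_id == file_id and s.stability >= stringency]

    if __name__ == "__main__":
        pool = [Strand("img1", i, 0.9 if i < 4 else 0.5, i < 4) for i in range(16)]
        print("preview strands:", len(retrieve(pool, "img1", stringency=0.8)))
        print("full-file strands:", len(retrieve(pool, "img1", stringency=0.3)))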


Author(s):  
Umakanta Mahanta ◽  
Bhabesh Chandra Mohanta ◽  
Anup Kumar Panda ◽  
Bibhu Prasad Panigrahi

Torque ripple reduction is one of the major challenges in switching-table-based direct torque control (DTC) while operating under open-phase faults of an induction motor, as the switching vectors are unevenly distributed. This can be minimized by increasing the inverter level and by using multi-phase motors. Fuzzy logic-based DTC is another solution to the above problem. In this paper, a comparative analysis is performed between switching-table-based DTC (ST-DTC) and fuzzy logic-based DTC to improve performance during open-phase faults of a five-phase induction motor. The results show that in fuzzy logic-based DTC with a two-level inverter, the torque ripple is reduced by 5.164% as compared with ST-DTC with a three-level inverter. The fuzzy logic-based DTC with the three-level inverter also gives better performance compared with fuzzy logic-based DTC with the two-level inverter. The current ripple is also reduced by 9.605% with respect to ST-DTC. Thus, fuzzy logic-based DTC is more suitable and cost-effective for open-phase fault-tolerant drives.
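
For reference, torque ripple is often quantified as the peak-to-peak deviation relative to the mean torque; the paper does not state its exact metric, so the following calculation is only an illustrative assumption.

    # Illustrative torque-ripple calculation. A common definition is the
    # peak-to-peak deviation relative to the average torque; the paper's exact
    # metric is not stated, so treat this formula as an assumption.
    import math

    def torque_ripple_percent(samples: list[float]) -> float:
        avg = sum(samples) / len(samples)
        return (max(samples) - min(samples)) / avg * 100.0

    if __name__ == "__main__":
        # Synthetic steady-state torque waveform: 10 N·m mean with a small ripple.
        waveform = [10.0 + 0.4 * math.sin(2 * math.pi * k / 50) for k in range(500)]
        print(f"torque ripple: {torque_ripple_percent(waveform):.3f} %")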

