Modified Raptor Code for Distributed Storage Systems

2010 ◽  
Vol 20-23 ◽  
pp. 52-57
Author(s):  
Zheng Chen ◽  
Xiao Jing Wang

We provide a prototype system for a distributed storage model. The goal of this system is to store information in a p2p network of n nodes so that the original information can later be recovered, in a computationally simple way, from (1+e)k of the nodes for some small e>0. To solve this problem, we employ a class of B-J codes with dimension 2 as the base code and obtain a new class of LDPC codes by using q-tuples to substitute the elements of , in order to modify and improve the Raptor code. The improved Raptor code has many advantages, such as a high decoding rate and flexibility in the choice of code length and rate. Owing to these characteristics, the storage system based on the modified Raptor code shows a great improvement in decoding probability and parameter flexibility; a simulation is also presented to support this claim.

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Xiangli Chang ◽  
Hailang Cui

With the increasing popularity of Internet-based services and of services hosted on cloud platforms, a more powerful back-end storage system is needed to support them. At present, it is very difficult or impossible to implement a distributed storage system that meets all the above assumptions. Research therefore focuses on restricting different characteristics in order to design distributed storage solutions for different usage scenarios. Economic big data has the basic requirements of high storage efficiency and fast retrieval speed. The large number of small files and the diversity of file types pose severe challenges for the storage and retrieval of economic big data. This paper addresses the application requirements of cross-modal analysis of economic big data. Based on the sources and characteristics of economic big data, the data types are analyzed, and the database storage architecture and data storage structure are designed. Taking into account the spatial, temporal, and semantic characteristics of economic big data, this paper proposes a unified coding method based on a multilevel division strategy for spatiotemporal data that combines Geohash and Hilbert with spatiotemporal semantic constraints. A prototype system was built on MongoDB; it implements the data storage management functions and was used to verify the performance of the proposed multilevel partition algorithm. A Wiener distributed memory, based on the principle of the Wiener filter, is used to store the workload of each distributed storage window in a distributed manner. For distributed storage workloads, this article adopts specific types of workloads. According to its periodicity, the workload is divided into distributed storage windows of a specific duration. At the beginning of each distributed storage window, distributed storage is distributed to the next distributed storage window. Experiments and tests have verified the proposed distributed storage strategy and show that the Wiener distributed storage solution can save platform resources and configuration costs while meeting the Service Level Agreement (SLA).
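The Geohash component mentioned in this abstract can be illustrated with a minimal sketch. It shows only the standard, publicly documented Geohash encoding (interleaved longitude/latitude bisection, base-32 output). It is not the paper's unified coding method, which also involves Hilbert ordering and semantic constraints that the abstract does not specify. The function name `geohash_encode` is hypothetical.

```python
# Standard Geohash base-32 alphabet (omits the letters a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    """Encode a latitude/longitude pair as a Geohash of `precision` characters.

    Hypothetical helper for illustration; it implements plain Geohash only.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, nbits = 0, 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid  # keep the upper half of the longitude interval
            else:
                bits <<= 1
                lon_hi = mid  # keep the lower half
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, nbits = 0, 0

    return "".join(chars)
```

Because each additional character refines the same cell, a shorter Geohash is always a prefix of a longer one for the same point. This prefix property is what makes such codes convenient as multilevel partition keys.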


2014 ◽  
Vol 513-517 ◽  
pp. 1046-1051
Author(s):  
Yong Chuan Li ◽  
Yu Xing Peng ◽  
Hui Ba Li

With the rapid development of cloud computing, many storage structures have been proposed to satisfy the requirements of cloud-based software. Most existing distributed storage systems focus on a single objective and provide only one storage structure. In this paper we present a novel block-level distributed storage system named Flex, which integrates storage resources dispersed across the network into a single whole. Flex uses a device mapping framework to create dynamic and flexible storage structures for users. We have implemented a prototype and evaluated its performance; the results show that Flex can provide high performance across diverse storage structures.


2016 ◽  
Vol 4 (1) ◽  
Author(s):  
Agus Maman Abadi ◽  
Musthofa Musthofa ◽  
Emut Emut

The increasing need for techniques for storing big data presents a new challenge. One way to address this challenge is the use of distributed storage systems. One strategy implemented in distributed data storage systems is the use of erasure codes applied to network coding. The code used in this technique is based on the algebraic structure known as a vector space. Some studies have also been carried out to create codes based on other algebraic structures, such as modules. In this study, we attempt to construct a code based on a generalization of the module, namely the semimodule, by utilizing the max and sum operations of max-plus algebra. The results of this study indicate that the max operation and the addition operation of max-plus algebra cannot be used to establish a semimodule code, but by modifying the operation "+" to "min", we obtain a code based on a semimodule. Keywords: code, distributed storage systems, network coding, semimodule, max plus algebra
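For readers unfamiliar with max-plus algebra, the following minimal sketch shows its two operations in their standard form: "addition" is max, with negative infinity as the neutral element, and "multiplication" is ordinary addition, with 0 as the neutral element. It illustrates the algebra only, not the code construction studied in the paper. The helper names `oplus`, `otimes`, and `maxplus_matvec` are hypothetical.

```python
NEG_INF = float("-inf")  # neutral element of max-plus "addition"


def oplus(a: float, b: float) -> float:
    """Max-plus 'addition': a (+) b = max(a, b)."""
    return max(a, b)


def otimes(a: float, b: float) -> float:
    """Max-plus 'multiplication': a (x) b = a + b."""
    return a + b


def maxplus_matvec(A, x):
    """Matrix-vector product over max-plus: y_i = max_j (A[i][j] + x[j])."""
    result = []
    for row in A:
        acc = NEG_INF
        for a_ij, x_j in zip(row, x):
            acc = oplus(acc, otimes(a_ij, x_j))
        result.append(acc)
    return result


A = [[0.0, 1.0],
     [2.0, NEG_INF]]
x = [3.0, 4.0]
y = maxplus_matvec(A, x)  # [max(0+3, 1+4), max(2+3, -inf+4)]
```

In ordinary arithmetic the same product would use sums of products; here every sum becomes a max and every product becomes a sum.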


Author(s):  
Rossano Gaeta ◽  
Marco Grangetto

In coding-based distributed storage systems (DSSs), a set of storage nodes (SNs) hold coded fragments of a data unit that collectively allow one to recover the original information. It is well known that data modification (a.k.a. pollution attack) is the Achilles’ heel of such coding systems; indeed, intentional modification of a single coded fragment has the potential to prevent the reconstruction of the original information because of error propagation induced by the decoding algorithm. The challenge we take in this work is to devise an algorithm to identify polluted coded fragments within the set encoding a data unit and to characterize its performance. To this end, we provide the following contributions: (i) We devise MIND (Malicious node IdeNtification in DSS), an algorithm that is general with respect to the encoding mechanism chosen for the DSS, it is able to cope with a heterogeneous allocation of coded fragments to SNs, and it is effective in successfully identifying polluted coded fragments in a low-redundancy scenario; (ii) We formally prove both MIND termination and correctness; (iii) We derive an accurate analytical characterization of MIND performance (hit probability and complexity); (iv) We develop a C++ prototype that implements MIND to validate the performance predictions of the analytical model. Finally, to show applicability of our work, we define performance and robustness metrics for an allocation of coded fragments to SNs and we apply the results of the analytical characterization of MIND performance to select coded fragments allocations yielding robustness to collusion as well as the highest probability to identify actual attackers.
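The error-propagation effect described in this abstract can be seen in a toy example. The sketch below assumes a simple XOR code with two data fragments and one parity fragment. It is an illustration only and does not implement MIND or the coding schemes considered by the authors. All function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(d1: bytes, d2: bytes) -> list:
    """Toy code: store d1, d2 and their XOR parity on three storage nodes."""
    return [d1, d2, xor_bytes(d1, d2)]


def recover_d1(d2_fragment: bytes, parity_fragment: bytes) -> bytes:
    """Rebuild d1 when its node is unavailable: d1 = d2 XOR parity."""
    return xor_bytes(d2_fragment, parity_fragment)


d1, d2 = b"HELLO", b"WORLD"
fragments = encode(d1, d2)

# Honest case: the decoder returns the original data unit.
clean = recover_d1(fragments[1], fragments[2])

# Pollution: a malicious node flips the bits of one byte of its fragment.
polluted = bytes([fragments[1][0] ^ 0xFF]) + fragments[1][1:]
corrupted = recover_d1(polluted, fragments[2])
```

The decoder has no way of knowing that `corrupted` is wrong: the modification in one stored fragment silently propagates into the reconstructed data, which is why polluted fragments need to be identified.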


2019 ◽  
Vol 214 ◽  
pp. 05008 ◽  
Author(s):  
Jozsef Makai ◽  
Andreas Joachim Peters ◽  
Georgios Bitzes ◽  
Elvin Alin Sindrilaru ◽  
Michal Kamil Simon ◽  
...  

Complex, large-scale distributed systems are frequently used to solve extraordinary computing, storage and other problems. However, the development of these systems usually requires working with several software components, maintaining and improving a large codebase, and providing a collaborative environment for many developers working together. The central role that such complex systems play in mission-critical tasks and in the daily activity of users means that any software bug affecting the availability of the service has far-reaching effects. Providing an easily extensible testing framework is a prerequisite for building confidence both in the system and among the developers who contribute to the code. The testing framework can address concrete bugs found in the codebase, thus avoiding future regressions, and also provides a high degree of confidence for the people contributing new code. Easily incorporating other people's work into the project greatly helps in scaling out manpower, so that having more developers contributing to the project actually results in more work being done rather than more bugs being added. In this paper we go through the case study of EOS, the CERN disk storage system, and introduce the methods and mechanisms for achieving fully automatic regression and robustness testing, along with continuous integration, for such a large-scale, complex and critical system using a container-based environment.


2018 ◽  
Vol 7 (2.8) ◽  
pp. 379
Author(s):  
D Sowmia ◽  
B Muruganantham

Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. This paper surveys cloud storage, a model of networked online storage in which data is stored in virtualized pools of storage that are generally hosted by third parties. Hosting companies operate large data centers, and people who require their data to be hosted buy or lease storage capacity from them. The data center operators, in the background, virtualize the resources according to the customer's requirements and expose them as storage pools, which customers can themselves use to store files or data objects. The data is stored across various locations; when the user wants to retrieve it, this can be done with any of the encryption methods. Finally, based on existing techniques, promising future research directions are suggested.


2017 ◽  
Vol 45 (1) ◽  
pp. 51-51
Author(s):  
Wen Sun ◽  
Véronique Simon ◽  
Sébastien Monnet ◽  
Philippe Robert ◽  
Pierre Sens
