Storage Resource Broker: Generic Software Infrastructure for Managing Globally Distributed Data

Author(s):  
R.W. Moore ◽  
M. Wan ◽  
A. Rajasekar

Information ◽
2018 ◽  
Vol 9 (11) ◽  
pp. 286 ◽  
Author(s):  
Yonggen Gu ◽  
Dingding Hou ◽  
Xiaohong Wu ◽  
Jie Tao ◽  
Yanqiong Zhang

Distributed data storage has received increasing attention due to its advantages in reliability, availability and scalability, and it brings both opportunities and challenges for trading distributed storage resources. Traditional storage-resource transaction systems generally run in a centralized mode, which results in high cost, vendor lock-in and a single point of failure. To overcome these shortcomings, and considering a storage policy with erasure coding, this paper proposes a decentralized transaction method for cloud storage based on a smart contract that takes the resource cost of distributed data storage into account. First, to guarantee availability and decrease storage cost, a reverse Vickrey-Clarke-Groves (VCG) auction mechanism is proposed for storage resource selection and transaction. Then we deploy and implement the proposed mechanism by designing a corresponding smart contract. In particular, we address the problem of how to implement a VCG-like mechanism in a blockchain environment. We simulate the proposed storage transaction method on a private Ethereum chain. The simulation results show that the proposed transaction model can realize competitive trading of storage resources and ensure the safe and economical operation of resource trading.
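The abstract does not spell out the authors' exact auction rules, so the following is only a minimal sketch of how a reverse VCG auction for erasure-coded storage could select providers and set payments, assuming k identical shards, one ask per provider, and unit supply. Under these assumptions the k lowest asks win and each winner is paid the (k+1)-th lowest ask, which is the VCG payment and makes truthful bidding a dominant strategy.

```python
from typing import List, Tuple

def reverse_vcg_auction(asks: List[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Select k storage providers via a reverse VCG auction (sketch).

    asks: (provider_id, ask_price) pairs, one ask per provider.
    k:    number of erasure-coded shards to place (assumption).

    The k lowest asks win; each winner is paid the (k+1)-th lowest ask,
    i.e. the price of the first excluded provider, which equals the
    externality the winner imposes on the rest of the market.
    """
    if len(asks) <= k:
        raise ValueError("need more than k providers for a competitive price")
    ranked = sorted(asks, key=lambda a: a[1])  # cheapest first
    winners = ranked[:k]
    clearing_price = ranked[k][1]              # (k+1)-th lowest ask
    return [(pid, clearing_price) for pid, _ in winners]

# Example: place 3 shards among 5 providers.
asks = [("p1", 5.0), ("p2", 3.0), ("p3", 8.0), ("p4", 4.0), ("p5", 6.0)]
print(reverse_vcg_auction(asks, k=3))
# -> [('p2', 6.0), ('p4', 6.0), ('p1', 6.0)]  each winner paid the 4th-lowest ask
```

Truthful bidding is dominant here because a winner's payment never depends on its own ask; the on-chain version described in the paper additionally has to handle bid confidentiality and escrowed payments inside the smart contract, which this off-chain sketch omits.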


2021 ◽  
pp. 157-165
Author(s):  
Anatoliy Gorbenko ◽  
Andrii Karpenko ◽  
Olga Tarasyuk

The concept of distributed replicated NoSQL data storages such as Cassandra, HBase and MongoDB was proposed to effectively manage Big Data sets whose volume, velocity and variability are difficult to handle with traditional relational database management systems. Tradeoffs between consistency, availability, partition tolerance and latency are intrinsic to such systems. Although the relations between these properties have been identified by the well-known CAP and PACELC theorems in qualitative terms, it is still necessary to quantify how different consistency settings, deployment patterns and other properties affect system performance. This experience report analyzes the performance of a Cassandra NoSQL database cluster and studies the tradeoff between data consistency guarantees and performance in distributed data storages. The primary focus is on investigating the quantitative interplay between Cassandra's response time, throughput and consistency settings across different single- and multi-region deployment scenarios. The study uses the YCSB benchmarking framework and reports the results of read and write performance tests of a three-replicated Cassandra cluster deployed on Amazon AWS. We also put forward a notation that formally describes the distributed deployment of a Cassandra cluster and its nodes relative to each other and to a client application. We present quantitative results showing how different consistency settings and deployment patterns affect Cassandra performance under different workloads. In particular, our experiments show that strong consistency costs up to 22% of performance for a centralized Cassandra cluster deployment and can cause a 600% increase in read/write request latency when Cassandra replicas and their clients are globally distributed across different AWS Regions.
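As a concrete illustration of the consistency settings the experiments vary, the sketch below uses the DataStax Python driver to issue the same read at two consistency levels against a three-replica cluster. The contact points are placeholders, and the ycsb keyspace, usertable table and y_id key column follow YCSB's conventional Cassandra schema rather than anything stated in the abstract.

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Hypothetical contact points; the paper's clusters ran on AWS instances.
cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
session = cluster.connect("ycsb")  # keyspace name is an assumption

# Weak consistency: the coordinator returns after one replica answers.
weak_read = SimpleStatement(
    "SELECT field0 FROM usertable WHERE y_id = %s",
    consistency_level=ConsistencyLevel.ONE,
)

# Strong consistency: with replication factor 3, QUORUM waits for 2
# replicas, so quorum reads paired with quorum writes give R + W > N.
strong_read = SimpleStatement(
    "SELECT field0 FROM usertable WHERE y_id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)

row = session.execute(strong_read, ("user42",)).one()
```

Switching a workload from ONE to QUORUM on a three-replica deployment is exactly the kind of per-request knob whose latency and throughput cost the reported experiments quantify.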


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 686 ◽  
Author(s):  
Ferdinando Villa ◽  
Stefano Balbi ◽  
Ioannis N. Athanasiadis ◽  
Caterina Caracciolo

Correct and reliable linkage of independently produced information is a prerequisite for sophisticated applications and processing workflows. These can ultimately help address the challenges posed by complex systems (such as socio-ecological systems), whose many components can only be described through independently developed data and model products. We discuss the first outcomes of an investigation into the conceptual and methodological aspects of semantic annotation of data and models, aimed at enabling a high standard of interoperability of information. The results, operationalized in the context of a long-term, active, large-scale project on ecosystem services assessment, include: a definition of interoperability based on semantics and scale; a conceptual foundation for the phenomenology underlying scientific observations, intended to guide the practice of semantic annotation in domain communities; and a dedicated language and software infrastructure that operationalizes the findings and allows practitioners to reap the benefits of data and model interoperability. The work presented is the first detailed description of almost a decade of work with communities active in socio-ecological system modeling. After defining the boundaries of possible interoperability based on the understanding of scale, we discuss examples of the practical use of the findings to obtain consistent, interoperable and machine-ready semantic specifications that can integrate semantics across diverse domains and disciplines.
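The central idea, that two information artifacts are interoperable only when they share both semantics and a mediable scale, can be made concrete with a deliberately simplified sketch; the classes, the ontology-style URIs and the fixed mediation factor below are illustrative assumptions, not the dedicated language the paper describes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scale:
    """Spatio-temporal context of an observation (illustrative)."""
    spatial_resolution_m: float   # grid cell size in metres
    temporal_resolution_d: float  # time step in days

@dataclass(frozen=True)
class Annotation:
    """Semantic annotation = observable concept + the scale it is observed at."""
    observable: str  # e.g. an ontology URI for the observed concept
    scale: Scale

def interoperable(a: Annotation, b: Annotation, factor: float = 10.0) -> bool:
    """Two artifacts can be linked only if they observe the same concept
    and their spatial scales fall within a tolerable mediation range
    (the fixed factor stands in for real scale-mediation rules)."""
    same_concept = a.observable == b.observable
    close_scale = (
        max(a.scale.spatial_resolution_m, b.scale.spatial_resolution_m)
        <= factor * min(a.scale.spatial_resolution_m, b.scale.spatial_resolution_m)
    )
    return same_concept and close_scale

rainfall_model = Annotation("ex:atmosphere:Precipitation", Scale(1000.0, 1.0))
rain_gauge_data = Annotation("ex:atmosphere:Precipitation", Scale(100.0, 1.0))
print(interoperable(rainfall_model, rain_gauge_data))  # True: same concept, mediable scales
```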

