Globally Distributed Data

A concept of distributed replicated NoSQL data storages Cassandra-like, HBase, MongoDB has been proposed to effectively manage Big Data set whose volume, velocity and variability are difficult to deal with by using the traditional Relational Database Management Systems. Tradeoffs between consistency, availability, partition tolerance and latency is intrinsic to such systems. Although relations between these properties have been previously identified by the well-known CAP and PACELC theorems in qualitative terms, it is still necessary to quantify how different consistency settings, deployment patterns and other properties affect system performance.This experience report analysis performance of the Cassandra NoSQL database cluster and studies the tradeoff between data consistency guaranties and performance in distributed data storages. The primary focus is on investigating the quantitative interplay between Cassandra response time, throughput and its consistency settings considering different single- and multi-region deployment scenarios. The study uses the YCSB benchmarking framework and reports the results of the read and write performance tests of the three-replicated Cassandra cluster deployed in the Amazon AWS. In this paper, we also put forward a notation which can be used to formally describe distributed deployment of Cassandra cluster and its nodes relative to each other and to a client application. We present quantitative results showing how different consistency settings and deployment patterns affect Cassandra performance under different workloads. In particular, our experiments show that strong consistency costs up to 22 % of performance in case of the centralized Cassandra cluster deployment and can cause a 600 % increase in the read/write requests if Cassandra replicas and its clients are globally distributed across different AWS Regions.

Download Full-text

International Geosphere and Biosphere Program (IGBP) metadata information system for globally distributed data sets

Proceedings Thirteenth IEEE Symposium on Mass Storage Systems. Toward Distributed Storage and Data Management Systems ◽

10.1109/mass.1994.373016 ◽

2002 ◽

Author(s):

S.I. Rasool ◽

L. Andres

Keyword(s):

Information System ◽

Data Sets ◽

Distributed Data ◽

Globally Distributed

Download Full-text

Synoptic analysis of globally-distributed data sets of cosmogenic-nuclide exposure ages

10.5194/egusphere-egu21-3513 ◽

2021 ◽

Author(s):

Greg Balco

Keyword(s):

Global Analysis ◽

Glacier Change ◽

Data Sets ◽

Distributed Data ◽

Data Set ◽

Cosmogenic Nuclide ◽

Synoptic Analysis ◽

Exposure Age ◽

Exposure Ages ◽

Globally Distributed

<p>This abstract describes a project to make large data sets of cosmogenic-nuclide measurements useable for synoptic global analysis of paleoclimate, glacier change, and landscape change. It is based on the 'ICE-D' (Informal Cosmogenic-nuclide Exposure-age Database), a transparent-middle-layer infrastructure for compiling and storing cosmogenic-nuclide measurements and generating internally consistent exposure-age data. The prototype implementation of this project focuses on a global data set of exposure ages from glacial deposits that are, potentially, useful for synoptic analysis of glacier change and paleoclimate. The aim is to address a number of messy data-management and analysis problems associated with cosmogenic-nuclide data, thus making it possible to apply unbiased, automated quantitative analysis to the entire globally-distributed data set. The presentation will highlight (i) examples of error-tolerant hypothesis testing using this approach; (ii) means of quantifying the importance of the details of cosmogenic-nuclide production-rate calculations to global paleoclimate inferences, and (iii) likewise, approaches to understanding the importance of geomorphic processes and landform evolution to global paleoclimate inferences drawn from exposure-dated landforms.</p>

Download Full-text

The bounds of the distributed data-intensive computing systems

Pollack Periodica ◽

10.1556/pollack.2.2007.s.8 ◽

2007 ◽

Vol 2 (Supplement 1) ◽

pp. 85-96 ◽

Cited By ~ 1

Author(s):

Antal Buza

Keyword(s):

Distributed Data ◽

Data Intensive Computing ◽

Computing Systems ◽

Data Intensive

Download Full-text

Security Test by using F T M and Data Allocation Strategies on Leakage Detection

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v4i2b1.3227 ◽

2005 ◽

Vol 4 (2) ◽

pp. 393-400

Author(s):

Pallavali Radha ◽

G. Sireesha

Keyword(s):

Third Party ◽

Distributed Data ◽

Data Allocation ◽

Security Testing ◽

Sensitive Data ◽

Sample Data ◽

The One ◽

Security Test ◽

Data Request ◽

Allocation Strategies

The data distributors work is to give sensitive data to a set of presumably trusted third party agents.The data i.e., sent to these third parties are available on the unauthorized places like web and or some ones systems, due to data leakage. The distributor must know the way the data was leaked from one or more agents instead of as opposed to having been independently gathered by other means. Our new proposal on data allocation strategies will improve the probability of identifying leakages along with Security attacks typically result from unintended behaviors or invalid inputs. Â Due to too many invalid inputs in the real world programs is labor intensive about security testing.The most desirable thing is to automate or partially automate security-testing process. In this paper we represented Predicate/ Transition nets approach for security tests automated generationby using formal threat models to detect the agents using allocation strategies without modifying the original data.The guilty agent is the one who leaks the distributed data. To detect guilty agents more effectively the idea is to distribute the data intelligently to agents based on sample data request and explicit data request. The fake object implementation algorithms will improve the distributor chance of detecting guilty agents.

Download Full-text

Activity of public control entities and development of distributed computing and distributed data storage systems

Journal of Law and Administration ◽

10.24833/2073-8420-2018-1-46-14-22 ◽

2018 ◽

pp. 14-22

Author(s):

D. V. Gribanov

Keyword(s):

Distributed Computing ◽

Data Storage ◽

Storage Systems ◽

Legal Regulation ◽

Distributed Data ◽

Distributed Data Storage ◽

Public Control ◽

Blockchain Technology ◽

Legal Method ◽

Digital Assets

Introduction. This article is devoted to legal regulation of digital assets turnover, utilization possibilities of distributed computing and distributed data storage systems in activities of public authorities and entities of public control. The author notes that some national and foreign scientists who study a “blockchain” technology (distributed computing and distributed data storage systems) emphasize its usefulness in different activities. Data validation procedure of digital transactions, legal regulation of creation, issuance and turnover of digital assets need further attention.Materials and methods. The research is based on common scientific (analysis, analogy, comparing) and particular methods of cognition of legal phenomena and processes (a method of interpretation of legal rules, a technical legal method, a formal legal method and a formal logical one).Results of the study. The author conducted an analysis which resulted in finding some advantages of the use of the “blockchain” technology in the sphere of public control which are as follows: a particular validation system; data that once were entered in the system of distributed data storage cannot be erased or forged; absolute transparency of succession of actions while exercising governing powers; automatic repeat of recurring actions. The need of fivefold validation of exercising governing powers is substantiated. The author stresses that the fivefold validation shall ensure complex control over exercising of powers by the civil society, the entities of public control and the Russian Federation as a federal state holding sovereignty over its territory. The author has also conducted a brief analysis of judicial decisions concerning digital transactions.Discussion and conclusion. The use of the distributed data storage system makes it easier to exercise control due to the decrease of risks of forge, replacement or termination of data. The author suggests defining digital transaction not only as some actions with digital assets, but also as actions toward modification and addition of information about legal facts with a purpose of its establishment in the systems of distributed data storage. The author suggests using the systems of distributed data storage for independent validation of information about activities of the bodies of state authority. In the author’s opinion, application of the “blockchain” technology may result not only in the increase of efficiency of public control, but also in the creation of a new form of public control – automatic control. It is concluded there is no legislation basis for regulation of legal relations concerning distributed data storage today.

Download Full-text

Towards a distributed data fusion pipeline for pedestrian behaviour analysis

10.26226/morressier.5e4fe9bd6bc493207536f766 ◽

2020 ◽

Author(s):

Sara Ferreira ◽

Thiago RPM Rúbioa ◽

João Jacob ◽

Henrique Lopes Cardoso ◽

Daniel Castro Silva ◽

...

Keyword(s):

Data Fusion ◽

Distributed Data ◽

Behaviour Analysis ◽

Distributed Data Fusion

Download Full-text

Uji Komparasi Abnormal Return, Trading Volume, Trading Frequency, Dan Bid-Ask Spread Sebelum Dan Sesudah Share Split

Jurnal ULTIMA Accounting ◽

10.31937/akuntansi.v8i2.580 ◽

2016 ◽

Vol 8 (2) ◽

pp. 24-45

Author(s):

Tania Hayu Safira ◽

Febryanti Simon

Keyword(s):

Abnormal Return ◽

Trading Volume ◽

Stock Exchange ◽

Distributed Data ◽

Trading Frequency ◽

Bid Ask Spread ◽

Analysis Technique ◽

Before And After ◽

Test Of Hypothesis ◽

Kolmogorov Smirnov

This study is event study that was conduct to examine the differences of abnormal return, trading volume, trading frequency and bid-ask spread before and after the events of share split. The object of this research is the companies that did share split and listed in Indonesia Stock Exchange in 2008 - 2015. The samples are 30 companies chosen by purposive sampling method. The criteria are the company did not do corporate action right issue, pre-emptive rights, a share dividend and bonus shares in the same year with share split. Event window used in this study was 30 days consisting of 15 days before and 15 days after the share split. Data analysis technique begins with a test of normality using Kolmogorov – Smirnov and transform for unnormally distributed data. Then, test of hypothesis using Paired t – test to compare the differences before and after share split. The results of this study showed that volume trading activity and trading frequency had significant differences before and after the share split. While, variable abnormal return and bid-ask spread had not significant differences before and after the share split. Keywords: Abnormal return, bid-ask spread, share split, trading frequency, trading volume.

Download Full-text

Information processing systems. Fibre Distributed Data Interface (FDDI)

10.3403/00496875u ◽

2015 ◽

Keyword(s):

Information Processing ◽

Distributed Data

Download Full-text

Globally Distributed Data

Storage Resource Broker; Generic Software Infrastructure for Managing Globally Distributed Data

Performance evaluation of various deployment scenarios of the 3-replicated Cassandra NoSQL cluster on AWS

International Geosphere and Biosphere Program (IGBP) metadata information system for globally distributed data sets

Synoptic analysis of globally-distributed data sets of cosmogenic-nuclide exposure ages

The bounds of the distributed data-intensive computing systems

Security Test by using F T M and Data Allocation Strategies on Leakage Detection

Activity of public control entities and development of distributed computing and distributed data storage systems

Towards a distributed data fusion pipeline for pedestrian behaviour analysis

Uji Komparasi Abnormal Return, Trading Volume, Trading Frequency, Dan Bid-Ask Spread Sebelum Dan Sesudah Share Split

Information processing systems. Fibre Distributed Data Interface (FDDI)

Export Citation Format