Performance evaluation of various deployment scenarios of the 3-replicated Cassandra NoSQL cluster on AWS

2021 ◽  
pp. 157-165
Author(s):  
Anatoliy Gorbenko ◽  
Andrii Karpenko ◽  
Olga Tarasyuk

The concept of distributed replicated NoSQL data stores such as Cassandra, HBase and MongoDB has been proposed to effectively manage Big Data sets whose volume, velocity and variability are difficult to deal with using traditional Relational Database Management Systems. Tradeoffs between consistency, availability, partition tolerance and latency are intrinsic to such systems. Although relations between these properties have been previously identified by the well-known CAP and PACELC theorems in qualitative terms, it is still necessary to quantify how different consistency settings, deployment patterns and other properties affect system performance. This experience report analyses the performance of the Cassandra NoSQL database cluster and studies the tradeoff between data consistency guarantees and performance in distributed data stores. The primary focus is on investigating the quantitative interplay between Cassandra response time, throughput and its consistency settings, considering different single- and multi-region deployment scenarios. The study uses the YCSB benchmarking framework and reports the results of read and write performance tests of the three-replicated Cassandra cluster deployed in Amazon AWS. In this paper, we also put forward a notation which can be used to formally describe the distributed deployment of a Cassandra cluster and its nodes relative to each other and to a client application. We present quantitative results showing how different consistency settings and deployment patterns affect Cassandra performance under different workloads. In particular, our experiments show that strong consistency costs up to 22 % of performance in the case of the centralized Cassandra cluster deployment and can cause a 600 % increase in the read/write requests if Cassandra replicas and its clients are globally distributed across different AWS Regions.
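The consistency settings discussed in this abstract can be illustrated with the standard quorum rule for replicated stores: a read is guaranteed to see the latest acknowledged write when the number of replicas contacted on read (R) plus the number contacted on write (W) exceeds the replication factor (N). The following is a minimal sketch of that rule for a three-replica cluster, not the authors' benchmark code; the function names `replicas_required` and `is_strongly_consistent` are hypothetical and introduced only for illustration.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must acknowledge a request at a given consistency level."""
    levels = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": replication_factor // 2 + 1,  # majority of replicas
        "ALL": replication_factor,
    }
    return levels[level]


def is_strongly_consistent(read_level: str, write_level: str,
                           replication_factor: int = 3) -> bool:
    """Apply the quorum rule R + W > N (hypothetical helper, for illustration)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(read_level, write_level, is_strongly_consistent(read_level, write_level))
```

Stronger settings require more replicas to respond before a request completes, which is one reason their cost is expected to grow when replicas are spread across distant regions.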

10.28945/2245 ◽  
2015 ◽  
Author(s):  
Robert Thomas Mason

NoSQL databases are an important component of Big Data for storing and retrieving large volumes of data. Traditional Relational Database Management Systems (RDBMS) rely on the ACID properties for data consistency, whereas NoSQL databases use a non-transactional approach called BASE. RDBMS scale vertically, while NoSQL databases can scale both horizontally (sharding) and vertically. Four types of NoSQL databases are Document-oriented, Key-Value Pairs, Column-oriented and Graph. Data modeling for Document-oriented databases is similar to data modeling for traditional RDBMS during the conceptual and logical modeling phases. However, for a physical data model, entities can be combined (denormalized) by using embedding. What was once called a foreign key in a traditional RDBMS is now called a reference in a Document-oriented NoSQL database.
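The difference between embedding and referencing can be sketched with plain Python dictionaries standing in for documents. This is an illustrative assumption, not an example from the paper; the collections, field names and helper functions below are hypothetical.

```python
# Referenced design: the order stores only the customer's identifier,
# much like a foreign key in an RDBMS.
customers = {"c1": {"_id": "c1", "name": "Ada"}}
order_referenced = {"_id": "o1", "customer_id": "c1", "total": 42.0}

# Embedded (denormalized) design: the customer data lives inside the order,
# so one document read returns everything.
order_embedded = {"_id": "o1", "customer": {"name": "Ada"}, "total": 42.0}


def customer_name_referenced(order: dict, customer_collection: dict) -> str:
    """Resolve the reference with a second lookup."""
    return customer_collection[order["customer_id"]]["name"]


def customer_name_embedded(order: dict) -> str:
    """Read the embedded sub-document directly."""
    return order["customer"]["name"]


if __name__ == "__main__":
    print(customer_name_referenced(order_referenced, customers))
    print(customer_name_embedded(order_embedded))
```

Embedding avoids the second lookup at the price of duplicating customer data across orders, while referencing keeps a single copy that must be resolved on read.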


2020 ◽  
Vol 245 ◽  
pp. 04005 ◽  
Author(s):  
Michel Hernández Villanueva ◽  
Ikuo Ueda

The Belle II experiment, a major upgrade of the previous e+e− asymmetric collider experiment Belle, is expected to produce tens of petabytes of data per year due to the luminosity increase from the upgraded SuperKEKB accelerator. The distributed computing system of the Belle II experiment plays a key role, storing and distributing data in a reliable way to be easily accessed and analyzed by more than 1000 collaborators. In particular, the Belle II Raw Data Management system has been developed with an aim to upload output files onto grid storage, register them into the file and metadata catalogs, and make two replicas of the full raw data set using the Belle II Distributed Data Management system. It has been implemented as an extension of DIRAC (Distributed Infrastructure with Remote Agent Control) and consists of a database, services, client and monitoring tools, and several agents that treat the data automatically. The first year of data taken with the Belle II full detector has been managed by the Belle II Raw Data Management system successfully. The design, current status, and performance are presented. Prospects for improvements towards the full luminosity data taking are also reviewed.


2021 ◽  
Author(s):  
Greg Balco

This abstract describes a project to make large data sets of cosmogenic-nuclide measurements useable for synoptic global analysis of paleoclimate, glacier change, and landscape change. It is based on the 'ICE-D' (Informal Cosmogenic-nuclide Exposure-age Database), a transparent-middle-layer infrastructure for compiling and storing cosmogenic-nuclide measurements and generating internally consistent exposure-age data. The prototype implementation of this project focuses on a global data set of exposure ages from glacial deposits that are, potentially, useful for synoptic analysis of glacier change and paleoclimate. The aim is to address a number of messy data-management and analysis problems associated with cosmogenic-nuclide data, thus making it possible to apply unbiased, automated quantitative analysis to the entire globally-distributed data set. The presentation will highlight (i) examples of error-tolerant hypothesis testing using this approach; (ii) means of quantifying the importance of the details of cosmogenic-nuclide production-rate calculations to global paleoclimate inferences, and (iii) likewise, approaches to understanding the importance of geomorphic processes and landform evolution to global paleoclimate inferences drawn from exposure-dated landforms.


Author(s):  
Navaldeep Kaur ◽  
Lesley K. Fellows ◽  
Marie-Josée Brouillette ◽  
Nancy Mayo

Abstract Objectives: In the neuroHIV literature, cognitive reserve has most often been operationalized using education, occupation, and IQ. The effects of other cognitively stimulating activities that might be more amenable to interventions have been little studied. The purpose of this study was to develop an index of cognitive reserve in people with HIV, combining multiple indicators of cognitively stimulating lifetime experiences into a single value. Methods: The data set was obtained from a Canadian longitudinal study (N = 856). Potential indicators of cognitive reserve captured at the study entry included education, occupation, engagement in six cognitively stimulating activities, number of languages spoken, and social resources. Cognitive performance was measured using a computerized test battery. A cognitive reserve index was formulated using logistic regression weights. For the evidence on concurrent and predictive validity of the index, the measures of cognition and self-reported everyday functioning were each regressed on the index scores at study entry and at the last follow-up [mean duration: 25.9 months (SD 7.2)], respectively. Corresponding regression coefficients and 95% confidence intervals (CIs) were computed. Results: Professional sports [odds ratio (OR): 2.9; 95% CI 0.59–14.7], visual and performance arts (any level of engagement), professional/amateur music, complex video gaming and competitive games, and travel outside North America were associated with higher cognitive functioning. The effects of cognitive reserve on the outcomes at the last follow-up visit were closely similar to those at study entry. Conclusion: This work contributes evidence toward the relative benefit of engaging in specific cognitively stimulating life experiences in HIV.


2014 ◽  
Vol 52 (5) ◽  
pp. 897-915 ◽  
Author(s):  
Yan Chen ◽  
Yiwei Jiang ◽  
Chengqi Wang ◽  
Wen Chung Hsu

Purpose – The purpose of this paper is to examine how firm resources and diversification strategy explain the performance consequences of internationalization of emerging market enterprises. Design/methodology/approach – The paper conducts a regression analysis using a novel panel data set comprising 685 listed Chinese firms over the period 2008-2011. Findings – The results show that the relationship between internationalization and performance is inverse U-shaped. Further, marketing resources play a greater role in enhancing the performance effects of internationalization than technological resources do. Related product diversification enhances the performance effects, while unrelated product diversification does the contrary. Research limitations/implications – The study focuses on listed firms in one country, and as a result, the findings cannot be generalized to non-listed firms and firms in other countries. Practical implications – This paper offers guidelines for international managers to improve the performance of internationalization by developing a particular type of resources and diversification strategy. Originality/value – This paper extends the literature on the functional form of the internationalization-performance relationship, and further suggests that the analysis of the performance consequences of internationalization should go beyond the nexus between internationalization and performance and focus on firm-specific resources and strategies that may facilitate or constrain the performance effects of internationalization.


2015 ◽  
Vol 42 (12) ◽  
pp. 1071-1089
Author(s):  
Alan Chan ◽  
Bruce G. Fawcett ◽  
Shu-Kam Lee

Purpose – Church giving and attendance are two important indicators of church health and performance. In the literature, they are usually understood to be simultaneously determined. The purpose of this paper is to estimate whether there is a sustainable church congregation size using Wintrobe’s (1998) dictatorship model. The authors also examine the impact of youth and adult ministry. Design/methodology/approach – Using data collected from Canadian Baptist churches in Eastern Canada, this study investigates the factors affecting the level of the two indicators by the panel-instrumental variable technique. Applying Wintrobe’s (1998) political economy model of dictatorship, the equilibrium level of worship attendance and giving is predicted. Findings – Through various simulation exercises, the actual church congregation sizes are found to be approximately 50 percent of the predicted value, implying inefficiency and misallocation of church resources. The paper concludes with insights on effective ways church leaders can allocate scarce resources to promote growth within churches. Originality/value – To the authors’ knowledge, they are the only researchers to have obtained permission from the Atlantic Canada Baptist Convention to use its mega data set on church giving and congregation sizes. The authors also apply a theoretical model of dictatorship to religious/not-for-profit organizations.


2021 ◽  
pp. 1-45
Author(s):  
Benjamin Leard ◽  
Joshua Linn ◽  
Yichen Christy Zhou

Abstract During historical periods in which US fuel economy standards were unchanging, automakers increased performance but not fuel economy, contrasting with recent periods of tightening standards and rising fuel economy. This paper evaluates the welfare consequences of automakers forgoing performance increases to raise fuel economy as standards have tightened since 2012. Using a unique data set and a novel approach to account for fuel economy and performance endogeneity, we find undervaluation of fuel cost savings and high valuation of performance. Welfare costs of forgone performance approximately equal expected fuel savings benefits, suggesting approximately zero net private consumer benefit from tightened standards.


2018 ◽  
Vol 19 (5) ◽  
pp. 915-934 ◽  
Author(s):  
Gianluca Ginesti ◽  
Adele Caldarelli ◽  
Annamaria Zampella

Purpose – The purpose of this paper is to analyse the impact of intellectual capital (IC) on the reputation and performance of Italian companies. Design/methodology/approach – The paper exploits a unique data set of 452 non-listed companies that obtained a reputational assessment from the Italian Competition Authority (ICA). To test the hypotheses, this study implemented several regression analyses. Findings – Results support the argument that human capital efficiency is a key driver of corporate reputation. Findings also reveal that companies which obtained a reputational rating under ICA scrutiny show a positive relationship between IC elements and various measures of financial performance. Research limitations/implications – The study focuses on a single country and is not free from the imprecisions of Pulic’s VAIC model. Practical implications – First, this paper recommends that companies interested in achieving a robust reputation consider human capital as a strategic intangible asset. Second, the results suggest that companies with an ICA reputational rating are able to leverage their intangibles to enhance performance and competitiveness. Originality/value – This is the first empirical investigation of the contribution of IC in generating value for corporate reputation. Additionally, the study contributes to the literature on the link between IC and performance by examining a sample of firms not yet explored in prior research.


2019 ◽  
Vol 23 (1) ◽  
pp. 41-62 ◽  
Author(s):  
Valentina Ndou ◽  
Giovanni Schiuma ◽  
Giuseppina Passiante

Purpose – The creative process through which territorial resources, knowledge and culture are used, exploited and configured to match needs and to achieve congruence with the changing business environment has become crucial for competitiveness. This is even more relevant for the economies of developing countries, which continuously struggle to reap the benefits of globalisation and to grasp new opportunities for competitiveness. As such, this paper aims to concentrate on the dynamic perspectives of the creative economy of countries by distinguishing between potentialities and performance. The paper tackles the influence that creative capacities might have on the performance of countries. Design/methodology/approach – The methodology consists of identifying creative economy indicators from a diverse data set of the World Economic Forum and dividing them into potential and performance indicators. Findings – The data reveal that good progress is being made and emphasis is being placed on increasing the level of creativity; however, the Balkan countries still lag in their capacity to boost innovation. Practical implications – The paper provides a new focus of research on creativity measurement that is significant for understanding what creative capacities territories possess and their ability to make proficient use of them for growth and innovation. Originality/value – This paper proposes a new operational framework for measuring and interpreting creative economy indicators by identifying not only indicators that gauge the potentialities of a country, but also indicators that are linked with the performance dimension, as well as the relationship amongst them.


Author(s):  
Daniel Lukic ◽  
Jonas Eberle ◽  
Jana Thormann ◽  
Carolus Holzschuh ◽  
Dirk Ahrens

DNA-barcoding and DNA-based species delimitation are major tools in DNA taxonomy. Sampling has been a central debate in this context, because the geographical composition of samples affects the accuracy and performance of DNA-barcoding. Performance of complex DNA-based species delimitation is to be tested under simpler conditions in the absence of geographic sampling bias. Here, we present an empirical data set sampled from a single locality in a Southeast-Asian biodiversity hotspot (Laos: Phou Pan mountain). We investigate the performance of various species delimitation approaches on a megadiverse assemblage of herbivorous chafer beetles (Coleoptera: Scarabaeidae) to infer whether species delimitation suffers in the same way from exaggerated infraspecific variation despite the lack of geographic genetic variation that led to inconsistencies between entities from DNA-based and morphology-based species inference in previous studies. For this purpose, a 658 bp fragment of the mitochondrial cytochrome c oxidase subunit 1 (cox1) was analysed for a total of 186 individuals of 56 morphospecies. Tree-based and distance-based species delimitation methods were used. All approaches showed a rather limited match ratio (max. 77%) with morphospecies. PTP and TCS predominantly over-split morphospecies, while 3% clustering and ABGD also lumped several species into one entity. ABGD revealed the highest congruence between molecular operational taxonomic units (MOTUs) and morphospecies. Disagreements between morphospecies and MOTUs are discussed in the context of historically acquired geographic genetic differentiation, incomplete lineage sorting, and hybridization. The study once again highlights how important morphology still is for correctly interpreting the results of molecular species delimitation.

