HBASE Performance Analysis in Big Datasets Processing

TEM Journal ◽  
2021 ◽  
pp. 1051-1057
Author(s):  
Tsvetelina Mladenova ◽  
Yordan Kalmukov ◽  
Milko Marinov ◽  
Irena Valova

The term Big Data has gained popularity in recent years due to technological developments and the accumulation of data from various sources, mobile devices, and sensors. HBase is a distributed, open-source environment that uses available disk space efficiently, depending on the data it stores. It organizes data very differently from standard relational databases and works with both structured and unstructured data. This article describes our experience and research on how the execution time for inserting datasets and selecting data depends on the size of the data volumes, on the locations (nodes in the same or in different networks) from which the data are sent or retrieved, and on the selected data organization (especially RowKey design).
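To make the RowKey point above concrete, here is a minimal sketch of one commonly discussed design choice: salting a monotonically increasing key so that consecutive writes do not all sort next to each other. It is not taken from the article; the function names, the bucket count `NUM_BUCKETS`, and the key layout are assumptions chosen only for illustration, and no HBase client is involved.

```python
import hashlib
from collections import Counter

# Hypothetical number of salt buckets (for example, one per region server).
NUM_BUCKETS = 4


def sequential_key(timestamp_ms: int) -> bytes:
    """Zero-padded timestamp key; byte order equals insertion order."""
    return f"{timestamp_ms:013d}".encode()


def salted_key(timestamp_ms: int, buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix the sequential key with a hash-derived bucket number.

    The prefix is deterministic, so a reader can recompute it for a point
    lookup, but a range scan now has to visit every bucket.
    """
    base = sequential_key(timestamp_ms)
    digest = hashlib.sha256(base).digest()
    bucket = int.from_bytes(digest[:4], "big") % buckets
    return f"{bucket:02d}-".encode() + base


def bucket_histogram(keys):
    """Count salted keys per bucket (the first two bytes of each key)."""
    return Counter(key[:2] for key in keys)


if __name__ == "__main__":
    timestamps = range(1_600_000_000_000, 1_600_000_000_000 + 1000)
    salted = [salted_key(t) for t in timestamps]
    # Shows how the 1000 consecutive writes are spread over the buckets.
    print(bucket_histogram(salted))
```

The trade-off in this sketch is the usual one: sequential keys keep range scans cheap but concentrate inserts in one part of the key space, while salted keys spread inserts at the cost of more expensive scans.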


2018 ◽  
Vol 2018 ◽  
pp. 1-14 ◽  
Author(s):  
Qassim Nasir ◽  
Ilham A. Qasse ◽  
Manar Abu Talib ◽  
Ali Bou Nassif

Blockchain is a key technology that has the potential to decentralize the way we store, share, and manage information and data. One of the more recent blockchain platforms is Hyperledger Fabric, an open-source, permissioned blockchain introduced by IBM, first as Hyperledger Fabric v0.6 and then, in 2017, as Hyperledger Fabric v1.0. Although there are many blockchain platforms, there is no clear methodology for evaluating and assessing them in terms of aspects such as performance, security, and scalability. In addition, the new version of Hyperledger Fabric has never been evaluated against any other blockchain platform. In this paper, we first conduct a performance analysis of the two versions of Hyperledger Fabric, v0.6 and v1.0. The performance of the two platforms is assessed in terms of execution time, latency, and throughput by varying the workload in each platform up to 10,000 transactions. Second, we analyze the scalability of the two platforms by varying the number of nodes up to 20 in each platform. Overall, the results across all evaluation metrics (scalability, throughput, execution time, and latency) demonstrate that Hyperledger Fabric v1.0 consistently outperforms Hyperledger Fabric v0.6. However, under high-workload scenarios, the performance of Hyperledger Fabric v1.0 did not reach the level of current traditional database systems.
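The three metrics named above can be illustrated with a small, self-contained sketch. The definitions used here are common conventions and are an assumption, not necessarily the ones the paper applies: execution time is the span from the first submission to the last commit, latency is the mean per-transaction delay, and throughput is transactions per second over that span. `TxRecord` and `summarize` are hypothetical names.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TxRecord:
    """Timestamps (in seconds) for one transaction; a hypothetical record type."""
    submitted_at: float
    committed_at: float


def summarize(records):
    """Return execution time, mean latency, and throughput for a batch."""
    if not records:
        raise ValueError("at least one transaction record is required")
    first_submit = min(r.submitted_at for r in records)
    last_commit = max(r.committed_at for r in records)
    execution_time = last_commit - first_submit
    avg_latency = mean(r.committed_at - r.submitted_at for r in records)
    # Guard against a zero-length span (all timestamps identical).
    throughput = len(records) / execution_time if execution_time > 0 else float("inf")
    return {
        "execution_time_s": execution_time,
        "avg_latency_s": avg_latency,
        "throughput_tps": throughput,
    }


if __name__ == "__main__":
    batch = [TxRecord(0.0, 1.0), TxRecord(0.5, 2.0), TxRecord(1.0, 2.0)]
    print(summarize(batch))
```

Repeating such a summary for increasing batch sizes and node counts gives the kind of workload and scalability curves the abstract refers to.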



2019 ◽  
Author(s):  
Ziqi Li

NoSQL databases are open-source, schema-less, horizontally scalable and high-performance databases. These characteristics make them very different from relational databases, the traditional choice for spatial data. The four types of data stores in NoSQL databases (key-value store, document store, column store, and graph store) contribute to significant flexibility for a range of applications. NoSQL databases are well suited to handle typical challenges of big data, including volume, variety, and velocity. For these reasons, they are increasingly adopted by private industries and used in research. They have gained tremendous popularity in the last decade due to their ability to manage unstructured data (e.g. social media data).
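As a rough illustration of the four store types listed above, the sketch below models one made-up spatial record in each shape using plain Python structures. No real database is involved; the record, its coordinates, and all variable names are hypothetical and serve only to contrast the data models.

```python
import json

# A made-up point of interest used only for illustration.
record = {"id": "poi:1", "name": "Example Park", "lon": 10.0, "lat": 20.0}

# Key-value store: the value is opaque and can only be fetched by its key.
key_value_store = {record["id"]: json.dumps(record)}

# Document store: a nested document whose fields can be queried individually.
document_store = [
    {
        "_id": record["id"],
        "name": record["name"],
        "location": {"type": "Point", "coordinates": [record["lon"], record["lat"]]},
    }
]

# Column (wide-column) store: row key -> column family -> qualifier -> value.
column_store = {
    record["id"]: {
        "info": {"name": record["name"]},
        "geo": {"lon": record["lon"], "lat": record["lat"]},
    }
}

# Graph store: nodes plus explicit, typed edges between them.
graph_store = {
    "nodes": {"poi:1": {"name": record["name"]}, "poi:2": {"name": "Example Cafe"}},
    "edges": [("poi:1", "NEAR", "poi:2")],
}


def neighbors(graph, node_id):
    """Return the ids of nodes reachable from node_id by one outgoing edge."""
    return [dst for src, _label, dst in graph["edges"] if src == node_id]
```

The same record is reachable in each model, but by different access paths: by key, by field, by row key and column, or by following edges.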



2014 ◽  
Author(s):  
Wenkuang Wu ◽  
Xiaoguang Lu ◽  
Ben Cox ◽  
Guoqiang Li ◽  
Lihua Lin ◽  
...  




2015 ◽  
Vol 2015 ◽  
pp. 1-16 ◽  
Author(s):  
Ashwin Belle ◽  
Raghuram Thiagarajan ◽  
S. M. Reza Soroushmehr ◽  
Fatemeh Navidi ◽  
Daniel A. Beard ◽  
...  

The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has recently been applied to aid the process of care delivery and disease exploration. However, the adoption rate and research development in this space are still hindered by some fundamental problems inherent in the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image-, signal-, and genomics-based analytics. Recent research that targets the utilization of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field that could have a meaningful impact on healthcare delivery are also examined.




