Modeling and Evaluating the Effects of Big Data Storage Resource Allocation in Global Scale Cloud Architectures

The availability of powerful computing facilities with worldwide span, offering application scalability by means of cloud infrastructures, matches the resource needs that characterize Big Data applications. Elasticity of resources in the cloud enables application providers to achieve results in terms of complexity, performance and availability that were previously considered unaffordable, by means of proper resource management techniques and a savvy design of the underlying architecture and communication facilities. This paper presents an evaluation technique for the combined effects of cloud elasticity and a Big Data oriented data management layer on global scale cloud applications, by modeling the behavior of both typical in-memory and in-storage data management.

Computer Data Storage and Management Platform Based on Big Data

Journal of Physics Conference Series ◽

10.1088/1742-6596/2066/1/012022 ◽

2021 ◽

Vol 2066 (1) ◽

pp. 012022

Author(s):

Cheng Luo

Keyword(s):

Big Data ◽

Data Management ◽

Performance Improvement ◽

Data Storage ◽

Nonvolatile Memory ◽

Linear Structure ◽

Ring Structure ◽

Computer Data ◽

Management Platform ◽

A Performance

Due to the continuous development of information technology, data has increasingly become the core of the daily operation of enterprises and institutions and the main basis for decision-making. At the same time, with the development of networks, the storage and management of computer data has attracted more and more attention. Aiming at the common problems of computer data storage and management in practical work, this paper analyzes the object and content of data management, investigates the state of computer data storage and management in China over the past two years, and interviews and tests the data of programming in this design platform. The research results addressing these problems are then applied in practice. On the basis of big data, a storage and management platform is designed. The design adopts the CIRC tree, a B+ tree variant in which the linear node structure is changed into a ring structure, which greatly reduces the number of data persistence instructions and the performance overhead. The results show that, compared with the most advanced B+ tree design for nonvolatile memory, the CIRC tree achieves 3.1 times and 2.5 times performance improvement in reading and writing, respectively. Compared with the previous NV tree designed for nonvolatile memory, it has a performance improvement of 1.5 times, and a performance improvement of 8.4 times compared with the latest fast-fair. In the later stage, expanding the platform functions will support the analysis and construction of data-related storage and management functions and further improve data management capability.
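
The abstract above attributes the gain to turning a linear node layout into a ring. The following is a minimal sketch of that general idea only, not the paper's implementation: the class names `LinearNode` and `RingNode` are hypothetical, and counting slot moves as a stand-in for data persistence instructions is an assumption made here for illustration. A sorted linear node can only shift keys toward the tail on insert, while a sorted ring node can shift whichever side of the insertion point is shorter.

```python
import bisect


class LinearNode:
    """Sorted keys in a fixed array; inserts shift the tail one slot right."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.count = 0
        self.moves = 0  # slot moves, used here as a proxy for persisted writes

    def keys(self):
        return self.slots[:self.count]

    def insert(self, key):
        pos = bisect.bisect_left(self.slots, key, 0, self.count)
        for i in range(self.count, pos, -1):
            self.slots[i] = self.slots[i - 1]
            self.moves += 1
        self.slots[pos] = key
        self.count += 1


class RingNode:
    """Sorted keys in a circular array; inserts shift the shorter side."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.cap = capacity
        self.head = 0  # physical index of the smallest key
        self.count = 0
        self.moves = 0

    def _phys(self, logical):
        return (self.head + logical) % self.cap

    def keys(self):
        return [self.slots[self._phys(i)] for i in range(self.count)]

    def insert(self, key):
        # Assumes the node is not full (count < capacity).
        pos = bisect.bisect_left(self.keys(), key)
        if pos < self.count - pos:
            # Fewer keys before the insertion point: move the head back
            # one slot and shift those keys toward it.
            self.head = (self.head - 1) % self.cap
            for i in range(pos):
                self.slots[self._phys(i)] = self.slots[self._phys(i + 1)]
                self.moves += 1
        else:
            # Fewer (or equal) keys after the insertion point: shift the tail.
            for i in range(self.count, pos, -1):
                self.slots[self._phys(i)] = self.slots[self._phys(i - 1)]
                self.moves += 1
        self.slots[self._phys(pos)] = key
        self.count += 1


def compare(keys, capacity):
    """Insert the same keys into both layouts; return (linear, ring) moves."""
    linear, ring = LinearNode(capacity), RingNode(capacity)
    for k in keys:
        linear.insert(k)
        ring.insert(k)
    assert linear.keys() == ring.keys() == sorted(keys)
    return linear.moves, ring.moves
```

For example, `compare([8, 7, 6, 5, 4, 3, 2, 1], 8)` returns `(28, 0)`: inserting in descending order forces the linear node to shift every existing key each time, while the ring node only moves its head pointer. In this sketch the ring layout never needs more moves than the linear one for the same insertion sequence.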

Research on Key Technologies of Railway Master Data Management for Big Data Applications

Proceedings of the 2018 7th International Conference on Energy and Environmental Protection (ICEEP 2018) ◽

10.2991/iceep-18.2018.68 ◽

2018 ◽

Author(s):

Yifei Liu ◽

Ping Li ◽

Lianbao Yang

Keyword(s):

Big Data ◽

Data Management ◽

Big Data Applications ◽

Master Data ◽

Key Technologies ◽

Master Data Management

Architecture for Big Data Storage in Different Cloud Deployment Models

Research Anthology on Architectures, Frameworks, and Integration Strategies for Distributed and Cloud Computing ◽

10.4018/978-1-7998-5339-8.ch009 ◽

2021 ◽

pp. 178-208

Author(s):

Chandu Thota ◽

Gunasekaran Manogaran ◽

Daphne Lopez ◽

Revathi Sundarasekar

Keyword(s):

Cloud Computing ◽

Big Data ◽

Data Storage ◽

High Performance ◽

Data Services ◽

Big Data Applications ◽

Nosql Database ◽

Amazon Web Services ◽

Product Domains ◽

Scalable Database

Cloud Computing is a computing model that distributes computation over a resource pool. The need for a scalable database capable of expanding to accommodate growth has increased with the growth of data on the web. Well-known Cloud Computing vendors such as Amazon Web Services, Microsoft, Google, IBM and Rackspace offer cloud-based Hadoop and NoSQL database platforms to process Big Data applications. A variety of services are available that run on top of cloud platforms, freeing users from the need to deploy their own systems. Integrating Big Data with the various cloud deployment models is now a major concern for Internet companies, especially software and data services vendors that are just getting started themselves. This chapter proposes an efficient architecture for integration with comprehensive capabilities including real-time and bulk data movement, bi-directional replication, metadata management, high-performance transformation, data services and data quality for customer and product domains.

Performance Analysis of NoSQL and Relational Databases with CouchDB and MySQL for Application’s Data Storage

Applied Sciences ◽

10.3390/app10238524 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8524

Author(s):

Cornelia A. Győrödi ◽

Diana V. Dumşe-Burescu ◽

Doina R. Zmaranda ◽

Robert Ş. Győrödi ◽

Gianina A. Gabor ◽

...

Keyword(s):

Big Data ◽

Data Storage ◽

Relational Databases ◽

Database Systems ◽

Application Performance ◽

Big Data Applications ◽

Database Technology ◽

Important Challenge ◽

Big Data Application ◽

The Impact

With several types of database systems (relational and non-relational) now emerging, choosing the type of database system for storing large amounts of data in today’s big data applications has become an important challenge. In this paper, we aimed to provide a comparative evaluation of two popular open-source database management systems (DBMSs): MySQL as a relational DBMS and, more recently, as a non-relational DBMS, and CouchDB as a non-relational DBMS. This comparison was based on performance evaluation of CRUD (CREATE, READ, UPDATE, DELETE) operations for different amounts of data to show how these two databases could be modeled and used in an application and to highlight the differences in response time and complexity. The main objective of the paper was to make a comparative analysis of the impact that each specific DBMS has on application performance when carrying out CRUD requests. To perform the analysis and to ensure the consistency of tests, two similar applications were developed in Java, one using MySQL and the other using CouchDB; these applications were then used to evaluate the response times of each database technology for the same CRUD operations. Finally, the results of the analysis are discussed in detail and several conclusions are drawn. Advantages and drawbacks of each DBMS are outlined to support the choice of a specific type of DBMS for use in a big data application.

How does DICOM support big data management? Investigating its use in medical imaging community

Insights into Imaging ◽

10.1186/s13244-021-01081-8 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Marco Aiello ◽

Giuseppina Esposito ◽

Giulio Pagliari ◽

Pasquale Borrelli ◽

Valentina Brancato ◽

...

Keyword(s):

Big Data ◽

Medical Imaging ◽

Data Management ◽

Data Sharing ◽

Data Storage ◽

Big Data Analytics ◽

Software Tools ◽

Imaging Data ◽

Privacy Concerns ◽

Reading And Writing

The diagnostic imaging field is experiencing considerable growth, followed by increasing production of massive amounts of data. The lack of standardization and privacy concerns are considered the main barriers to big data capitalization. This work aims to verify whether the advanced features of the DICOM standard, beyond imaging data storage, are effectively used in research practice. This issue will be analyzed by investigating the publicly shared medical imaging databases and assessing how much the most common medical imaging software tools support DICOM in all its potential. Therefore, 100 public databases and ten medical imaging software tools were selected and examined using a systematic approach. In particular, the DICOM fields related to privacy, segmentation and reporting have been assessed in the selected database; software tools have been evaluated for reading and writing the same DICOM fields. From our analysis, less than a third of the databases examined use the DICOM format to record meaningful information to manage the images. Regarding software, the vast majority does not allow the management, reading and writing of some or all the DICOM fields. Surprisingly, if we observe chest computed tomography data sharing to address the COVID-19 emergency, there are only two datasets out of 12 released in DICOM format. Our work shows how the DICOM can potentially fully support big data management; however, further efforts are still needed from the scientific and technological community to promote the use of the existing standard, encouraging data sharing and interoperability for a concrete development of big data analytics.

Database Systems for Big Data Storage and Retrieval

Advances in Data Mining and Database Management - Handbook of Research on Big Data Storage and Visualization Techniques ◽

10.4018/978-1-5225-3142-5.ch003 ◽

2018 ◽

pp. 76-100 ◽

Cited By ~ 2

Author(s):

Venkat Gudivada ◽

Amy Apon ◽

Dhana L. Rao

Keyword(s):

Big Data ◽

Data Storage ◽

Database Systems ◽

Query Languages ◽

Research Issues ◽

Storage And Retrieval ◽

Emerging Trends ◽

Big Data Applications ◽

Data Application ◽

Big Data Application

Special needs of Big Data applications have ushered in several new classes of systems for data storage and retrieval. Each class targets the needs of a category of Big Data application. These systems differ greatly in their data models and system architecture, approaches used for high availability and scalability, query languages and client interfaces provided. This chapter begins with a description of the emergence of Big Data and data management requirements of Big Data applications. Several new classes of database management systems have emerged recently to address the needs of Big Data applications. NoSQL is an umbrella term used to refer to these systems. Next, a taxonomy for NoSQL systems is developed and several NoSQL systems are classified under this taxonomy. Characteristics of representative systems in each class are also discussed. The chapter concludes by indicating the emerging trends of NoSQL systems and research issues.

Efficient data management on 3D stacked memory for big data applications

2015 10th International Design & Test Symposium (IDT) ◽

10.1109/idt.2015.7396741 ◽

2015 ◽

Cited By ~ 2

Author(s):

Cheng Qian ◽

Libo Huang ◽

Peng Xie ◽

Nong Xiao ◽

Zhiying Wang

Keyword(s):

Big Data ◽

Data Management ◽

Big Data Applications ◽

Efficient Data ◽

Stacked Memory

Serverless Computing Platform for Big Data Storage

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c5471.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 6592-6595

Keyword(s):

Big Data ◽

Data Storage ◽

Cloud Storage ◽

Optimal Solution ◽

Cloud Provider ◽

Random Load ◽

Data Movement ◽

Computing Platform ◽

Big Data Applications ◽

Load Increase

This paper describes various challenges faced by Big Data cloud providers and by their users. It argues that Serverless computing is a feasible platform for the data storage of Big Data applications. The literature research undertaken focuses on various Serverless computing architectural designs, computational methodologies, performance, data movement and functions. The framework for Serverless cloud computing is discussed and its performance is tested for the metric of scaling in Serverless cloud storage for Big Data applications. The results of the analyses and their outcome are also discussed. They suggest that scaling Serverless cloud storage during random load increases is the optimal solution for both the cloud provider and the Big Data application user.

Sports Big Data: Management, Analysis, Applications, and Challenges

Complexity ◽

10.1155/2021/6676297 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Zhongbo Bai ◽

Xiaomei Bai

Keyword(s):

Big Data ◽

Data Analysis ◽

Data Management ◽

Rapid Growth ◽

Growth Trend ◽

Research Directions ◽

Research Issues ◽

Big Data Applications ◽

Rich Information ◽

Sports Data

With the rapid growth of information technology and sports, analyzing sports information has become an increasingly challenging issue. Sports big data come from the Internet and show a rapid growth trend. Sports big data contain rich information about athletes, coaches, athletics, and swimming. Nowadays, various sports data can be easily accessed, and powerful data analysis technologies have been developed, which enable us to further explore the value behind these data. In this paper, we first introduce the background of sports big data. Secondly, we review sports big data management such as sports big data acquisition, sports big data labeling, and improvement of existing data. Thirdly, we show sports data analysis methods, including statistical analysis, sports social network analysis, and sports big data analysis service platform. Furthermore, we describe the sports big data applications such as evaluation and prediction. Finally, we investigate representative research issues in sports big data areas, including predicting the athletes’ performance in the knowledge graph, finding a rising star of sports, unified sports big data platform, open sports big data, and privacy protections. This paper should help researchers obtain a broader understanding of sports big data and provide some potential research directions.

Software-Defined Networking for Scalable Cloud-based Services to Improve System Performance of Hadoop-based Big Data Applications

International Journal of Grid and High Performance Computing ◽

10.4018/ijghpc.2016040101 ◽

2016 ◽

Vol 8 (2) ◽

pp. 1-22 ◽

Cited By ~ 7

Author(s):

Desta Haileselassie Hagos

Keyword(s):

Big Data ◽

System Performance ◽

Dynamic Network ◽

Software Defined Networking ◽

Network Control ◽

Virtual Networks ◽

Big Data Applications ◽

New Research ◽

Cloud Infrastructures ◽

The Impact

The rapid growth of Cloud Computing has brought with it major new challenges in the automated manageability, dynamic network reconfiguration, provisioning, scalability and flexibility of virtual networks. OpenFlow-enabled Software-Defined Networking (SDN) alleviates these key challenges through the abstraction of lower level functionality that removes the complexities of the underlying hardware by separating the data and control planes. SDN offers efficient, dynamic, automated network management, higher availability and application provisioning through programmable interfaces, which are critical for flexible and scalable cloud-based services. In this study, the author explores broadly useful open technologies and methodologies for applying an OpenFlow-enabled SDN to scalable cloud-based services and a variety of diverse applications. The approach in this paper introduces new research challenges in the design and implementation of advanced techniques for bringing SDN-enabled components and big data applications into a cloud environment in a dynamic setting. Some of these challenges become pressing concerns to cloud providers when managing virtual networks and data centers, while others complicate the development and deployment of cloud-hosted applications from the perspective of developers and end users. However, the growing demand for manageable, scalable and flexible clouds necessitates that effective solutions to these challenges be found. Hence, through real-world research validation use cases, this paper aims at exploring useful mechanisms for the role and potential of an OpenFlow-enabled SDN and its direct benefit for scalable cloud-based services. Finally, it demonstrates the impact of an OpenFlow-enabled SDN that fully embraces the opportunities and challenges of cloud infrastructures to improve the system performance of Hadoop-based big data applications by utilizing the network control capabilities of OpenFlow to solve network congestion.
