The impact of hardware on database systems

Author(s):  
Patricia G. Selinger


2016 ◽  
pp. 1495-1518
Author(s):  
Mohammad Alaa Hussain Al-Hamami
Keyword(s):  
Big Data

Big Data comprises the systems and emerging techniques that organizations must adopt to remain competitive. Big Data includes structured, semi-structured, and unstructured data. Structured data are data formatted for use in a database management system, while semi-structured and unstructured data include all types of unformatted data, including multimedia and social media content. Among practitioners and applied researchers, the reaction to data available through blogs, Twitter, Facebook, and other social media can be described as a “data rush” promising new insights about consumers' choices, behavior, and many other issues. In the past, Big Data was used only by very large organizations, governments, and large enterprises that had the ability to create their own infrastructure for hosting and mining large amounts of data. This chapter shows the requirements for protecting Big Data environments with the same rigorous security strategies applied to traditional database systems.


2020 ◽  
Vol 10 (23) ◽  
pp. 8524
Author(s):  
Cornelia A. Győrödi ◽  
Diana V. Dumşe-Burescu ◽  
Doina R. Zmaranda ◽  
Robert Ş. Győrödi ◽  
Gianina A. Gabor ◽  
...  

In the current context, in which several types of database systems (relational and non-relational) are emerging, choosing the type of database system for storing large amounts of data in today’s big data applications has become an important challenge. In this paper, we aimed to provide a comparative evaluation of two popular open-source database management systems (DBMSs): MySQL as a relational DBMS (which has more recently also gained non-relational capabilities) and CouchDB as a non-relational DBMS. This comparison was based on a performance evaluation of CRUD (CREATE, READ, UPDATE, DELETE) operations for different amounts of data, to show how these two databases can be modeled and used in an application and to highlight the differences in response time and complexity. The main objective of the paper was a comparative analysis of the impact that each specific DBMS has on application performance when carrying out CRUD requests. To perform the analysis and to ensure the consistency of the tests, two similar applications were developed in Java, one using MySQL and the other using CouchDB; these applications were then used to measure the response times of each database technology for the same CRUD operations. Finally, a comprehensive discussion centered on the results of the analysis was carried out and several conclusions were drawn. The advantages and drawbacks of each DBMS are outlined to support the decision of choosing a specific type of DBMS for a big data application.
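
As an illustration of how such a CRUD benchmark can be instrumented, the minimal Java sketch below times batched INSERTs and a filtered SELECT against MySQL over JDBC. The connection URL, credentials, and the `customers` table are hypothetical placeholders, and MySQL Connector/J is assumed to be on the classpath; this is a sketch of the measurement approach, not the authors' actual test application. The CouchDB leg of such a comparison would follow the same timing pattern but issue requests against CouchDB's HTTP API instead.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CrudTimingDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; adjust to the test environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/testdb", "user", "password")) {

            // CREATE: time a batch of inserts.
            long t0 = System.nanoTime();
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO customers (name, city) VALUES (?, ?)")) {
                for (int i = 0; i < 10_000; i++) {
                    ps.setString(1, "name-" + i);
                    ps.setString(2, "city-" + (i % 100));
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            System.out.printf("INSERT: %.1f ms%n", (System.nanoTime() - t0) / 1e6);

            // READ: time a filtered query and drain its results.
            t0 = System.nanoTime();
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT name FROM customers WHERE city = ?")) {
                ps.setString(1, "city-42");
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) rs.getString(1);
                }
            }
            System.out.printf("SELECT: %.1f ms%n", (System.nanoTime() - t0) / 1e6);
        }
    }
}
```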


2021 ◽  
Vol 19 ◽  
pp. 151-158
Author(s):  
Piotr Rymarski ◽  
Grzegorz Kozieł

Most of today's web applications run on relational database systems. Communication with them is possible through statements written in Structured Query Language (SQL). This paper presents the most popular relational database management systems and describes common ways to optimize SQL queries. Using a research environment based on a fragment of the imdb.com database and running on the OracleDb, MySQL, Microsoft SQL Server, and PostgreSQL engines, a number of test scenarios were performed. The aim was to examine the performance changes of SQL queries resulting from syntax modification while keeping the result set the same, as well as the impact of database organization, indexing, and the advanced efficiency-oriented mechanisms delivered with the systems used. The tests were carried out using a proprietary application written in Java using the Hibernate framework.
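
To make the idea of result-preserving syntax rewrites concrete, the JDBC sketch below times two equivalent formulations of the same question (an IN subquery versus a JOIN) and then re-times the JOIN after adding an index. The `titles`/`ratings` schema and the PostgreSQL connection details are hypothetical stand-ins for the imdb.com fragment used in the paper; this is not the proprietary Hibernate-based application described above.

```java
import java.sql.*;

public class QueryVariantTiming {
    // Runs a query several times and reports the best run, a common way
    // to reduce noise when comparing syntactic variants.
    static double bestRunMillis(Connection conn, String sql) throws SQLException {
        double best = Double.MAX_VALUE;
        for (int run = 0; run < 5; run++) {
            long t0 = System.nanoTime();
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) { /* drain results */ }
            }
            best = Math.min(best, (System.nanoTime() - t0) / 1e6);
        }
        return best;
    }

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/imdb", "user", "password")) {
            // Two syntactic variants of the same logical question:
            // titles that have at least one rating.
            String subquery = "SELECT t.title FROM titles t " +
                    "WHERE t.id IN (SELECT r.title_id FROM ratings r)";
            String join = "SELECT DISTINCT t.title FROM titles t " +
                    "JOIN ratings r ON r.title_id = t.id";
            System.out.printf("IN subquery:    %.1f ms%n", bestRunMillis(conn, subquery));
            System.out.printf("JOIN:           %.1f ms%n", bestRunMillis(conn, join));

            // Re-run after adding an index to observe its impact.
            try (Statement st = conn.createStatement()) {
                st.execute("CREATE INDEX IF NOT EXISTS idx_ratings_title ON ratings(title_id)");
            }
            System.out.printf("JOIN (indexed): %.1f ms%n", bestRunMillis(conn, join));
        }
    }
}
```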


2020 ◽  
Vol 13 (7) ◽  
pp. 1064-1077 ◽  
Author(s):  
Supreeth Shastri ◽  
Vinay Banakar ◽  
Melissa Wasserman ◽  
Arun Kumar ◽  
Vijay Chidambaram

1996 ◽  
Vol 7 (2) ◽  
pp. 24-33
Author(s):  
Victor C.S. Lee ◽  
Kam-Yiu Lam ◽  
Kwok-Wa Lam ◽  
Joseph K.Y. Ng

Author(s):  
David Gamero ◽  
Andrew Dugenske ◽  
Thomas Kurfess ◽  
Christopher Saldana ◽  
Katherine Fu

In this paper, the design and performance differences between Relational Database Management Systems (RDBMS) and NoSQL database systems are examined, with attention to their applicability to real-world Internet of Things for manufacturing (IoTfM) data. While previous work has extensively compared SQL and NoSQL for both generalized and IoT uses, this work specifically examines the tradeoffs and performance differences for manufacturing applications by using a high-fidelity data set collected from a large US manufacturing firm. Growing an IoT system beyond the pilot stage requires scalable data storage; this work seeks to determine the impact of the selected database systems on data write performance at scale. Payload size and message frequency were used as the primary characteristics to maintain model fidelity in the simulated clients. As the number of simulated asset clients grows, the data write latency was measured to determine how the performance of both database systems is affected. To isolate the RDBMS and NoSQL differences, a cloud environment was created using Amazon Web Services (AWS) with two identical data ingestion pipelines: (1) writing data to an RDBMS using AWS Aurora MySQL, and (2) writing data to NoSQL using AWS DynamoDB. The findings may provide guidance for further experimentation in large-scale manufacturing IoT implementations.
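
The write-latency measurement described above can be sketched with a small, self-contained Java harness: a fixed pool of simulated clients writes fixed-size payloads at a fixed message frequency through a pluggable sink, and the mean write latency is reported. The client count, payload size, and frequency below are illustrative values, and the `DataSink` stand-in replaces the real pipelines (Aurora MySQL via JDBC, DynamoDB via the AWS SDK), which are omitted; this is an assumed reconstruction of the approach, not the authors' code.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class WriteLatencyHarness {
    // Pluggable sink: one implementation would write to Aurora MySQL via
    // JDBC, another to DynamoDB via the AWS SDK; both are omitted here.
    interface DataSink { void write(byte[] payload) throws Exception; }

    public static void main(String[] args) throws Exception {
        int clients = 50;             // number of simulated asset clients
        int messagesPerClient = 200;  // messages each client sends
        int payloadBytes = 512;       // payload size modeled on real telemetry
        long periodMillis = 100;      // message frequency: one write per 100 ms

        DataSink sink = payload -> Thread.sleep(2); // stand-in for a real write
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        AtomicLong totalLatencyNanos = new AtomicLong();
        AtomicLong writes = new AtomicLong();

        for (int c = 0; c < clients; c++) {
            pool.submit(() -> {
                byte[] payload = new byte[payloadBytes];
                for (int m = 0; m < messagesPerClient; m++) {
                    try {
                        long t0 = System.nanoTime();
                        sink.write(payload);
                        totalLatencyNanos.addAndGet(System.nanoTime() - t0);
                        writes.incrementAndGet();
                        Thread.sleep(periodMillis); // hold message frequency constant
                    } catch (Exception e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
        System.out.printf("mean write latency: %.2f ms over %d writes%n",
                totalLatencyNanos.get() / 1e6 / writes.get(), writes.get());
    }
}
```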


2021 ◽  
Author(s):  
Utku Sirin ◽  
Pınar Tözün ◽  
Danica Porobic ◽  
Ahmad Yasin ◽  
Anastasia Ailamaki

Micro-architectural behavior of traditional disk-based online transaction processing (OLTP) systems has been investigated extensively over the past couple of decades. Results show that traditional OLTP systems mostly under-utilize the available micro-architectural resources. In-memory OLTP systems, on the other hand, process all the data in main memory and, therefore, can omit the buffer pool. Furthermore, they usually adopt more lightweight concurrency control mechanisms, cache-conscious data structures, and cleaner codebases, since they are usually designed from scratch. Hence, we expect significant differences in micro-architectural behavior when running OLTP on platforms optimized for in-memory processing as opposed to disk-based database systems. In particular, we expect that in-memory systems exploit micro-architectural features such as instruction and data caches significantly better than disk-based systems. This paper sheds light on the micro-architectural behavior of in-memory database systems by analyzing it and contrasting it with the behavior of disk-based systems when running OLTP workloads. The results show that, despite all the design changes, in-memory OLTP exhibits very similar micro-architectural behavior to disk-based OLTP: more than half of the execution time goes to memory stalls, where instruction cache misses or the long-latency data misses from the last-level cache (LLC) are the dominant factors in the overall execution time. Even though ground-up designed in-memory systems can eliminate the instruction cache misses, the reduction in instruction stalls amplifies the impact of LLC data misses. As a result, only 30% of the CPU cycles are used to retire instructions, and 70% of the CPU cycles are wasted on stalls for both traditional disk-based and new-generation in-memory OLTP.
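
To make the reported cycle breakdown concrete, the short Java example below turns raw cycle-counter values into the retiring/stall percentages the paper discusses. The counter values are made-up illustrative assumptions chosen to mirror the roughly 30% retiring / 70% stalled split, not measurements from the study.

```java
public class CycleBreakdown {
    public static void main(String[] args) {
        // Illustrative counter values only (not from the paper's experiments).
        long totalCycles   = 1_000_000_000L;
        long icacheStalls  =   280_000_000L; // cycles stalled on instruction fetch
        long llcDataStalls =   330_000_000L; // cycles stalled on LLC data misses
        long otherStalls   =    90_000_000L;

        long stalled = icacheStalls + llcDataStalls + otherStalls;
        long retiring = totalCycles - stalled;
        System.out.printf("retiring: %4.1f%%%n", 100.0 * retiring / totalCycles);
        System.out.printf("stalled:  %4.1f%% (I-cache %.1f%%, LLC data %.1f%%, other %.1f%%)%n",
                100.0 * stalled / totalCycles,
                100.0 * icacheStalls / totalCycles,
                100.0 * llcDataStalls / totalCycles,
                100.0 * otherStalls / totalCycles);
    }
}
```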


2019 ◽  
Vol 9 (19) ◽  
pp. 4103 ◽  
Author(s):  
Hema Sekhar Reddy Rajula ◽  
Veronika Odintsova ◽  
Mirko Manchia ◽  
Vassilios Fanos

Cohorts are instrumental for epidemiologically oriented observational studies. Cohort studies usually observe large groups of individuals for a specific period of time to identify the factors contributing to a specific outcome (for instance, an illness) and to establish associations between risk factors and the outcome under study. In collaborative projects, federated data facilities are meta-database systems, distributed across multiple locations, that make it possible to analyze, combine, or harmonize data from different sources, making them suitable for mega- and meta-analyses. The harmonization of data can increase the statistical power of studies through maximization of sample size, allowing for additional refined statistical analyses and ultimately making it possible to answer research questions that could not be addressed using a single study. Indeed, harmonized data can be analyzed through mega-analysis of raw data or fixed-effect meta-analysis. Other types of data might be analyzed by, e.g., random-effects meta-analysis or Bayesian evidence synthesis. In this article, we describe some methodological aspects related to the construction of a federated facility to optimize analyses of multiple datasets, the impact of missing data, and some methods for handling missing data in cohort studies.
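
For reference, the fixed-effect (inverse-variance) meta-analysis mentioned above combines the per-study estimates \(\hat\theta_i\) using weights equal to the reciprocals of their estimated variances \(\hat\sigma_i^2\), for \(k\) studies. This is the standard textbook formulation rather than anything specific to the federated facility described in the article:

```latex
\hat\theta_{\mathrm{FE}} = \frac{\sum_{i=1}^{k} w_i\, \hat\theta_i}{\sum_{i=1}^{k} w_i},
\qquad w_i = \frac{1}{\hat\sigma_i^{2}},
\qquad \operatorname{Var}\!\left(\hat\theta_{\mathrm{FE}}\right) = \frac{1}{\sum_{i=1}^{k} w_i}
```

A mega-analysis would instead fit a single model across all pooled, harmonized raw records, which is why harmonization can increase statistical power relative to analyzing any single study.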

