Formalizing the Mapping of UML Conceptual Schemas to Column-Oriented Databases

2018 ◽  
Vol 14 (3) ◽  
pp. 44-68 ◽  
Author(s):  
Fatma Abdelhedi ◽  
Amal Ait Brahim ◽  
Gilles Zurfluh

Nowadays, most organizations need to improve their decision-making process using Big Data. To achieve this, they have to store Big Data, analyze it, and transform the results into useful and valuable information. Doing so raises new challenges in designing and creating data warehouses. Traditionally, creating a data warehouse followed a well-governed process based on relational databases. Big Data has challenged this traditional approach, primarily because of the changing nature of the data involved. As a result, using NoSQL databases has become a necessity for handling Big Data. In this article, the authors show how to create a data warehouse on NoSQL systems. They propose the Object2NoSQL process, which generates column-oriented physical models starting from a UML conceptual model. To ensure an efficient automatic transformation, they propose a logical model that exhibits a sufficient degree of independence to enable its mapping onto one or more column-oriented platforms. The authors validate their approach through experiments on a case study in the health care field.
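
As an illustration of the kind of transformation described above, the following is a minimal sketch of how a single UML class might be mapped to a column-oriented table. The Patient class, keyspace, and column names are hypothetical, and Cassandra (via the cassandra-driver package) is our assumed target, since the paper's logical model is deliberately platform-independent.

```python
# Hypothetical sketch: one UML class -> one column family (table).
# The class identifier becomes the row key; each attribute becomes a column.
from cassandra.cluster import Cluster  # pip install cassandra-driver

session = Cluster(["127.0.0.1"]).connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS hospital
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS hospital.patient (
        patient_id uuid PRIMARY KEY,  -- UML identifier -> row key
        name       text,              -- UML attribute  -> column
        birth_date date,
        diagnosis  text
    )
""")
```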

Information ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 241
Author(s):  
Geomar A. Schreiner ◽  
Denio Duarte ◽  
Ronaldo dos S. Melo

Several data-centric applications today produce and manipulate large volumes of data, the so-called Big Data. Traditional databases, in particular relational databases, are not suitable for Big Data management. As a consequence, several approaches that allow the definition and manipulation of large relational data sets stored in NoSQL databases through an SQL interface have been proposed, focusing on scalability and availability. This paper presents a comparative analysis of these approaches based on an architectural classification that organizes them according to their system architectures. Our motivation is that wrapping is a relevant strategy for relational-based applications that intend to move relational data to NoSQL databases (usually maintained in the cloud). We also claim that this research area has several open issues, given that most approaches deal with only a subset of SQL operations or support only specific target NoSQL databases. Our intention with this survey is, therefore, to contribute to the state of the art in this research area and to provide a basis for choosing or even designing a relational-to-NoSQL data wrapping solution.
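
A toy sketch of the wrapping strategy the survey classifies: a thin layer that accepts a (deliberately restricted) SQL query and rewrites it as a MongoDB call. The query shape, collection name, and connection string are illustrative only; the systems surveyed cover far more of SQL.

```python
# Minimal SQL-over-NoSQL wrapper: translate one SQL shape into a MongoDB find().
import re
from pymongo import MongoClient  # pip install pymongo

def sql_to_mongo(db, sql):
    # Handles only: SELECT * FROM <table> WHERE <col> = '<value>'
    m = re.match(r"SELECT \* FROM (\w+) WHERE (\w+) = '([^']*)'", sql)
    if not m:
        raise ValueError("unsupported SQL shape")
    table, col, value = m.groups()
    return list(db[table].find({col: value}))

db = MongoClient("mongodb://localhost:27017").appdb
rows = sql_to_mongo(db, "SELECT * FROM users WHERE city = 'Berlin'")
```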


2020 ◽  
Author(s):  
Sahib Singh

NoSQL databases are a form of non-relational database whose primary purpose is to store and retrieve data. Owing to recent advancements in cloud computing platforms and the emergence of Big Data, NoSQL databases are becoming more popular than ever. In this paper, we analyze the fundamental security features and vulnerabilities of MongoDB and examine how it compares to relational databases on these fronts.
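
Since the paper's analysis centers on MongoDB's access-control features, a small sketch of the basic hardening steps may help; the user name, password, and database below are placeholders.

```python
# MongoDB access control, sketched (credentials are placeholders).
from pymongo import MongoClient

# Step 1: enable authorization in mongod.conf before starting the server:
#     security:
#       authorization: enabled
# (MongoDB historically shipped with authorization disabled by default,
# one of the weaknesses analyses like this one highlight.)

# Step 2: create a least-privilege user instead of working as admin.
client = MongoClient("mongodb://localhost:27017")
client.appdb.command(
    "createUser", "report_reader",
    pwd="change-me",
    roles=[{"role": "read", "db": "appdb"}],
)

# Step 3: subsequent connections must authenticate.
authed = MongoClient("mongodb://report_reader:change-me@localhost:27017/appdb")
```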


Author(s):  
Deepika Prakash

Business intelligence, big data, and machine learning are three technologies that developed independently and address different types of problems. Data warehouses have been used as systems for business intelligence, and NoSQL databases are used for big data. In this chapter, the authors explore the convergence of business intelligence and big data. Traditionally, a data warehouse is implemented on a ROLAP or MOLAP platform. Whereas MOLAP suffers from having a proprietary architecture, ROLAP suffers from the inherent disadvantages of RDBMSs. To mitigate the drawbacks of ROLAP, the authors propose implementing a data warehouse on a NoSQL database, choosing Cassandra as their database. They start by identifying a generic information model that captures the requirements of the system to-be. They then propose mapping rules that map the components of the information model to the Cassandra data model. Finally, they show a small implementation using an example.
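
The chapter's exact mapping rules are its contribution; as a generic illustration only, here is how a simple star-schema fact might land in Cassandra, with dimensions denormalized into the fact table, the most-queried dimension as the partition key, and the time dimension as a clustering column. Table and column names are hypothetical, and the keyspace is assumed to exist.

```python
# Generic star-schema-to-Cassandra sketch (not the chapter's mapping rules).
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("warehouse")  # assumes keyspace exists

session.execute("""
    CREATE TABLE IF NOT EXISTS sales_by_store (
        store_id   text,     -- dimension -> partition key
        sale_date  date,     -- time dimension -> clustering column
        product_id text,     -- dimension, denormalized into the fact table
        quantity   int,      -- measure
        amount     decimal,  -- measure
        PRIMARY KEY ((store_id), sale_date, product_id)
    ) WITH CLUSTERING ORDER BY (sale_date DESC)
""")
```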


2003 ◽  
pp. 88-120
Author(s):  
Tanguy Chateau ◽  
Cecile Leroy ◽  
Johanna W. Rahayu ◽  
David Taniar

The combined use of object-relational databases and Web technologies has only recently begun to emerge. This chapter discusses the practical realization of an application using this technology. The aim is to show readers how to construct a full application, from a design using object-oriented features through to the implementation. We highlight the important and difficult stages, with an emphasis on mapping an object-oriented design into Oracle 8i and on using stored procedures with Oracle 8i's extended features for object manipulation. This enables developers to construct professional Web applications that achieve high modularity and capacity for evolution, with an accelerated development phase compared with the traditional approach.
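
To make the chapter's starting point concrete, here is a minimal sketch of the Oracle object-relational features involved; the type and table are hypothetical, and the connection details are placeholders (the cx_Oracle driver targets modern Oracle versions, while the chapter worked with Oracle 8i, where this style of DDL originated).

```python
# Hypothetical sketch: a UML class realized as an Oracle object type,
# with instances stored in an object table.
import cx_Oracle  # pip install cx_Oracle

conn = cx_Oracle.connect("scott", "tiger", "localhost/XEPDB1")  # placeholders
cur = conn.cursor()

# A UML class becomes an object type...
cur.execute("""
    CREATE TYPE address_t AS OBJECT (
        street VARCHAR2(60),
        city   VARCHAR2(40)
    )
""")
# ...and its instances live in an object table, which PL/SQL stored
# procedures can then manipulate, as the chapter describes.
cur.execute("CREATE TABLE addresses OF address_t")
conn.commit()
```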


The chapter presents how relational databases respond to typical NoSQL features and, vice versa, how NoSQL databases respond to typical relational features. Open issues related to the integration of relational and NoSQL databases, as well as features of the next database generation, are discussed. The big relational database vendors have continuously worked to incorporate NoSQL features into their products, just as NoSQL vendors are trying to make their products more like relational databases. The convergence of these two groups of databases has been a driving force in the evolution of the database market, in bringing a new level of focus to resolving big data requirements, and in enabling users to exploit the full potential of their data, wherever it is stored, in relational or NoSQL databases. In turn, the database of choice in the future will likely be one that provides the best of both worlds: a flexible data model, high availability, and enterprise reliability.
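
One concrete face of this convergence, offered here only as an illustration (the chapter discusses the vendor trend in general, not this product): a relational engine exposing a NoSQL-style document column, sketched with PostgreSQL's JSONB type.

```python
# Relational table with a schema-flexible document column (PostgreSQL JSONB).
import psycopg2
from psycopg2.extras import Json  # pip install psycopg2-binary

conn = psycopg2.connect("dbname=demo user=demo")  # placeholder DSN
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS events (
        id      serial PRIMARY KEY,
        payload jsonb  -- NoSQL-style document inside a relational row
    )
""")
cur.execute("INSERT INTO events (payload) VALUES (%s)",
            [Json({"type": "click", "user": "u42"})])
# Query inside the document with JSON operators:
cur.execute("SELECT payload->>'user' FROM events WHERE payload->>'type' = %s",
            ("click",))
conn.commit()
```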


2018 ◽  
Vol 29 (2) ◽  
pp. 592-628 ◽  
Author(s):  
Ginevra Gravili ◽  
Marco Benvenuto ◽  
Alexandru Avram ◽  
Carmine Viola

Purpose: The purpose of this paper is to examine the influence of the Digital Divide (DD) and digital alphabetization (DA) on the Big Data (BD) generation process, to gain insight into how BD could become a useful tool in the decision-making process of supply chain management (SCM). Similarly, the paper aims to recognize and understand, from a value-creation perspective, the correlation between DD and BD generation and between DD and SCM.

Design/methodology/approach: The approach consists of two steps. First, a systematic literature review was conducted to determine the existing relationship between Big Data Analytics (BDA), SCM, and the DD. A total of 595 articles were considered, and the analysis showed a clear relationship among BDA, SCM, and DD. Next, the vector autoregressive (VAR) approach was applied in a case study to prove the correlation between DD (as reflected in internet usage) and internet acquisitions and, more generally, the relationship between DD and trade. Internet usage and internet acquisitions in imports and exports at the European level were the variables in an empirical study of European trade. The novelty of this two-tiered approach lies in using a systematic literature review, the first of its kind, to generate inputs for the longitudinal case study of imports and exports at the EU level. In turn, the case study tested the accuracy of the theorized relationship among the main variables.

Findings: Analyzing the connection between DD and internet acquisitions revealed a positive and long-lasting impulse response function, followed by an ascending trend. This suggests that a self-multiplying effect is being generated, and it is reasonable to assume that the more individuals use the internet, the more electronic acquisitions occur. We can thus reasonably conclude that improving the BD and SCM process depends strongly on the quality of the human factor. Addressing DA is the new interpretive key in the decision-making process: quantifying the added value of the human factor in SCM is challenging and ongoing, based on the opportunity cost between automating decision-making and relying on the complexity of human factors.

Research limitations/implications: One of the biggest limitations of our research is the lack of available time series on consumer orientations and preferences. Data on the typology of customer preferences, and on how they are shaped, modified, or altered, were not accessible, though large companies may have access to such data. The present longitudinal study of European trade helps clarify how and to what extent BDA, SCM, and DD are interrelated. The modeling of the theoretical framework likewise highlights several identifiable benefits for companies of adopting BDA in their business processes.

Practical implications: Understanding the obstacles to DD in trade companies and states, and identifying their influence on firm performance, serves to orient the decision-making process in SCM toward reducing DD to generate important economic benefits. Enhancing internet usage may accelerate longer-term investments in human resources, offering developing countries unprecedented opportunities to enhance their educational systems and improve their economic policies, widening the range of opportunities for businesses and poor states.

Social implications: BD generation will undeniably influence microeconomic decisions: it will become an evaluation tool for more efficient economic progress in small and/or large economies. However, an economically efficient society will be achievable only in those countries in which qualified human resources can generate and manage BD to unlock its potential. This twofold effect will surely affect the socio-economic and geopolitical situation. The economic progress of conventional countries may falter if it is not adequately flanked by qualified human resources able to advance the information and communication technology (ICT) prevalent in contemporary economies. Consequently, the social impact of investments in ICT capacity building will necessarily affect future socio-economic scenarios. New indicators will become necessary to measure progress, and one of them will surely be DD.

Originality/value: The novelty of the present study is twofold. First, it is the first meticulous meta-analysis developed using a very wide analysis of the published literature to highlight a previously hidden relationship among DD, BD, and SCM. This comparative approach made it possible to build a theoretical framework for the real evaluation of the impact of BDA on different organizational elements, including SCM. Second, the research emphasizes the need to reform and reshape studies on BDA, convincing companies that the obstacles (DD and DA, i.e., internet usage) must be addressed through conscious decision-making processes, strategically and resolutely, to transform points of weakness into opportunities.
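
For readers unfamiliar with the VAR step, the following is a sketch of how such an impulse response analysis is typically run; the two series below are random placeholder noise, not the European trade statistics the authors used.

```python
# VAR + impulse response sketch with placeholder data (statsmodels).
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
data = pd.DataFrame({
    "internet_usage":        rng.normal(size=80).cumsum(),
    "internet_acquisitions": rng.normal(size=80).cumsum(),
})
data = data.diff().dropna()  # difference toward stationarity

results = VAR(data).fit(maxlags=4, ic="aic")  # lag order chosen by AIC
irf = results.irf(10)                         # 10-period impulse responses
irf.plot(orth=True)                           # response of each series to a shock
```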


2018 ◽  
Vol 7 (2.6) ◽  
pp. 83
Author(s):  
Gourav Bathla ◽  
Rinkle Rani ◽  
Himanshu Aggarwal

Big data is a large-scale collection of structured, semi-structured, and unstructured data. It is generated by social networks, business organizations, and the interactions and views of socially connected users, and it is used for important decision making in business and research organizations. Storage that can process data at this scale and extract important information with short response times is the need of the current competitive era. Relational databases, which have long dominated storage technology, are not well suited to such mixed types of data: the data cannot be represented simply as rows and columns in tables. NoSQL (Not only SQL) is complementary to SQL technology and provides storage formats that readily accommodate data of high velocity, large volume, and great variety. NoSQL databases fall into four categories: column-oriented, key-value, graph-based, and document-oriented databases. Approximately 120 real solutions exist across these categories; the most commonly used ones are elaborated in the Introduction. Several research works have analyzed these NoSQL technology solutions, but they have not addressed the situations in which a particular data storage technique should be chosen. In this study and analysis, we aim to guide the reader in selecting a technology based on specific requirements. Previous research has compared NoSQL data storage techniques using real examples such as MongoDB and Neo4J. Our observation is that if users have adequate knowledge of the NoSQL categories and how they compare, it is easy for them to choose the most suitable category and then select a concrete solution from within it.
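
As a toy distillation of the selection guidance the authors aim at, the mapping below pairs a dominant workload characteristic with one of the four categories; the rules are a simplified illustration, not the paper's analysis.

```python
# Toy requirement-to-category lookup (illustrative, not the paper's rules).
CATEGORY_BY_NEED = {
    "aggregate scans over few columns":  "column-oriented (e.g., Cassandra, HBase)",
    "simple lookups by key at scale":    "key-value (e.g., Redis, Riak)",
    "highly connected data, traversals": "graph-based (e.g., Neo4J)",
    "semi-structured, nested records":   "document-oriented (e.g., MongoDB, CouchDB)",
}

def suggest_category(need: str) -> str:
    return CATEGORY_BY_NEED.get(need, "profile the workload further")

print(suggest_category("highly connected data, traversals"))
```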


2015 ◽  
Vol 2015 ◽  
pp. 1-15 ◽  
Author(s):  
Alexandre G. de Brevern ◽  
Jean-Philippe Meyniel ◽  
Cécile Fairhead ◽  
Cécile Neuvéglise ◽  
Alain Malpertuy

Sequencing of the human genome began in 1994, and ten years of work were necessary to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they face data management issues as well as analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way data is managed and analysed. We show how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data information technology innovations from the web and business intelligence. We underline the value of NoSQL databases, which are much more efficient than relational databases for this purpose. Since Big Data leads to a loss of interactivity with data during analysis, owing to high processing times, we describe business intelligence solutions that allow one to regain interactivity whatever the volume of data. We illustrate this point with a focus on the Amadea platform. Finally, we discuss the visualization challenges posed by Big Data and present the latest innovations in JavaScript graphic libraries.
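
To make the NoSQL point above concrete, here is a small sketch of why document stores suit NGS-style records: annotations vary per record, so no fixed schema is required. The collection, field names, and values are hypothetical.

```python
# Variant records with heterogeneous annotations in a document store (sketch).
from pymongo import MongoClient

variants = MongoClient("mongodb://localhost:27017").genomics.variants

variants.insert_one({
    "chrom": "chr7", "pos": 140753336, "ref": "A", "alt": "T",
    "gene": "BRAF",
    # Annotation fields can differ from record to record; no ALTER TABLE needed.
    "annotations": {"clinvar": "pathogenic", "cadd": 32.0},
})
variants.create_index([("chrom", 1), ("pos", 1)])  # fast region queries
hits = variants.find({"chrom": "chr7", "pos": {"$gte": 140_000_000}})
```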


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0255562
Author(s):  
Eman Khashan ◽  
Ali Eldesouky ◽  
Sally Elghamrawy

The growing popularity of big data analysis and cloud computing has created new big data management standards. Programmers may have to interact with a number of heterogeneous data stores, both SQL and NoSQL, depending on the information they are responsible for. Interacting with heterogeneous data models via numerous APIs and query languages imposes challenging tasks on multi-data-processing developers. Indeed, complex queries concerning homogeneous data structures cannot currently be performed in a declarative manner when found in single-data-storage applications and therefore require additional development effort. Many models have been presented to address complex queries via multistore applications. Some of these implement a unified and fast model, while others are not efficient enough to solve this type of complex database query. This paper provides CQNS, an automated, fast, and easy unified architecture for solving simple and complex SQL and NoSQL queries over heterogeneous data stores. The proposed framework can be used in cloud environments or in any big data application to automatically help developers manage basic and complicated database queries. CQNS consists of three layers: a matching selector layer, a processing layer, and a query execution layer. The matching selector layer is the heart of the architecture: incoming user queries are examined to determine whether they match the queries stored for a single engine in the architecture library. This is achieved through a proposed algorithm that directs each query to the right SQL or NoSQL database engine. Furthermore, CQNS supports many NoSQL databases, such as MongoDB, Cassandra, Riak, CouchDB, and Neo4J. The paper presents a Spark-based framework that can handle both SQL and NoSQL databases. Four benchmark scenario datasets are used to evaluate the proposed CQNS for querying different NoSQL databases in terms of optimization process performance and query execution time. The results show that CQNS achieves the best latency and throughput among the compared systems.
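
As a schematic reconstruction (ours, not the authors' algorithm) of the routing idea behind the matching selector layer, the sketch below inspects an incoming query and dispatches it to an appropriate engine; the engines are stand-in placeholders for real database drivers.

```python
# Simplified query routing in the spirit of CQNS's matching selector layer.
class EchoEngine:
    """Placeholder standing in for a real database driver."""
    def __init__(self, name):
        self.name = name
    def execute(self, query):
        return f"{self.name} <- {query}"

ENGINES = {"sql": EchoEngine("postgres"),
           "graph": EchoEngine("neo4j"),
           "doc": EchoEngine("mongodb")}

def route_query(query: str):
    q = query.lstrip().upper()
    if q.startswith(("SELECT", "INSERT", "UPDATE", "DELETE")):
        return ENGINES["sql"].execute(query)    # SQL dialect -> relational engine
    if q.startswith("MATCH"):
        return ENGINES["graph"].execute(query)  # Cypher -> graph engine
    return ENGINES["doc"].execute(query)        # fallback -> document engine

print(route_query("SELECT * FROM orders"))
```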

