Domain/Mapping Model: A Novel Data Warehouse Data Model

To fulfill its integrative and historical purpose, a data warehouse (DW) needs a data model that represents the different states of a system appropriately and consistently. In effect, a DW data model, which represents the physical structure of the DW, must be general enough to consume data from heterogeneous data sources and reconcile the semantic differences between their models, while remaining resilient to constant changes in the structure of those sources. One of the main problems in DW development is the absence of a standardized DW data model. This paper gives a comparative analysis of the four most prominent DW data models: the relational/normalized model, the data vault model, the anchor model, and the dimensional model. On the basis of the results of [1], a new DW data model, the Domain/Mapping model (DMM), is presented that would more adequately fulfill the posed requirements.


Temporal Semistructured Data Models and Data Warehouses

Data Warehouses and OLAP ◽

10.4018/978-1-59904-364-7.ch012 ◽

2011 ◽

pp. 277-297 ◽

Cited By ~ 2

Author(s):

Carlo Combi ◽

Barbara Oliboni

Keyword(s):

Data Warehouse ◽

Data Model ◽

Heterogeneous Data ◽

Data Models ◽

Semistructured Data ◽

Data Sources ◽

Time Varying ◽

Data Warehouses ◽

Time Dimension ◽

Heterogeneous Data Sources

This chapter describes a graph-based approach to representing the information stored in a data warehouse by means of a temporal semistructured data model. We consider issues related to the representation of semistructured data warehouses and discuss the set of constraints needed to correctly manage the warehouse time, i.e. the time dimension considered when storing data in the data warehouse itself. We use a temporal semistructured data model because a data warehouse can contain data coming from different, heterogeneous data sources. Data stored in a data warehouse are therefore semistructured in nature: the same information can be represented in different ways in different documents, and the document schemata may or may not be available. Moreover, information stored in a data warehouse is often time varying, so, as for semistructured data, it can be useful to consider time in the data warehouse context as well.


A Survey of Data Warehouse Model Evolution

Handbook of Research on Innovations in Database Technologies and Applications ◽

10.4018/978-1-60566-242-8.ch015 ◽

2009 ◽

pp. 129-136

Author(s):

Cécile Favre ◽

Fadila Bentayeb ◽

Omar Boussaid

Keyword(s):

Data Warehouse ◽

Data Warehousing ◽

Business Environment ◽

Heterogeneous Data ◽

Data Sources ◽

Future Trends ◽

Model Evolution ◽

Heterogeneous Data Sources ◽

Key Points ◽

Existing Data

A data warehouse allows the integration of heterogeneous data sources for analysis purposes. One of the key points for the success of the data warehousing process is the design of the model according to the available data sources and the analysis needs (Nabli, Soussi, Feki, Ben-Abdallah & Gargouri, 2005). However, as the business environment evolves, several changes in the content and structure of the underlying data sources may occur. In addition to these changes, analysis needs may also evolve, requiring an adaptation of the existing data warehouse model. In this chapter, we provide an overall view of the state of the art in data warehouse model evolution. We present a set of comparison criteria and compare the various works. Moreover, we discuss the future trends in data warehouse model evolution.


Integrate inconsistent and heterogeneous data based on user feedback

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-04-2014-0013 ◽

2015 ◽

Vol 8 (2) ◽

pp. 187-203 ◽

Cited By ~ 2

Author(s):

Lihua Lu ◽

Hengzhen Zhang ◽

Xiao-Zhi Gao

Keyword(s):

Data Integration ◽

Data Model ◽

Quality Criteria ◽

Heterogeneous Data ◽

Data Sources ◽

Content Type ◽

Inconsistent Data ◽

Data Inconsistency ◽

Multi Attribute Decision Making ◽

Data Source

Purpose – Data integration combines data residing at different sources and provides users with a unified interface to these data. An important issue in data integration is the existence of conflicts among the different data sources. Data sources may conflict with each other at the data level, which is defined as data inconsistency. The purpose of this paper is to address this problem and propose a solution for data inconsistency in data integration. Design/methodology/approach – A relational data model extended with data source quality criteria is first defined. Then, based on the proposed data model, a strategy for resolving data inconsistency is provided. To carry out the strategy, a fuzzy multi-attribute decision-making (MADM) approach based on data source quality criteria is applied to obtain the results. Finally, user feedback strategies are proposed to optimize the result of the fuzzy MADM approach, yielding the final resolution of the inconsistent data. Findings – To evaluate the proposed method, data obtained from sensors are extracted. Experiments are designed and performed to demonstrate the effectiveness of the proposed strategy. The results substantiate that the solution performs better than the other methods on correctness, time cost and stability indicators. Practical implications – Since inconsistent data collected from sensors are pervasive, the proposed method can address this problem and correct wrong choices to some extent. Originality/value – In this paper, the authors study for the first time the effect of user feedback on integration results for inconsistent data.
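The idea of choosing among conflicting values by scoring their sources on quality criteria can be sketched as follows. This is only a minimal illustration, not the method of the paper: it uses a crisp weighted sum rather than fuzzy MADM, leaves out the user feedback step, and the criteria names, weights, source names and readings are all hypothetical.

```python
from typing import Dict

# Hypothetical quality criteria and weights (not taken from the paper).
WEIGHTS: Dict[str, float] = {"accuracy": 0.5, "freshness": 0.3, "completeness": 0.2}


def source_score(quality: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Crisp weighted-sum score of one data source; criteria lie in [0, 1]."""
    return sum(weights[name] * quality[name] for name in weights)


def resolve_conflict(candidates: Dict[str, object],
                     qualities: Dict[str, Dict[str, float]]) -> object:
    """Return the value reported by the highest-scoring source.

    candidates maps source name -> reported value;
    qualities maps source name -> criterion scores.
    """
    best_source = max(candidates, key=lambda s: source_score(qualities[s]))
    return candidates[best_source]


# Two hypothetical sensors that disagree about the same reading.
qualities = {
    "sensor_a": {"accuracy": 0.9, "freshness": 0.4, "completeness": 0.8},
    "sensor_b": {"accuracy": 0.6, "freshness": 0.9, "completeness": 0.7},
}
readings = {"sensor_a": 21.5, "sensor_b": 23.0}
chosen = resolve_conflict(readings, qualities)
```

With these made-up weights, `sensor_a` scores slightly higher than `sensor_b`, so its reading is kept. A fuzzy variant would replace the crisp criterion scores with fuzzy numbers and a suitable ranking step.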


CLINICAL DATA WAREHOUSE: A REVIEW

Iraqi Journal for Computers and Informatics ◽

10.25195/ijci.v44i2.53 ◽

2018 ◽

Vol 44 (2) ◽

pp. 16-26 ◽

Cited By ~ 1

Author(s):

Alaa Hamoud ◽

Ali Hashim ◽

Wid Awadh

Keyword(s):

Data Warehouse ◽

Clinical Data ◽

Clinical Information ◽

Heterogeneous Data ◽

Data Sources ◽

Security And Privacy ◽

Clinical Environment ◽

Privacy Concerns ◽

Clinical Data Warehouse ◽

Heterogeneous Data Sources

Clinical decisions are crucial because they are related to human lives. Thus, managers and decision makers in the clinical environment seek new solutions that can support their decisions. A clinical data warehouse (CDW) is an important solution that is used to achieve clinical stakeholders’ goals by merging heterogeneous data sources in a central repository and using this repository to find answers related to the strategic clinical domain, thereby supporting clinical decisions. CDW implementation faces numerous obstacles, starting with the data sources and ending with the tools that view the clinical information. This paper presents a systematic overview of purpose of CDWs as well as the characteristics; requirements; data sources; extract, transform and load (ETL) process; security and privacy concerns; design approach; architecture; and challenges and difficulties related to implementing a successful CDW. PubMed and Google Scholar are used to find papers related to CDW. Among the total of 784 papers, only 42 are included in the literature review. These papers are classified based on five perspectives, namely methodology, data, system, ETL tool and purpose, to find insights related to aspects of CDW. This review can contribute answers to questions related to CDW and provide recommendations for implementing a successful CDW.


The Research of Integration Technology of Heterogeneous Data Sources on Campus

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.160.234 ◽

2012 ◽

Vol 160 ◽

pp. 234-239

Author(s):

Xiao Xiao Liang

Keyword(s):

Information Technology ◽

Information System ◽

Information Systems ◽

Heterogeneous Data ◽

Data Sources ◽

Key Technology ◽

Integration Technology ◽

Heterogeneous Data Sources ◽

Data Source Integration ◽

Data Source

As campus information technology develops further, vast amounts of information must be stored and managed. However, this information is often stored in information systems with different structures, so integrating heterogeneous data sources is a problem that university information systems need to solve. This article describes the most commonly used technologies for integrating heterogeneous data sources, studies data source integration patterns that could be used on campus, and examines in depth the design of XML-based integration of heterogeneous data sources and some of its key technologies.


Semantic Integration and Querying of Heterogeneous Data Sources Using a Hypergraph Data Model

Lecture Notes in Computer Science - Advances in Databases ◽

10.1007/3-540-45495-0_19 ◽

2002 ◽

pp. 166-182 ◽

Cited By ~ 4

Author(s):

Dimitri Theodoratos

Keyword(s):

Data Model ◽

Heterogeneous Data ◽

Semantic Integration ◽

Data Sources ◽

Heterogeneous Data Sources


A relational data harmonization approach to XML

Journal of Information Science ◽

10.1177/0165551509104231 ◽

2009 ◽

Vol 35 (5) ◽

pp. 571-601 ◽

Cited By ~ 13

Author(s):

Timo Niemi ◽

Turkka Näppilä ◽

Kalervo Järvelin

Keyword(s):

Information Needs ◽

Ad Hoc ◽

Heterogeneous Data ◽

Data Sources ◽

Similar Data ◽

Xml Data ◽

Processing Style ◽

Heterogeneous Data Sources ◽

Autonomous Data Sources ◽

Data Source

There are numerous approaches for integrating data from heterogeneous data sources. A common background assumption is that the data sources remain quite stable and are known in advance. Hence an integration system can be built to manipulate them. In practice, however, there is often a demand for supporting ad hoc information needs concerning unexpected autonomous data sources containing volatile data. A different approach is therefore needed. We propose that semantically similar data are harmonized when extracting data from XML-based data sources. We introduce a constructor algebra, which is a powerful tool in the harmonization of XML data. For any XML data source, this algebra is able to form a unique relational representation, called an XML relation. We demonstrate that the XML relation representation supports the grouping and aggregation of data needed, for example, in OLAP-style (online analytical processing) applications.
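The general idea of turning XML data into a relational form that supports grouping and aggregation can be illustrated with a small sketch. It does not implement the constructor algebra or the XML relation described in the abstract. It simply flattens a hypothetical XML document into tuples with Python's standard library and then sums over groups. The element names, attribute names and values are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical XML source; element and attribute names are made up.
XML_DOC = """
<sales>
  <sale region="north"><product>tea</product><amount>10</amount></sale>
  <sale region="south"><product>tea</product><amount>5</amount></sale>
  <sale region="north"><product>coffee</product><amount>7</amount></sale>
</sales>
"""


def to_rows(xml_text):
    """Flatten each <sale> element into a (region, product, amount) tuple."""
    root = ET.fromstring(xml_text)
    return [
        (sale.get("region"), sale.findtext("product"), float(sale.findtext("amount")))
        for sale in root.iter("sale")
    ]


def total_by(rows, column):
    """Group rows by one column index and sum the amount column (index 2)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[column]] += row[2]
    return dict(totals)


rows = to_rows(XML_DOC)
by_region = total_by(rows, 0)    # roll up amounts per region
by_product = total_by(rows, 1)   # roll up amounts per product
```

Once the data are in tuple form, the same rows can be rolled up along either column, which is the kind of operation OLAP-style analysis relies on.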
