Context-based Data Quality Metrics in Data Warehouse Systems

The fact that Data Quality (DQ) depends on the context, in which data are produced, stored and used, is widely recognized in the research community. Data Warehouse Systems (DWS), whose main goal is to give support to decision making based on data, have had a huge growth in the last years, in research and industry. DQ in this kind of systems becomes essential. This work presents a proposal for identifying DQ problems in the domain of DWS, considering the different contexts that exist in each system component. This proposal may act as a first conceptual framework that guides the DQ-responsible in the management of DQ in DWS. The main contributions of this work are a thorough literature review about how contexts are used for evaluating DQ in DWS, and a proposal for assessing DQ in DWS through context-based DQ metrics.

Download Full-text

Specification for data quality metrics for industrial measurement and control systems

10.3403/02317420 ◽

2001 ◽

Keyword(s):

Data Quality ◽

Control Systems ◽

Quality Metrics ◽

Data Quality Metrics ◽

And Control ◽

Measurement And Control ◽

Industrial Measurement

Download Full-text

Design of Marine Data Warehouse ETL System

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.668-669.1374 ◽

2014 ◽

Vol 668-669 ◽

pp. 1374-1377 ◽

Cited By ~ 1

Author(s):

Wei Jun Wen

Keyword(s):

Decision Making ◽

Data Quality ◽

Data Warehouse ◽

Environmental Data ◽

Data Sources ◽

Quality Data ◽

Critical Step ◽

Marine Data ◽

Data Extracting ◽

Marine Environmental

ETL refers to the process of data extracting, transformation and loading and is deemed as a critical step in ensuring the quality, data specification and standardization of marine environmental data. Marine data, due to their complication, field diversity and huge volume, still remain decentralized, polyphyletic and isomerous with different semantics and hence far from being able to provide effective data sources for decision making. ETL enables the construction of marine environmental data warehouse in the form of cleaning, transformation, integration, loading and periodic updating of basic marine data warehouse. The paper presents a research on rules for cleaning, transformation and integration of marine data, based on which original ETL system of marine environmental data warehouse is so designed and developed. The system further guarantees data quality and correctness in analysis and decision-making based on marine environmental data in the future.

Download Full-text

An Algebraic Approach to Data Quality Metrics for Entity Resolution over Large Datasets

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch196 ◽

2008 ◽

pp. 3067-3084

Author(s):

John Talburt ◽

Richard Wang ◽

Kimberly Hess ◽

Emily Kuo

Keyword(s):

Data Quality ◽

Algebraic Approach ◽

Entity Resolution ◽

Quality Metrics ◽

Quality Literature ◽

Intrinsic Quality ◽

Recognition Systems ◽

Entity Identification ◽

Data Quality Metrics

This chapter introduces abstract algebra as a means of understanding and creating data quality metrics for entity resolution, the process in which records determined to represent the same real-world entity are successively located and merged. Entity resolution is a particular form of data mining that is foundational to a number of applications in both industry and government. Examples include commercial customer recognition systems and information sharing on “persons of interest” across federal intelligence agencies. Despite the importance of these applications, most of the data quality literature focuses on measuring the intrinsic quality of individual records than the quality of record grouping or integration. In this chapter, the authors describe current research into the creation and validation of quality metrics for entity resolution, primarily in the context of customer recognition systems. The approach is based on an algebraic view of the system as creating a partition of a set of entity records based on the indicative information for the entities in question. In this view, the relative quality of entity identification between two systems can be measured in terms of the similarity between the partitions they produce. The authors discuss the difficulty of applying statistical cluster analysis to this problem when the datasets are large and propose an alternative index suitable for these situations. They also report some preliminary experimental results, and outlines areas and approaches to further research in this area.

Download Full-text

Prioritizing technical debt in database normalization using portfolio theory and data quality metrics

Proceedings of the 2018 International Conference on Technical Debt - TechDebt '18 ◽

10.1145/3194164.3194170 ◽

2018 ◽

Cited By ~ 2

Author(s):

Mashel Albarak ◽

Rami Bahsoon

Keyword(s):

Data Quality ◽

Portfolio Theory ◽

Quality Metrics ◽

Technical Debt ◽

Data Quality Metrics ◽

Database Normalization

Download Full-text

Visual Interactive Creation, Customization, and Analysis of Data Quality Metrics

Journal of Data and Information Quality ◽

10.1145/3190578 ◽

2018 ◽

Vol 10 (1) ◽

pp. 1-26 ◽

Cited By ~ 2

Author(s):

Christian Bors ◽

Theresia Gschwandtner ◽

Simone Kriglstein ◽

Silvia Miksch ◽

Margit Pohl

Keyword(s):

Data Quality ◽

Quality Metrics ◽

Data Quality Metrics

Download Full-text

Multiple Switching and Data Quality in the Multiple Price List

Review of Economics and Statistics ◽

10.1162/rest_a_00895 ◽

2020 ◽

pp. 1-15

Author(s):

Chi Wai Yu ◽

Y. Jane Zhang ◽

Sharon Xuejing Zuo

Keyword(s):

Decision Making ◽

Data Quality ◽

Conceptual Framework ◽

Cognitive Ability ◽

Risky Choice ◽

Switching Behavior ◽

Price List ◽

High Cognitive Ability ◽

Multiple Switching ◽

Quality Decision

A substantial proportion of individuals who complete the widely used multiple price list (MPL) instrument switch back and forth between the safe and the risky choice columns, behavior that is believed to indicate lowquality decision making. We develop a conceptual framework to formally define decision-making quality, test explanations for the nature of low-quality decision making, and introduce a novel “nudge” treatment that reduced multiple switching behavior and increased decision-making quality. We find evidence in support of task-specific miscomprehension of the MPL and that nonmultiple switchers and relatively high-cognitive-ability individuals are not immune to low-quality decision making.

Download Full-text

Recommendations for Mass Spectrometry Data Quality Metrics for Open Access Data (Corollary to the Amsterdam Principles)

Journal of Proteome Research ◽

10.1021/pr201071t ◽

2011 ◽

Vol 11 (2) ◽

pp. 1412-1419 ◽

Cited By ~ 21

Author(s):

Christopher R. Kinsinger ◽

James Apffel ◽

Mark Baker ◽

Xiaopeng Bian ◽

Christoph H. Borchers ◽

...

Keyword(s):

Mass Spectrometry ◽

Open Access ◽

Data Quality ◽

Quality Metrics ◽

Mass Spectrometry Data ◽

Open Access Data ◽

Data Quality Metrics ◽

Access Data

Download Full-text

Investment Cost Model in Business Process Intelligence in Banking And Electricity Company

ComTech Computer Mathematics and Engineering Applications ◽

10.21512/comtech.v7i2.2248 ◽

2016 ◽

Vol 7 (2) ◽

pp. 113

Author(s):

Arta Moro Sundjaja

Keyword(s):

Decision Making ◽

Literature Review ◽

Data Warehouse ◽

Business Process ◽

Cost Model ◽

Process Performance ◽

Design Development ◽

Business Process Intelligence ◽

Investment Cost ◽

Decision Making Processes

Higher demand from the top management in measuring business process performance causes the incremental implementation of BPM and BI in the enterprise. The problem faced by top managements is how to integrate their data from all system used to support the business and process the data become information that able to support the decision-making processes. Our literature review elaborates several implementations of BPI on companies in Australia and Germany, challenges faced by organizations in developing BPI solution in their organizations and some cost model to calculate the investment of BPI solutions. This paper shows the success in BPI application of banks and assurance companies in German and electricity work in Australia aims to give a vision about the importance of BPI application. Many challenges in BPI application of companies in German and Australia, BPI solution, and data warehouse design development have been discussed to add insight in future BPI development. And the last is an explanation about how to analyze cost associated with BPI solution investment.

Download Full-text