Data quality: Experiences and lessons from operationalizing big data

Author(s): Archana Ganapathi, Yanpei Chen
Keyword(s): Big Data

Author(s): Christopher D O’Connor, John Ng, Dallas Hill, Tyler Frederick

Policing is increasingly being shaped by data collection and analysis. However, we still know little about the quality of the data police services acquire and utilize. Drawing on a survey of analysts from across Canada, this article examines several data collection, analysis, and quality issues. We argue that, as we move towards an era of big data policing, it is imperative that police services pay more attention to the quality of the data they collect. We conclude by discussing the implications of ignoring data quality issues and the need to develop a more robust research culture in policing.


2021, Vol 8 (1)
Author(s): Ikbal Taleb, Mohamed Adel Serhani, Chafik Bouhaddioui, Rachida Dssouli

Abstract: Big Data is an essential research area for governments, institutions, and private agencies to support their analytics decisions. Big Data covers all aspects of data: how it is collected, processed, and analyzed to generate value-added, data-driven insights and decisions. Degradation in data quality may have unpredictable consequences, as confidence in the data and its source is lost. In the Big Data context, data characteristics such as volume, multiple heterogeneous data sources, and fast data generation increase the risk of quality degradation and require efficient mechanisms to check data worthiness. However, ensuring Big Data Quality (BDQ) is a costly and time-consuming process, since it demands excessive computing resources. Maintaining quality throughout the Big Data lifecycle requires quality profiling and verification before any processing decision. A BDQ Management Framework is proposed for enhancing pre-processing activities while strengthening data control. The framework introduces a new concept, the Big Data Quality Profile, which captures the quality outline, requirements, attributes, dimensions, scores, and rules. Using the framework's Big Data profiling and sampling components, a faster and more efficient data quality estimation is initiated before and after an intermediate pre-processing phase. The framework's exploratory profiling component plays an initial role in quality profiling; it uses a set of predefined quality metrics to evaluate important data quality dimensions and generates quality rules by applying various pre-processing activities and their related functions. These rules feed the Data Quality Profile and produce quality scores for the selected quality attributes. The implementation of the framework and its dataflow management across the various quality management processes are discussed, and the paper concludes with ongoing work on framework evaluation and deployment to support quality evaluation decisions.
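To make the profiling idea concrete, here is a minimal sketch of how a quality profile and sampling-based score estimation might look. All names (QualityRule, DataQualityProfile, estimate_quality) and the record layout are hypothetical illustrations under stated assumptions, not the authors' actual implementation.

```python
# Minimal sketch: a quality profile plus sampling-based estimation.
# All names and the rule layout are hypothetical, for illustration only.
from dataclasses import dataclass, field
from typing import Callable, Dict, List
import random

@dataclass
class QualityRule:
    """A predefined quality metric targeting one dimension/attribute."""
    dimension: str                 # e.g. "completeness", "accuracy"
    attribute: str                 # data attribute the rule targets
    check: Callable[[dict], bool]  # True if a record passes the rule

@dataclass
class DataQualityProfile:
    """Captures quality requirements, rules, and resulting scores."""
    requirements: Dict[str, float]  # dimension -> minimum acceptable score
    rules: List[QualityRule] = field(default_factory=list)
    scores: Dict[str, float] = field(default_factory=dict)

def estimate_quality(records: List[dict],
                     profile: DataQualityProfile,
                     sample_size: int = 1000) -> Dict[str, float]:
    """Estimate per-dimension scores on a random sample instead of a
    full (costly) pass over the data set."""
    sample = random.sample(records, min(sample_size, len(records)))
    for rule in profile.rules:
        passed = sum(rule.check(r) for r in sample)
        profile.scores[rule.dimension] = passed / len(sample)
    return profile.scores

# Usage: one completeness rule on a hypothetical "customer_id" attribute.
profile = DataQualityProfile(requirements={"completeness": 0.95})
profile.rules.append(QualityRule(
    dimension="completeness", attribute="customer_id",
    check=lambda r: r.get("customer_id") not in (None, "")))
records = [{"customer_id": "c1"}, {"customer_id": None}]
print(estimate_quality(records, profile))  # e.g. {'completeness': 0.5}
```

A pre-processing decision would then compare each estimated score against the profile's minimum requirements before committing full computing resources to the data set.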


2017, Vol 16 (02), pp. C05
Author(s): Stuart Allan, Joanna Redden

This article examines certain guiding tenets of science journalism in the era of big data by focusing on its engagement with citizen science. Having placed citizen science in historical context, it highlights early interventions intended to help establish the basis for an alternative epistemological ethos recognising the scientist as citizen and the citizen as scientist. Next, the article assesses further implications for science journalism by examining the challenges posed by big data in the realm of citizen science. Pertinent issues include potential risks associated with data quality, access dynamics, the difficulty of investigating algorithms, and concerns about certain constraints impacting transparency and accountability.


2018, Vol 44 (6), pp. 785-801
Author(s): Hong Huang

This article aims to understand the views of genomic scientists with regard to the data quality assurances associated with semiotics and data–information–knowledge (DIK). The communication of signs generated from genomic curation work was found within different semiotic levels of DIK that correlate specific data quality dimensions with their respective skills. Syntactic data quality dimensions were ranked highest among all semiotic data quality dimensions, indicating that scientists devote great effort to data wrangling activities in genome curation work. Semantic- and pragmatic-related sign communications concerned meaningful interpretation and thus required additional adaptive and interpretative skills to deal with data quality issues. This expanded concept of 'curation' as sign/semiotic had not previously been explored from practical or theoretical perspectives. The findings inform policy makers and practitioners developing frameworks and cyberinfrastructure that facilitate 'Big Data to Knowledge' initiatives and advocacy by funding agencies. The findings can also help plan data quality assurance policies and thus maximise the efficiency of genomic data management. Our results strongly support the relevance of data quality skills communication to data quality assurance in genome curation activities.
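As a concrete illustration of the three semiotic levels the study contrasts, the sketch below applies a syntactic, a semantic, and a pragmatic check to a hypothetical genomic curation record. The field names and check criteria are assumptions for illustration only, not taken from the study.

```python
# Illustrative sketch of syntactic, semantic, and pragmatic quality
# checks on a hypothetical genomic record. Fields and criteria are
# assumptions, not the study's instruments.
import re

record = {"gene_symbol": "BRCA1",
          "sequence": "ATGGATTTATCTGCT",
          "organism": "Homo sapiens"}

def syntactic_check(rec: dict) -> bool:
    """Syntactic level: is the data well-formed? Here, does the
    sequence contain only valid nucleotide characters?"""
    return re.fullmatch(r"[ACGTN]+", rec["sequence"]) is not None

def semantic_check(rec: dict, known_symbols: set) -> bool:
    """Semantic level: is the data meaningful? Here, does the gene
    symbol appear in a curated vocabulary?"""
    return rec["gene_symbol"] in known_symbols

def pragmatic_check(rec: dict, min_length: int = 10) -> bool:
    """Pragmatic level: is the data fit for its intended use? Here,
    is the sequence long enough for a downstream analysis?"""
    return len(rec["sequence"]) >= min_length

print(syntactic_check(record))                    # True
print(semantic_check(record, {"BRCA1", "TP53"}))  # True
print(pragmatic_check(record))                    # True
```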


2021, Vol 23 (06), pp. 1011-1018
Author(s): Aishrith P Rao, Raghavendra J C, Dr. Sowmyarani C N, Dr. Padmashree T, ...

With the advancement of technology and the large volume of data produced, processed, and stored, it is becoming increasingly important to maintain the quality of data in a cost-effective and productive manner. The most important aspects of Big Data (BD) are storage, processing, privacy, and analytics. The Big Data community has identified quality as a critical aspect of its maturity. Quality management is an approach that should be adopted early in the lifecycle and gradually extended to other primary processes. Companies rely heavily on, and drive profits from, the huge amounts of data they collect. When data quality deteriorates, the ramifications are uncertain and may lead to completely undesirable conclusions. In the Big Data context, determining data quality is difficult, but it is essential to uphold data quality before proceeding with any analytics. In this paper, we investigate data quality during the data gathering, pre-processing, data repository, and evaluation/analysis stages of BD processing. Related solutions are also suggested based on a review of the problems identified.
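As a rough illustration of the stage-by-stage investigation, the sketch below attaches one simple check to each of the four stages named in the abstract. The stage names come from the abstract; the checks and record layout are illustrative assumptions.

```python
# Sketch: quality checks tied to the four BD lifecycle stages above.
# The checks and record layout are illustrative assumptions.
from typing import Callable, Dict, List

Check = Callable[[List[dict]], bool]

STAGE_CHECKS: Dict[str, Check] = {
    # Gathering: did the source deliver any data at all?
    "gathering": lambda batch: len(batch) > 0,
    # Pre-processing: does every record carry the expected key?
    "preprocessing": lambda batch: all("id" in r for r in batch),
    # Data repository: are the keys unique before storage?
    "repository": lambda batch: len({r.get("id") for r in batch}) == len(batch),
    # Evaluation/analysis: are the analysed fields populated?
    "analysis": lambda batch: all(r.get("value") is not None for r in batch),
}

def assess(batch: List[dict]) -> Dict[str, bool]:
    """Report pass/fail for each lifecycle stage's check."""
    return {stage: check(batch) for stage, check in STAGE_CHECKS.items()}

batch = [{"id": 1, "value": 3.2}, {"id": 2, "value": 4.8}]
print(assess(batch))
# {'gathering': True, 'preprocessing': True, 'repository': True, 'analysis': True}
```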

