Perspective of anomaly detection in big data for data quality improvement

Author(s):
Vinaya Keskar, Jyoti Yadav, Ajay Kumar
Author(s):
Christopher D O’Connor, John Ng, Dallas Hill, Tyler Frederick

Policing is increasingly being shaped by data collection and analysis. However, we still know little about the quality of the data police services acquire and utilize. Drawing on a survey of analysts from across Canada, this article examines several data collection, analysis, and quality issues. We argue that as we move towards an era of big data policing, it is imperative that police services pay more attention to the quality of the data they collect. We conclude by discussing the implications of ignoring data quality issues and the need to develop a more robust research culture in policing.


2021, Vol. 8 (1)
Author(s):  
Ikbal Taleb, Mohamed Adel Serhani, Chafik Bouhaddioui, Rachida Dssouli

Abstract: Big Data is an essential research area for governments, institutions, and private agencies that rely on analytics to support their decisions. Big Data concerns how data is collected, processed, and analyzed to generate value-added, data-driven insights and decisions. Degradation in data quality may have unpredictable consequences: confidence in the data and trust in its source are lost. In the Big Data context, data characteristics such as volume, heterogeneous data sources, and fast data generation increase the risk of quality degradation and require efficient mechanisms to check data worthiness. However, ensuring Big Data Quality (BDQ) is a costly and time-consuming process, since it demands considerable computing resources. Maintaining quality across the Big Data lifecycle requires quality profiling and verification before any processing decision. A BDQ Management Framework is proposed that enhances pre-processing activities while strengthening data control. The framework rests on a new concept, the Big Data Quality Profile, which captures the quality outline, requirements, attributes, dimensions, scores, and rules. Using the framework's profiling and sampling components, a fast and efficient data quality estimation is initiated before and after an intermediate pre-processing phase. The exploratory profiling component plays the initial role in quality profiling: it applies a set of predefined quality metrics to evaluate important data quality dimensions, and it generates quality rules tied to the various pre-processing activities and their related functions. These rules feed the Data Quality Profile and yield quality scores for the selected quality attributes. The paper discusses the framework's implementation and the dataflow management across its quality management processes, and concludes with ongoing work on framework evaluation and deployment to support quality evaluation decisions.
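The idea of estimating quality scores on a sample and deriving pre-processing rules from them can be sketched as follows. This is a minimal illustration, not the paper's actual framework: the function names (`build_quality_profile`, `completeness`, `validity`), the 0.95 threshold, and the `age` field are all assumptions chosen for the example.

```python
# Minimal sketch of a "Data Quality Profile": quality scores are
# estimated on a random sample before committing full pre-processing
# resources. All names and thresholds here are illustrative.
import random

def completeness(records, field):
    """Fraction of records where `field` is present and non-null."""
    present = sum(1 for r in records if r.get(field) is not None)
    return present / len(records)

def validity(records, field, predicate):
    """Fraction of non-null values satisfying a validity rule."""
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return 0.0
    return sum(1 for v in values if predicate(v)) / len(values)

def build_quality_profile(records, sample_size, seed=0):
    """Estimate quality scores on a sample, then derive simple
    pre-processing rules: any dimension scoring below a threshold
    is flagged for a cleansing activity."""
    random.seed(seed)
    sample = random.sample(records, min(sample_size, len(records)))
    scores = {
        "completeness(age)": completeness(sample, "age"),
        "validity(age in 0..120)": validity(sample, "age",
                                            lambda a: 0 <= a <= 120),
    }
    rules = [dim for dim, s in scores.items() if s < 0.95]
    return {"scores": scores, "rules": rules}
```

On a toy dataset with one missing and one out-of-range `age`, the profile reports completeness and validity below the threshold and flags both dimensions for pre-processing.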


Author(s):  
Alla Andrianova, Maxim Simonov, Dmitry Perets, Andrey Margarit, Darya Serebryakova, ...

Author(s):  
Suranga C. H. Geekiyanage, Dan Sui, Bernt S. Aadnoy

Drilling industry operations depend heavily on digital information. Data analysis is the process of acquiring, transforming, interpreting, modelling, displaying, and storing data in order to extract useful information, so that decision-making, action execution, event detection, and incident management can be handled efficiently and reliably. This paper provides an approach to understanding, cleansing, improving, and interpreting post-well or real-time data so as to preserve or enhance data features such as accuracy, consistency, reliability, and validity. Data quality management is treated as a process with three major phases. Phase I is a pre-evaluation of data quality that identifies issues such as missing or incomplete data, non-standard or invalid data, and redundant data. Phase II implements data quality management practices such as filtering, data assimilation, and data reconciliation to improve data accuracy and uncover useful information. Phase III is a post-evaluation conducted to assure data quality and enhance system performance. In this study, a laboratory-scale drilling rig with a control system capable of drilling is used for data acquisition and quality improvement. Safe and efficient performance of such a control system relies heavily on the quality and sufficient availability of the data obtained while drilling. The available measurements are pump pressure, top-drive rotational speed, weight on bit, drill-string torque, and bit depth. The data analysis is challenged by issues such as data corruption due to noise, time delays, missing or incomplete data, and external disturbances. To resolve these issues, different data quality improvement practices are applied and tested; such techniques help an intelligent system achieve better decision-making and quicker fault detection.
The study on the laboratory-scale drilling rig clearly demonstrates that a proper data quality management process, together with a clear understanding of signal-processing methods, is needed to carry out intelligent digitalization in the oil and gas industry.
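The three-phase workflow described above can be sketched on a single drilling channel such as pump pressure. This is an illustrative example only: the choice of a 3-point median filter, linear interpolation, and the valid range are assumptions for the sketch, not the rig's actual configuration.

```python
# Illustrative three-phase quality workflow on one drilling channel
# (e.g. pump pressure in MPa). Filter choice and thresholds are
# assumptions, not the paper's actual configuration.

def evaluate(signal, lo, hi):
    """Phases I and III: count missing and out-of-range samples."""
    missing = sum(1 for v in signal if v is None)
    invalid = sum(1 for v in signal if v is not None and not (lo <= v <= hi))
    return {"missing": missing, "invalid": invalid}

def interpolate_missing(signal):
    """Phase II(a): fill gaps from the nearest non-missing neighbours."""
    out = list(signal)
    for i, v in enumerate(out):
        if v is None:
            left = next((out[j] for j in range(i - 1, -1, -1)
                         if out[j] is not None), None)
            right = next((out[j] for j in range(i + 1, len(out))
                          if out[j] is not None), None)
            if left is None:
                out[i] = right
            elif right is None:
                out[i] = left
            else:
                out[i] = (left + right) / 2
    return out

def median_filter(signal, k=3):
    """Phase II(b): k-point median filter to suppress noise spikes."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - k // 2): i + k // 2 + 1]
        out.append(sorted(window)[len(window) // 2])
    return out
```

Running the pre-evaluation on a raw trace with one dropped sample and one pressure spike flags both issues; after interpolation and median filtering, the post-evaluation reports a clean signal.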

