Models and data quality in information systems applicable in the mining industry

2021 ◽  
Vol 280 ◽  
pp. 08012
Author(s):  
Yordanka Anastasova ◽  
Nikolay Yanev

The purpose of this article is to present modern approaches to data storage and processing, as well as technologies to achieve the quality of data needed for specific purposes in the mining industry. With regard to data format, the article examines NoSQL and NewSQL technologies, with the focus shifting from the use of common solutions (traditional RDBMS) to specific ones aimed at integrating data into industrial information systems. The information systems used in the mining industry are characterized by their specificity and diversity, which is a prerequisite for integrating NoSQL data models into them, given the flexibility of these models. In modern industrial information systems, data is considered high-quality if it actually reflects the described object and serves to make effective management decisions. The article also discusses the criteria for data quality from the point of view of information technology and that of its users. Technologies are also presented that provide an optimal set of necessary functions to ensure the desired quality of data in the information systems applicable in the industry. The format and quality of data in client-server based information systems are of particular importance, especially given the dynamics of data input and processing in information systems used in the mining industry.

Author(s):  
Benjamin Ngugi ◽  
Jafar Mana ◽  
Lydia Segal

As the nation confronts a growing tide of security breaches, the importance of having quality data breach information systems becomes paramount. Yet too little attention is paid to evaluating these systems. This article draws on data quality scholarship to develop a yardstick that assesses the quality of data breach notification systems in the U.S. at both the state and national levels from the perspective of key stakeholders, who include law enforcement agencies, consumers, shareholders, investors, researchers, and businesses that sell security products. Findings reveal major shortcomings that reduce the value of data breach information to these stakeholders. The study concludes with detailed recommendations for reform.


2018 ◽  
Vol 10 (1) ◽  
Author(s):  
Brian E. Dixon ◽  
Chen Wen ◽  
Tony French ◽  
Jennifer Williams ◽  
Shaun J. Grannis

Objective: To extend an open source analytics and visualization platform for measuring the quality of electronic health data transmitted to syndromic surveillance systems.
Introduction: Effective clinical and public health practice in the twenty-first century requires access to data from an increasing array of information systems. However, the quality of data in these systems can be poor or “unfit for use.” Therefore measuring and monitoring data quality is an essential activity for clinical and public health professionals as well as researchers [1]. Current methods for examining data quality largely rely on manual queries and processes conducted by epidemiologists. Better, automated tools for examining data quality are desired by the surveillance community.
Methods: Using the existing, open-source platform Atlas developed by the Observational Health Data Sciences and Informatics collaborative (OHDSI; www.ohdsi.org), we added new functionality to measure and visualize the quality of data electronically reported from disparate information systems. Our extensions focused on analysis of data reported electronically to public health agencies for disease surveillance. Specifically, we created methods for examining the completeness and timeliness of data reported as well as the information entropy of the data within syndromic surveillance messages sent from emergency department information systems.
Results: To date we transformed 111 million syndromic surveillance message segments pertaining to 16.4 million emergency department encounters representing 6 million patients into the OHDSI common data model. We further measured completeness, timeliness and entropy of the syndromic surveillance data. In Figure-1, the OHDSI tool Atlas summarizes the analysis of data completeness for key fields in over one million syndromic surveillance messages sent to Indiana’s health department in 2014. Completeness is reported by age category (e.g., 0-10, 20-30, 60+). Gender is generally complete, but both race and ethnicity fields are often complete for less than half of the patients in the cohort. These results suggest areas for improvement with respect to data quality that could be actionable by the syndromic surveillance coordinator at the state health department.
Conclusions: Our project remains a work-in-progress. While functions that assess completeness, timeliness and entropy are complete, there may be other functions important to public health that need to be developed. We are currently soliciting feedback from syndromic surveillance stakeholders to gather ideas for what other functions would be useful to epidemiologists. Suggestions could be developed into functions over the next year. We are further working with the OHDSI collaborative to distribute the Atlas enhancements to other platforms, including the National Syndromic Surveillance Platform (NSSP). Our goal is to enable epidemiologists to quickly analyze data quality at scale.
References: 1. Dixon BE, Rosenman M, Xia Y, Grannis SJ. A vision for the systematic monitoring and improvement of the quality of electronic health data. Studies in health technology and informatics. 2013;192:884-8.
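
The completeness and entropy measures named in this abstract can be illustrated with a minimal sketch. It is not the Atlas/OHDSI implementation; the record layout (a list of dictionaries), the field names, and the function names `completeness` and `shannon_entropy` are assumptions made for illustration only. Completeness is taken here as the share of records with a non-empty value, and entropy as the Shannon entropy of the observed values.

```python
import math
from collections import Counter

# Hypothetical helper: share of records in which `field` is present and non-empty.
def completeness(records, field):
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)

# Hypothetical helper: Shannon entropy (in bits) of the non-empty values of `field`.
def shannon_entropy(records, field):
    values = [r[field] for r in records if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Made-up example messages; real syndromic surveillance messages carry many more fields.
messages = [
    {"gender": "F", "race": "White"},
    {"gender": "M", "race": ""},
    {"gender": "F", "race": None},
    {"gender": "M", "race": "Black"},
]

gender_completeness = completeness(messages, "gender")
race_completeness = completeness(messages, "race")
gender_entropy = shannon_entropy(messages, "gender")
```

A field that is always filled with the same value would score high on completeness but zero on entropy, which is one reason the two measures are reported side by side.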


Author(s):  
Diego Milano

Data quality is a complex concept defined by various dimensions such as accuracy, currency, completeness, and consistency (Wang & Strong, 1996). Recent research has highlighted the importance of data quality issues in various contexts. In particular, in some specific environments characterized by extensive data replication, high quality of data is a strict requirement. Among such environments, this article focuses on cooperative information systems. Cooperative information systems (CISs) are distributed and heterogeneous information systems that cooperate by sharing information, constraints, and goals (Mylopoulos & Papazoglou, 1997). Quality of data is a necessary requirement for a CIS. Indeed, a system in the CIS will not easily exchange data with another system without knowledge of the quality of data provided by the other system, which results in reduced cooperation. Also, when the quality of exchanged data is poor, there is a progressive deterioration of the overall data quality in the CIS. On the other hand, the high degree of data replication that characterizes a CIS can be exploited for improving data quality, as different copies of the same data may be compared in order to detect quality problems and possibly solve them. In Scannapieco, Virgillito, Marchetti, Mecella, and Baldoni (2004) and Mecella et al. (2003), the DaQuinCIS architecture is described as an architecture for managing data quality in cooperative contexts, in order to avoid the spread of low-quality data and to exploit data replication for the improvement of the overall quality of cooperative data. In this article we describe the design of a component of our system named the quality factory. The purpose of the quality factory is to evaluate the quality of the XML data sources of the cooperative system.
While the need for such a component had been previously identified, this article first presents the design of the quality factory and proposes an overall methodology to evaluate the quality of XML data sources. Quality values measured by the quality factory are used by the data quality broker. The data quality broker has two main functionalities: 1) quality brokering that allows users to select data in the CIS according to their quality; 2) quality improvement that diffuses best quality copies of data in the CIS.
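
The two broker functions listed in this abstract, selecting data by quality and identifying the best-quality copy to diffuse, can be sketched as follows. This is not the DaQuinCIS design: the copy structure, the per-dimension scores in the range 0 to 1, the unweighted average used as an overall score, and the names `overall_quality`, `select_copies`, and `best_copy` are all assumptions for illustration.

```python
# Assumption: each organization exposes one copy of a record plus quality scores in [0, 1].
copies = [
    {"source": "org_a", "quality": {"accuracy": 0.9, "completeness": 0.6}},
    {"source": "org_b", "quality": {"accuracy": 0.8, "completeness": 0.9}},
    {"source": "org_c", "quality": {"accuracy": 0.5, "completeness": 0.7}},
]

# Assumption: overall quality is the unweighted mean of the dimension scores.
def overall_quality(scores):
    return sum(scores.values()) / len(scores)

# Quality brokering (sketch): keep only copies that meet a user-supplied threshold.
def select_copies(candidates, min_quality):
    return [c for c in candidates if overall_quality(c["quality"]) >= min_quality]

# Quality improvement (sketch): pick the best copy, which could then be sent to the other sources.
def best_copy(candidates):
    return max(candidates, key=lambda c: overall_quality(c["quality"]))

acceptable_sources = [c["source"] for c in select_copies(copies, 0.7)]
best_source = best_copy(copies)["source"]
```

A real broker would likely weight dimensions differently per user and compare copies value by value; the mean is used here only to keep the sketch short.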


Author(s):  
Nishita Shewale

Abstract: Introducing unified information systems gives establishments insight into how data-related activities take place and into their results, with assured quality. Data accumulation, replication, missing entities, incorrect formatting, anomalies, etc. can come to light when data are collected in different information systems, and they can cause an array of adverse effects on data quality, so the subject of data quality deserves closer treatment. This paper inspects the data quality problems in information systems and introduces new techniques that enable organizations to improve the quality of their data. Keywords: Information Systems (IS), Data Quality, Data Cleaning, Data Profiling, Standardization, Database, Organization


2017 ◽  
Vol 4 (1) ◽  
pp. 25-31 ◽  
Author(s):  
Diana Effendi

Information Product Approach (IP Approach) is an information management approach. It can be used to manage product information and data quality analysis. IP-Map can be used by organizations to facilitate the management of knowledge in collecting, storing, maintaining, and using data in an organized manner. The data management process for academic activities at X University has not yet used the IP approach. X University has not paid attention to managing the quality of its information. Until now, X University has been concerned only with the system applications used to support the automation of data management in the process of academic activities. The IP-Map made in this paper can be used as a basis for analyzing the quality of data and information. With the IP-Map, X University is expected to know which parts of the process need improvement in the quality of data and information management. Index terms: IP Approach, IP-Map, information quality, data quality.


2021 ◽  
pp. 004912412199553
Author(s):  
Jan-Lucas Schanze

An increasing age of respondents and cognitive impairment are usual suspects for increasing difficulties in survey interviews and decreasing data quality. This is why survey researchers tend to label residents in retirement and nursing homes as hard-to-interview and exclude them from most social surveys. In this article, I examine to what extent this label is justified and whether the quality of data collected among residents in institutions for the elderly really differs from data collected within private households. For this purpose, I analyze the response behavior and quality indicators in three waves of the Survey of Health, Ageing and Retirement in Europe. To control for confounding variables, I use propensity score matching to identify respondents in private households who share similar characteristics with institutionalized residents. My results confirm that most indicators of response behavior and data quality are worse in institutions compared to private households. However, when controlling for sociodemographic and health-related variables, the differences become very small. These results suggest the importance of health for data quality irrespective of the housing situation.


2008 ◽  
Vol 13 (5) ◽  
pp. 378-389 ◽  
Author(s):  
Xiaohua Douglas Zhang ◽  
Amy S. Espeseth ◽  
Eric N. Johnson ◽  
Jayne Chin ◽  
Adam Gates ◽  
...  

RNA interference (RNAi) not only plays an important role in drug discovery but can also be developed directly into drugs. RNAi high-throughput screening (HTS) biotechnology allows us to conduct genome-wide RNAi research. A central challenge in genome-wide RNAi research is to integrate both experimental and computational approaches to obtain high quality RNAi HTS assays. Based on our daily practice in RNAi HTS experiments, we propose the implementation of 3 experimental and analytic processes to improve the quality of data from RNAi HTS biotechnology: (1) select effective biological controls; (2) adopt appropriate plate designs to display and/or adjust for systematic errors of measurement; and (3) use effective analytic metrics to assess data quality. The applications in 5 real RNAi HTS experiments demonstrate the effectiveness of integrating these processes to improve data quality. Due to the effectiveness in improving data quality in RNAi HTS experiments, the methods and guidelines contained in the 3 experimental and analytic processes are likely to have broad utility in genome-wide RNAi research. (Journal of Biomolecular Screening 2008:378-389)


Tunas Agraria ◽  
2021 ◽  
Vol 4 (2) ◽  
pp. 168-174
Author(s):  
Maslusatun Mawadah

The South Jakarta Administrative City Land Office is one of the cities targeted to be a city with complete land administration in 2020. The current condition of land parcel data demands an update, namely improving the quality of data from KW1 to KW6 towards KW1 valid. The purpose of this study is to determine the condition of land data quality in South Jakarta, the implementation of data quality improvement, as well as problems and solutions in implementing data quality improvement. The research method used is qualitative with a descriptive approach. The results showed that the condition of the data quality after the implementation of the improvement, namely KW1 increased from 86.45% to 87.01%. The roles of man, material, machine, and method have been fulfilled and the implementation of data quality improvement is not in accordance with the 2019 Complete City Guidelines in terms of territorial boundary inventory, and there are still obstacles in the implementation of improving the quality of land parcel data, namely the absence of buku tanah, surat ukur, and gambar ukur at the land office, the existence of regional division, the boundaries of the sub district are not yet certain, and the existence of land parcels that have been separated from mapping without being noticed by the office administrator.


2021 ◽  
Vol 23 (06) ◽  
pp. 1011-1018
Author(s):  
Aishrith P Rao ◽  
Raghavendra J C ◽  
Dr. Sowmyarani C N ◽  
Dr. Padmashree T ◽  
...  

With the advancement of technology and the large volume of data produced, processed, and stored, it is becoming increasingly important to maintain the quality of data in a cost-effective and productive manner. The most important aspects of Big Data (BD) are storage, processing, privacy, and analytics. The Big Data group has identified quality as a critical aspect of its maturity. Nonetheless, it is a critical approach that should be adopted early in the lifecycle and gradually extended to other primary processes. Companies rely heavily on, and derive profits from, the huge amounts of data they collect. When the consistency of that data deteriorates, the ramifications are uncertain and may result in completely undesirable conclusions. In the context of BD, determining data quality is difficult, but it is essential that we uphold data quality before we proceed with any analytics. In this paper, we investigate data quality during the stages of data gathering, preprocessing, data repository, and evaluation/analysis of BD processing. Related solutions are also suggested based on the elaboration and review of the proposed problems.


Information ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 175 ◽  
Author(s):  
Tibor Koltay

This paper focuses on the characteristics of research data quality, and aims to cover the most important issues related to it, giving particular attention to its attributes and to data governance. The corporate world’s considerable interest in the quality of data is obvious in several thoughts and issues reported in business-related publications, even if there are apparent differences between values and approaches to data in corporate and in academic (research) environments. The paper also takes into consideration that addressing data quality would be unimaginable without considering big data.

