Data Quality Issues in Data Migration

Author(s):  
Nurhidayah Muhamad Zahari ◽  
Wan Ya Wan Hussin ◽  
Mohd Yunus Mohd Yussof ◽  
Fauzi Mohd Saman
Author(s):  
Christopher D O’Connor ◽  
John Ng ◽  
Dallas Hill ◽  
Tyler Frederick

Policing is increasingly being shaped by data collection and analysis. However, we still know little about the quality of the data police services acquire and utilize. Drawing on a survey of analysts from across Canada, this article examines several data collection, analysis, and quality issues. We argue that as we move towards an era of big data policing it is imperative that police services pay more attention to the quality of the data they collect. We conclude by discussing the implications of ignoring data quality issues and the need to develop a more robust research culture in policing.


2021 ◽  
Author(s):  
Susan Walsh

Dirty data is a problem that costs businesses thousands, if not millions, every year. In organisations large and small across the globe you will hear talk of data quality issues. What you will rarely hear about is the consequences or how to fix it.<br><br><i>Between the Spreadsheets: Classifying and Fixing Dirty Data</i> draws on classification expert Susan Walsh's decade of experience in data classification to present a fool-proof method for cleaning and classifying your data. The book covers everything from the very basics of data classification to normalisation, taxonomies and presents the author's proven <b>COAT</b> methodology, helping ensure an organisation's data is <b>Consistent</b>, <b>Organised</b>, <b>Accurate</b> and <b>Trustworthy</b>. A series of data horror stories outlines what can go wrong in managing data, and if it does, how it can be fixed. <br><br>After reading this book, regardless of your level of experience, not only will you be able to work with your data more efficiently, but you will also understand the impact the work you do with it has, and how it affects the rest of the organisation.<br><br>Written in an engaging and highly practical manner, <i>Between the Spreadsheets</i> gives readers of all levels a deep understanding of the dangers of dirty data and the confidence and skills to work more efficiently and effectively with it.


Author(s):  
Syed Mustafa Ali ◽  
Farah Naureen ◽  
Arif Noor ◽  
Maged Kamel N. Boulos ◽  
Javariya Aamir ◽  
...  

Background Increasingly, healthcare organizations are using technology for the efficient management of data. The aim of this study was to compare the data quality of digital records with the quality of the corresponding paper-based records by using data quality assessment framework. Methodology We conducted a desk review of paper-based and digital records over the study duration from April 2016 to July 2016 at six enrolled TB clinics. We input all data fields of the patient treatment (TB01) card into a spreadsheet-based template to undertake a field-to-field comparison of the shared fields between TB01 and digital data. Findings A total of 117 TB01 cards were prepared at six enrolled sites, whereas just 50% of the records (n=59; 59 out of 117 TB01 cards) were digitized. There were 1,239 comparable data fields, out of which 65% (n=803) were correctly matched between paper based and digital records. However, 35% of the data fields (n=436) had anomalies, either in paper-based records or in digital records. 1.9 data quality issues were calculated per digital patient record, whereas it was 2.1 issues per record for paper-based record. Based on the analysis of valid data quality issues, it was found that there were more data quality issues in paper-based records (n=123) than in digital records (n=110). Conclusion There were fewer data quality issues in digital records as compared to the corresponding paper-based records. Greater use of mobile data capture and continued use of the data quality assessment framework can deliver more meaningful information for decision making.


Author(s):  
Mohammed Ragheb Hakawati ◽  
Yasmin Yacob ◽  
Amiza Amir ◽  
Jabiry M. Mohammed ◽  
Khalid Jamal Jadaa

Extensible Markup Language (XML) is emerging as the primary standard for representing and exchanging data, with more than 60% of the total; XML considered the most dominant document type over the web; nevertheless, their quality is not as expected. XML integrity constraint especially XFD plays an important role in keeping the XML dataset as consistent as possible, but their ability to solve data quality issues is still intangible. The main reason is that old-fashioned data dependencies were basically introduced to maintain the consistency of the schema rather than that of the data. The purpose of this study is to introduce a method for discovering pattern tableaus for XML conditional dependencies to be used for enhancing XML document consistency as a part of data quality improvement phases. The notations of the conditional dependencies as new rules are designed mainly for improving data instance and extended traditional XML dependencies by enforcing pattern tableaus of semantically related constants. Subsequent to this, a set of minimal approximate conditional dependencies (XCFD, XCIND) is discovered and learned from the XML tree using a set of mining algorithms. The discovered patterns can be used as a Master data in order to detect inconsistencies that don’t respect the majority of the dataset.


2007 ◽  
Vol 78 (2) ◽  
pp. 140-151 ◽  
Author(s):  
John H. Giudice ◽  
Kurt J. Haroldson
Keyword(s):  

AI Magazine ◽  
2010 ◽  
Vol 31 (1) ◽  
pp. 65 ◽  
Author(s):  
Clint R. Bidlack ◽  
Michael P Wellman

Recent advances in enterprise web-based software have created a need for sophisticated yet user-friendly data quality solutions. A new category of data quality solutions are discussed that fill this need using intelligent matching and retrieval algorithms. Solutions are focused on customer and sales data and include real-time inexact search, batch processing, and data migration. Users are empowered to maintain higher quality data resulting in more efficient sales and marketing operations. Sales managers spend more time with customers and less time managing data.


Sign in / Sign up

Export Citation Format

Share Document