A Case Study on Data Quality, Privacy, and Evaluating the Outcome of Entity Resolution Processes

Author(s):  
Pei Wang ◽  
Daniel Pullen ◽  
Fan Liu ◽  
William C. Decker ◽  
Ningning Wu ◽  
...  

This paper presents ongoing research conducted through collaboration between the University of Arkansas at Little Rock and the Arkansas Department of Education to develop an entity resolution and identity management system. The process follows a multi-phase approach consisting of data-quality analysis, selection of entity-identity attributes for entity resolution, definition of a rule set using the open-source entity-resolution system OYSTER, and use of an entropy-based approach to identify potential false positives and false negatives. The research is the first known of its kind to evaluate privacy-enhancing, entity-resolution rule sets in a state education agency.
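The abstract does not detail how the entropy approach works. A common way to use entropy for flagging potential false positives is to measure the spread of a quasi-identifying attribute (here, date of birth, chosen purely for illustration) within each resolved cluster: a cluster mixing several distinct values may contain records that do not belong together. A minimal sketch under these assumptions:

```python
import math
from collections import Counter, defaultdict

def entropy(values):
    """Shannon entropy (bits) of the value distribution within a cluster."""
    counts = Counter(values)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def flag_clusters(records, threshold=0.5):
    """Flag clusters whose date-of-birth entropy exceeds the threshold
    as potential false positives (over-merged identities)."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec["cluster_id"]].append(rec["dob"])
    return [cid for cid, dobs in clusters.items()
            if len(dobs) > 1 and entropy(dobs) > threshold]

# Example: cluster "c2" mixes two different birth dates and is flagged.
records = [
    {"cluster_id": "c1", "dob": "2001-03-14"},
    {"cluster_id": "c1", "dob": "2001-03-14"},
    {"cluster_id": "c2", "dob": "1999-07-02"},
    {"cluster_id": "c2", "dob": "2000-01-30"},
]
print(flag_clusters(records))  # ['c2']
```

The attribute name, field layout, and threshold are hypothetical; the reverse check (one attribute value scattered across many clusters) would analogously suggest false negatives.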

Author(s):  
William Decker ◽  
Fan Liu ◽  
John Talburt ◽  
Pei Wang ◽  
Ningning Wu

This chapter presents ongoing research conducted through collaboration between the University of Arkansas at Little Rock and the Arkansas Department of Education to develop an entity resolution and identity management system. The process follows a multi-phase approach consisting of data-quality analysis, selection of entity-identity attributes for entity resolution, development of a truth set, and implementation and benchmarking of an entity-resolution rule set using the open-source entity-resolution system OYSTER. The research is the first known of its kind to evaluate privacy-enhancing, entity-resolution rule sets in a state education agency.
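The chapter's benchmarking step is not spelled out in the abstract. One standard way to benchmark an entity-resolution rule set against a truth set is pairwise precision and recall: compare the record pairs the system links against the pairs the truth set links. A minimal sketch, with hypothetical record ids and cluster labels:

```python
from itertools import combinations

def linked_pairs(assignment):
    """All unordered record pairs placed in the same cluster,
    given a mapping of record id -> cluster id."""
    clusters = {}
    for rec, cid in assignment.items():
        clusters.setdefault(cid, []).append(rec)
    pairs = set()
    for members in clusters.values():
        pairs.update(frozenset(p) for p in combinations(sorted(members), 2))
    return pairs

def benchmark(system, truth):
    """Pairwise precision and recall of a rule set's output vs. a truth set."""
    sys_pairs, true_pairs = linked_pairs(system), linked_pairs(truth)
    tp = len(sys_pairs & true_pairs)
    precision = tp / len(sys_pairs) if sys_pairs else 1.0
    recall = tp / len(true_pairs) if true_pairs else 1.0
    return precision, recall

truth = {"r1": "A", "r2": "A", "r3": "B"}
system = {"r1": "A", "r2": "A", "r3": "A"}  # over-merged: r3 linked wrongly
print(benchmark(system, truth))  # precision 1/3, recall 1.0
```

This is a generic illustration of truth-set benchmarking, not the specific metrics used in the study.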


Laws ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 38
Author(s):  
Michael Rozalski ◽  
Mitchell L. Yell ◽  
Jacob Warner

In 1975, the Education for All Handicapped Children Act (renamed the Individuals with Disabilities Education Act in 1990) established the essential obligation of special education law, which is to develop a student’s individualized special education program that enables them to receive a free appropriate public education (FAPE). FAPE was defined in the federal law as special education and related services that: (a) are provided at public expense, (b) meet the standards of the state education agency, (c) include preschool, elementary, or secondary education, and (d) are provided in conformity with a student’s individualized education program (IEP). Thus, the IEP is the blueprint of an individual student’s FAPE. The importance of FAPE has been shown in the number of disputes that have arisen over the issue. In fact, 85% to 90% of all special education litigation involves disagreements over the FAPE that students receive. FAPE issues boil down to the process and content of a student’s IEP. In this article, we differentiate procedural (process) and substantive (content) violations and provide specific guidance on how to avoid both process and content errors when drafting and implementing students’ IEPs.


2008 ◽  
pp. 3067-3084
Author(s):  
John Talburt ◽  
Richard Wang ◽  
Kimberly Hess ◽  
Emily Kuo

This chapter introduces abstract algebra as a means of understanding and creating data quality metrics for entity resolution, the process in which records determined to represent the same real-world entity are successively located and merged. Entity resolution is a particular form of data mining that is foundational to a number of applications in both industry and government. Examples include commercial customer recognition systems and information sharing on “persons of interest” across federal intelligence agencies. Despite the importance of these applications, most of the data quality literature focuses on measuring the intrinsic quality of individual records rather than the quality of record grouping or integration. In this chapter, the authors describe current research into the creation and validation of quality metrics for entity resolution, primarily in the context of customer recognition systems. The approach is based on an algebraic view of the system as creating a partition of a set of entity records based on the indicative information for the entities in question. In this view, the relative quality of entity identification between two systems can be measured in terms of the similarity between the partitions they produce. The authors discuss the difficulty of applying statistical cluster analysis to this problem when the datasets are large and propose an alternative index suitable for these situations. They also report some preliminary experimental results and outline areas and approaches for further research.
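The abstract does not name the proposed index, so the sketch below shows one simple overlap-based partition-similarity measure of the kind described (an assumption, not necessarily the index from the chapter): with |A| and |B| the number of clusters in each partition and |V| the number of nonempty pairwise intersections, the ratio sqrt(|A|·|B|)/|V| equals 1 exactly when the two partitions agree and decreases as they diverge, and it is cheap to compute for large datasets:

```python
import math

def overlap_index(part_a, part_b):
    """Similarity between two partitions of the same record set,
    each given as a list of clusters (sets of record ids).
    Returns sqrt(|A|*|B|)/|V|, where |V| counts nonempty
    pairwise intersections; 1.0 means identical partitions."""
    v = sum(1 for a in part_a for b in part_b if a & b)
    return math.sqrt(len(part_a) * len(part_b)) / v

identical = [{1, 2}, {3}]
split = [{1}, {2}, {3}]
print(overlap_index(identical, identical))  # 1.0
print(overlap_index(identical, split))      # sqrt(6)/3 ≈ 0.816
```

Unlike pairwise cluster-comparison statistics, this index needs only cluster counts and intersection counts, which is what makes it attractive when pairwise enumeration over large datasets is infeasible.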


Author(s):  
B. C. Scheffers ◽  
E. C. C. Wildeboer Schut ◽  
J. A. C. Meekes ◽  
H. L. H. Cox

Algorithms ◽  
2020 ◽  
Vol 13 (5) ◽  
pp. 107 ◽  
Author(s):  
Otmane Azeroual ◽  
Włodzimierz Lewoniewski

The quality assurance of publication data in collaborative knowledge bases and in current research information systems (CRIS) is becoming increasingly relevant as freely available spatial information is used in different application scenarios. When integrating this data into a CRIS, it is necessary to be able to recognize and assess its quality. Only then is it possible to compile from the available data a result that fulfills its purpose for the user, namely to deliver reliable data and information. This paper discusses the quality problems of source metadata in Wikipedia and CRIS. Based on real data from over 40 million Wikipedia articles in various languages, we performed a preliminary quality analysis of the metadata of scientific publications using a data quality tool. To date, no data quality measurements have been programmed in Python to assess the quality of metadata from scientific publications in Wikipedia and CRIS. With this in mind, we implemented the methods and algorithms as code, presented in this paper as pseudocode, to measure quality along objective data-quality dimensions such as completeness, correctness, consistency, and timeliness. The measurements were packaged as a macro service so that users can apply the program code to their scientific-publication metadata, allowing management to rely on high-quality data when making decisions.
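Of the dimensions listed, completeness is the most straightforward to operationalize: the fraction of required metadata fields that are present and non-empty in a record. A minimal Python sketch (the field names are hypothetical, not the paper's actual schema):

```python
# Hypothetical required fields for a publication metadata record.
REQUIRED = ("title", "authors", "year", "doi")

def completeness(record):
    """Fraction of required metadata fields that are present and non-empty."""
    filled = sum(1 for field in REQUIRED if record.get(field))
    return filled / len(REQUIRED)

pub = {"title": "On Data Quality", "authors": ["A. Author"],
       "year": 2020, "doi": ""}  # missing DOI
print(completeness(pub))  # 0.75
```

Correctness, consistency, and timeliness would need reference data, cross-field rules, and timestamps respectively, so they cannot be sketched this compactly.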

