A Case Study on Data Quality, Privacy, and Entity Resolution

Author(s):  
William Decker ◽  
Fan Liu ◽  
John Talburt ◽  
Pei Wang ◽  
Ningning Wu

This chapter presents ongoing research, conducted in collaboration between the University of Arkansas at Little Rock and the Arkansas Department of Education, to develop an entity resolution and identity management system. The process follows a multi-phase approach: data-quality analysis, selection of entity-identity attributes for entity resolution, development of a truth set, and implementation and benchmarking of an entity-resolution rule set using the open-source entity-resolution system OYSTER. The research is the first known of its kind to evaluate privacy-enhancing entity-resolution rule sets in a state education agency.
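As a rough illustration of what benchmarking a rule set against a truth set can involve, the sketch below computes pairwise precision and recall. The abstract does not say which metrics the authors used, so the metric choice, the function names, and the record identifiers are all assumptions made for this example; nothing here calls OYSTER itself.

```python
from itertools import combinations


def linked_pairs(clusters):
    """Return every unordered pair of records that share a cluster."""
    pairs = set()
    for records in clusters:
        pairs.update(frozenset(p) for p in combinations(sorted(records), 2))
    return pairs


def pairwise_scores(predicted, truth):
    """Pairwise precision and recall of predicted clusters against a truth set.

    Both arguments are iterables of sets of record identifiers (hypothetical
    layout, chosen only for this sketch).
    """
    pred_pairs = linked_pairs(predicted)
    true_pairs = linked_pairs(truth)
    true_positives = len(pred_pairs & true_pairs)
    # With no pairs to judge, the score is defined as 1.0 by convention here.
    precision = true_positives / len(pred_pairs) if pred_pairs else 1.0
    recall = true_positives / len(true_pairs) if true_pairs else 1.0
    return precision, recall


# Hypothetical example: the rule set over-merges r3 and misses the r3-r4 link.
predicted = [{"r1", "r2", "r3"}, {"r4"}]
truth = [{"r1", "r2"}, {"r3", "r4"}]
precision, recall = pairwise_scores(predicted, truth)
```

In this toy case one of the three predicted links is correct and one of the two true links is found, so precision is 1/3 and recall is 1/2.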

Author(s):  
Pei Wang ◽  
Daniel Pullen ◽  
Fan Liu ◽  
William C. Decker ◽  
Ningning Wu ◽  
...  

This paper presents ongoing research, conducted in collaboration between the University of Arkansas at Little Rock and the Arkansas Department of Education, to develop an entity resolution and identity management system. The process follows a multi-phase approach: data-quality analysis, selection of entity-identity attributes for entity resolution, definition of a rule set using the open-source entity-resolution system OYSTER, and use of an entropy approach to identify potential false positives and false negatives. The research is the first known of its kind to evaluate privacy-enhancing entity-resolution rule sets in a state education agency.
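The abstract does not describe how entropy is applied, so the following is only one plausible reading, offered as a sketch: compute the Shannon entropy of an attribute's values within each resolved cluster and flag clusters whose values disagree too much as potential false positives. The function names, the threshold, and the sample birth dates are hypothetical, and detecting false negatives would need a different comparison (for example, across clusters) that is not shown here.

```python
from collections import Counter
from math import log2


def value_entropy(values):
    """Shannon entropy (in bits) of the distribution of attribute values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def flag_suspect_clusters(clusters, threshold):
    """Return ids of clusters whose attribute values are unusually mixed.

    `clusters` maps a cluster id to the list of values one attribute takes
    across the cluster's records (hypothetical layout for this sketch).
    """
    return [cid for cid, values in clusters.items()
            if value_entropy(values) > threshold]


# Hypothetical data: date-of-birth values per resolved cluster.
clusters = {
    "c1": ["2001-05-03", "2001-05-03"],  # consistent values, entropy 0
    "c2": ["2001-05-03", "1999-11-20"],  # conflicting values, entropy 1 bit
}
suspects = flag_suspect_clusters(clusters, threshold=0.5)
```

A cluster whose records all agree on the attribute has zero entropy, while a cluster split evenly between two values has one bit, which is why only `c2` exceeds the assumed threshold.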


Laws ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 38
Author(s):  
Michael Rozalski ◽  
Mitchell L. Yell ◽  
Jacob Warner

In 1975, the Education for All Handicapped Children Act (renamed the Individuals with Disabilities Education Act in 1990) established the essential obligation of special education law: to develop an individualized special education program that enables a student to receive a free appropriate public education (FAPE). FAPE was defined in the federal law as special education and related services that: (a) are provided at public expense, (b) meet the standards of the state education agency, (c) include preschool, elementary, or secondary education, and (d) are provided in conformity with a student’s individualized education program (IEP). Thus, the IEP is the blueprint of an individual student’s FAPE. The importance of FAPE is shown by the number of disputes that have arisen over the issue. In fact, 85% to 90% of all special education litigation involves disagreements over the FAPE that students receive. FAPE issues boil down to the process and content of a student’s IEP. In this article, we differentiate procedural (process) and substantive (content) violations and provide specific guidance on how to avoid both process and content errors when drafting and implementing students’ IEPs.


2008 ◽  
pp. 3067-3084
Author(s):  
John Talburt ◽  
Richard Wang ◽  
Kimberly Hess ◽  
Emily Kuo

This chapter introduces abstract algebra as a means of understanding and creating data quality metrics for entity resolution, the process in which records determined to represent the same real-world entity are successively located and merged. Entity resolution is a particular form of data mining that is foundational to a number of applications in both industry and government. Examples include commercial customer recognition systems and information sharing on “persons of interest” across federal intelligence agencies. Despite the importance of these applications, most of the data quality literature focuses on measuring the intrinsic quality of individual records rather than the quality of record grouping or integration. In this chapter, the authors describe current research into the creation and validation of quality metrics for entity resolution, primarily in the context of customer recognition systems. The approach is based on an algebraic view of the system as creating a partition of a set of entity records based on the indicative information for the entities in question. In this view, the relative quality of entity identification between two systems can be measured in terms of the similarity between the partitions they produce. The authors discuss the difficulty of applying statistical cluster analysis to this problem when the datasets are large and propose an alternative index suitable for these situations. They also report some preliminary experimental results and outline areas and approaches for further research.
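To make the partition view concrete, the sketch below compares two partitions of the same records by counting how many distinct class overlaps they produce. The abstract does not give the formula for the authors' index, so the measure used here, sqrt(|A| * |B|) / |V| with V the set of non-empty overlaps between classes of A and B, is an assumption chosen because it needs only one pass over the records; the function name and sample data are also hypothetical.

```python
from math import sqrt


def partition_similarity(labels_a, labels_b):
    """Overlap-count similarity between two partitions of the same records.

    Each argument maps a record id to the class (cluster) id assigned by one
    system. The result is 1.0 when the partitions are identical and falls
    toward 0 as the systems group the records more differently.
    """
    if labels_a.keys() != labels_b.keys():
        raise ValueError("both partitions must cover the same records")
    if not labels_a:
        raise ValueError("partitions must not be empty")
    classes_a = set(labels_a.values())
    classes_b = set(labels_b.values())
    # Each distinct (class in A, class in B) pair is one non-empty overlap.
    overlaps = {(labels_a[r], labels_b[r]) for r in labels_a}
    return sqrt(len(classes_a) * len(classes_b)) / len(overlaps)


# Hypothetical output of two systems over the same four records.
system_a = {1: "x", 2: "x", 3: "y", 4: "y"}
system_b = {1: "p", 2: "q", 3: "r", 4: "r"}
score = partition_similarity(system_a, system_b)
```

Here system A forms two classes, system B forms three, and they overlap in three ways, giving sqrt(6) / 3, roughly 0.82. Unlike pair-counting measures, the cost grows linearly with the number of records, which matters for the large datasets the abstract mentions.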


Author(s):  
B. C. Scheffers ◽  
E. C. C. Wildeboer Schut ◽  
J. A. C. Meekes ◽  
H. L. H. Cox
