Multifile Partitioning for Record Linkage and Duplicate Detection

Author(s):  
Serge Aleshin-Guendel ◽  
Mauricio Sadinle
1983 ◽  
Vol 22 (02) ◽  
pp. 77-82 ◽  
Author(s):  
M. P. Mi ◽  
J. T. Kagawa ◽  
M. E. Earle

An operational approach to computerized record linkage has been developed based on the concept of probability of chance match in two groups of records brought together for comparison. Tolerance levels can be readily derived from these records for decision-making in accepting or rejecting a linked pair. This approach is especially suitable for iteration when linked pairs are removed in successive cycles. An application of linkage for death clearance of the 1942 resident population of 437,967 registered in Hawaii during a 38-year period from 1942 to 1979 is presented. The reliability of linkage and rate of failure were analyzed.


1979 ◽  
Vol 18 (02) ◽  
pp. 89-97 ◽  
Author(s):  
Martha E. Smith ◽  
H. B. Newcombe

Empirical tests of the application of computer record linkage methods versus the use of routine clerical searching, for bringing together various vital and ill-health records, have shown that the success rate for the computer operation was higher (98.3 versus 96.7 per cent) and the proportion of false linkages very much lower (0.1 versus 2.3 per cent). The rate at which the ill-health records were processed by the computer was approximately 14,000 per minute of central processor time, representing a cost of half a cent apiece. Factors affecting the speed, accuracy and cost of computerized record linkage are discussed.


1969 ◽  
Vol 08 (01) ◽  
pp. 07-11 ◽  
Author(s):  
H. B. Newcombe

Methods are described for deriving personal and family histories of birth, marriage, procreation, ill health and death, for large populations, from existing civil registrations of vital events and the routine records of ill health. Computers have been used to group together and »link« the separately derived records pertaining to successive events in the lives of the same individuals and families, rapidly and on a large scale. Most of the records employed are already available as machine readable punchcards and magnetic tapes, for statistical and administrative purposes, and only minor modifications have been made to the manner in which these are produced. As applied to the population of the Canadian province of British Columbia (currently about 2 million people) these methods have already yielded substantial information on the risks of disease: a) in the population, b) in relation to various parental characteristics, and c) as correlated with previous occurrences in the family histories.


1997 ◽  
Vol 9 (1-3) ◽  
pp. 122-133 ◽  
Author(s):  
Peter Tilley ◽  
Christopher French

Should record linkage for nineteenth century census records be based on multiple pass algorithms using list unique records or are there more effective ways of establishing true matches? This paper considers both multiple pass algorithms and some alternatives, and finds that the alternatives can indeed be more effective.
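As a rough illustration of what a multiple pass algorithm based on list unique records might look like, the Python sketch below links two lists in successive passes. It rests on assumptions not taken from the paper: a record counts as "list unique" when its key occurs exactly once among the unlinked records of its own list, each pass uses a looser key than the one before, and linked pairs are removed before the next pass. The function names, field names and sample records are hypothetical.

```python
from collections import defaultdict


def unique_by_key(records, key_fn):
    """Map each key to a record id, keeping only keys that occur exactly once."""
    buckets = defaultdict(list)
    for rec_id, rec in records.items():
        buckets[key_fn(rec)].append(rec_id)
    return {key: ids[0] for key, ids in buckets.items() if len(ids) == 1}


def multi_pass_link(list_a, list_b, key_fns):
    """Link two record lists in successive passes (illustrative sketch only).

    In each pass a pair is linked when its key is unique in both lists
    among the records not yet linked; linked records are then removed.
    """
    remaining_a = dict(list_a)
    remaining_b = dict(list_b)
    links = []
    for pass_no, key_fn in enumerate(key_fns, start=1):
        unique_a = unique_by_key(remaining_a, key_fn)
        unique_b = unique_by_key(remaining_b, key_fn)
        # Sort the shared keys so the output order is deterministic.
        for key in sorted(unique_a.keys() & unique_b.keys()):
            a_id, b_id = unique_a[key], unique_b[key]
            links.append((a_id, b_id, pass_no))
            del remaining_a[a_id]
            del remaining_b[b_id]
    return links


# Hypothetical sample records, invented for this example.
census_first = {
    "a1": {"surname": "Smith", "forename": "John", "birth_year": 1820},
    "a2": {"surname": "Smith", "forename": "Jane", "birth_year": 1825},
    "a3": {"surname": "Brown", "forename": "Ann", "birth_year": 1830},
}
census_second = {
    "b1": {"surname": "Smith", "forename": "John", "birth_year": 1820},
    "b2": {"surname": "Smith", "forename": "Jane", "birth_year": 1826},
    "b3": {"surname": "Brown", "forename": "Anne", "birth_year": 1830},
}


def strict_key(rec):
    """Pass 1: exact surname, forename and birth year."""
    return (rec["surname"], rec["forename"], rec["birth_year"])


def loose_key(rec):
    """Pass 2: surname and forename initial only."""
    return (rec["surname"], rec["forename"][0])


links = multi_pass_link(census_first, census_second, [strict_key, loose_key])
print(links)
```

In this toy data the strict pass links only the exact John Smith pair. Removing that pair is what makes the key ("Smith", "J") unique in the second pass, which shows why the order of passes and the removal step both affect the result.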


1997 ◽  
Vol 9 (1-3) ◽  
pp. 150-155
Author(s):  
Peter Adman

In a recent issue of this journal (Vol. 8, no. 2) the paper ‘Record linkage theory and practice: an experiment in the application of multiple pass linkage algorithms’ by Charles Harvey, Edmund Green and Penelope J. Corfield described the advances the authors have made on their previously published work. By using a multiple pass methodology they increased the linkage rate between two successive polls (1784 and 1788) from one-fifth to nearly three-fifths of the voters in the parliamentary elections for the City of Westminster. This critique examines the validity of their claims with regard to the confidence levels attained.

