match probability
Recently Published Documents


Total documents: 51 (five years: 14)
H-index: 10 (five years: 1)

2021 ◽  
Author(s):  
Martin Courtois ◽  
Alexandre Filiot ◽  
Gregoire Ficheur

The use of international laboratory terminologies within hospital information systems is required to conduct data reuse analyses across inter-hospital databases. While most terminology matching techniques that support semantic interoperability are language-based, another strategy is distribution matching, which matches terms based on the statistical similarity of their value distributions. In this work, our objective is to design and assess a structured framework to perform distribution matching on concepts described by continuous variables. We propose a framework that combines distribution matching and machine learning techniques. Using a training sample consisting of correct and incorrect correspondences between different terminologies, a match probability score is built. For each term, the best candidates are returned and sorted in decreasing order of the probability given by the model. When 101 terms from Lille University Hospital were searched among the same list of concepts in MIMIC-III, the model returned the correct match in the top 5 candidates for 96 of them (95%). Using this open-source framework with a top-k suggestion system could make expert validation of terminology alignment easier.
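As a rough illustration of distribution matching with top-k suggestions, the sketch below ranks candidate terms by how close their value distributions are to those of a local term. It is not the authors' framework: the abstract describes a trained model that outputs a match probability, whereas this sketch substitutes a plain two-sample Kolmogorov-Smirnov distance as the ranking score. All names and the toy data are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def top_k_candidates(source_values, candidates, k=5):
    """Return the k candidate names whose distributions are closest to the source term."""
    ranked = sorted(candidates, key=lambda name: ks_statistic(source_values, candidates[name]))
    return ranked[:k]


# Hypothetical toy data: one local term and three candidate concepts in a target terminology.
local_sodium = [138, 140, 141, 139, 142, 137, 140]
targets = {
    "sodium": [139, 141, 140, 138, 143, 136],
    "potassium": [4.1, 3.9, 4.5, 4.0, 4.3],
    "glucose": [90, 110, 95, 130, 102],
}
ranking = top_k_candidates(local_sodium, targets, k=2)

# Top-5 hit rate reported in the abstract: 96 correct matches out of 101 terms.
hit_rate = 96 / 101
```

A raw distance like this is sensitive to unit differences between sites, which is one reason a learned score over several distribution features may be preferable in practice.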


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Aditi Mishra ◽  
Archana Kumari ◽  
Sumit Choudhary ◽  
Ulhas Gondhali

Abstract Background: Today, when forensic experts talk about quantifiable hereditary traits, they do not depend only on the assessment and examination of DNA profiles but also relate them to population structures. The use of high-throughput molecular marker technologies and advanced statistical and software tools has improved the accuracy of human genetic diversity analysis in many populations with limited time and resources. The present study aimed to investigate the genomic diversity in Gujarat’s Rabari population, using 20 autosomal genetic markers. Numerous bio-statistical software programs are available for the interpretation of population data in forensics. These statistics deal with the measurement of uncertainty and also provide a probability of a random match. The present paper aims to provide a practical guide to the analysis of population genetics data. Three statistical software packages, Cervus, Genepop, and Fstat, are compared and contrasted. The comparison is performed on the profiles obtained from fifty unrelated blood samples of healthy male individuals. DNA was extracted using the organic extraction method, and 20 autosomal STR loci were amplified using the PowerPlex 21 kit (Promega, Madison, WI, USA) and detected on a 3100 Genetic Analyser (Life Technologies Corporation, Carlsbad, CA, USA). Results: A total of 170 alleles were observed in the Rabari tribe of Gujarat, and allele frequencies ranged from 0.010 to 0.480. The highest allele frequency detected was 0.480 for allele 9 at locus TH01. Based on heterozygosity and the polymorphism information content, FGA may be considered the most informative marker. Both the combined power of discrimination (CPD) and the combined power of exclusion (CPE) for the 20 analyzed loci were higher than 0.999999. The combined match probability (CPM) for all 20 loci was 2.5 × 10⁻²².
Conclusions: The results show that the 20 STR loci are highly polymorphic and discriminating in the Gujarat population and could be used for forensic practice and population genetics studies. Of the three packages compared, Fstat proved the better software for analysing the demographic structure of a specific population or set of populations.


Author(s):  
Lirieka Meintjes-Van der Walt ◽  
Priviledge Dhliwayo

The sufficiency of DNA evidence alone, with regard to convicting accused persons, has been interrogated and challenged in criminal cases. The availability of offender databases and the increasing sophistication of crime scene recovery of evidence have resulted in a new type of prosecution in which the State's case focuses on match statistics to explain the significance of a match between the accused's DNA profile and the crime-scene evidence. A number of such cases have raised critical jurisprudential questions about the proper role of probabilistic evidence, and the misapprehension of match statistics by courts. This article, with reference to selected cases from specific jurisdictions, investigates the issue of DNA evidence as the exclusive basis for conviction. It discusses important factors that can influence the reliability of basing a conviction on DNA evidence alone, such as primary, secondary and tertiary transfer, contamination, cold hits and match probability.


Author(s):  
Maan Hasan Salih ◽  
Akeel Hussain Ali Al-Assie ◽  
Majeed Arsheed Sabbah

Short tandem repeats (STRs) are regarded as among the most polymorphic loci in human DNA. STRs are therefore applicable to many fields of genetics, including forensics, population genetics and anthropological studies. The main aim of this research is to evaluate autosomal STRs in Tikrit city, Iraq, in order to expand the human genetics database and support forensic genetic analysis. The DNA data were obtained from 306 unrelated volunteers from the native population of Tikrit, Iraq, using 15 autosomal STR loci. The study determined the allele frequencies in the Tikrit population and compared them with other Iraqi populations as well as with populations in the Middle East, Africa, and Europe. The highest heterozygosity was observed at the D8S1179 and TH01 loci (0.797), and the lowest at CSF1PO (0.48). Departure from Hardy-Weinberg equilibrium (HWE) was recorded at only 3 of the 15 STR loci analyzed (p<0.003). The Combined Match Probability (CMP) for the 15 autosomal STRs was 1 in 7.89208×10⁻¹⁹ and the Combined Discrimination Power (CDP) was 0.9999999997. The discrimination power (DP) was especially high at D2S1338, D18S51, D19S433 and D21S11. In the dendrogram, the Tikrit population clustered with other populations, likely reflecting historical and geographical factors. The D2S1338, D18S51, D19S433 and D21S11 markers were identified as suitable for forensic genetic analysis in the Tikrit population. The 15 STR markers also provide information for studying genetic distances between the Tikrit population and the other populations included in the comparison.
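For readers unfamiliar with the per-locus statistics named above, the following sketch shows how expected heterozygosity, match probability and discrimination power can be derived from allele frequencies under Hardy-Weinberg assumptions. The allele frequencies are hypothetical and are not taken from the study; published studies often compute these values from observed genotype counts instead.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2): chance that two randomly drawn alleles differ."""
    return 1.0 - sum(p * p for p in freqs)


def match_probability(freqs):
    """Chance that two unrelated people share a genotype at one locus, assuming HWE.

    Sums squared genotype frequencies: p_i^2 for homozygotes, 2*p_i*p_j for heterozygotes.
    """
    homozygotes = sum((p * p) ** 2 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


# Hypothetical allele frequencies at a single four-allele locus.
freqs = [0.4, 0.3, 0.2, 0.1]
he = expected_heterozygosity(freqs)   # about 0.70
pm = match_probability(freqs)         # about 0.1446
pd = 1.0 - pm                         # discrimination power, about 0.8554
```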


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Noora R. Al-Snan ◽  
Sabah Shabbir ◽  
Sahar S. Baksh ◽  
Mashael AlQerainees ◽  
Mahdi Haidar ◽  
...  

Abstract This paper evaluates the forensic utility of 30 insertion-deletion polymorphism (indel) markers in a sample from the Bahraini population using the Qiagen Investigator DIPplex Kit. Allele frequencies and forensic statistics of the 30 indels were investigated in 293 unrelated individuals from different governorates of the Kingdom of Bahrain. None of the markers showed significant deviation from Hardy-Weinberg equilibrium except for the HLD88 locus, and no linkage disequilibrium was detected between any pair of indel loci, suggesting that these markers are independent and that their allele frequencies can be used to calculate match probabilities in the Bahraini population. The high combined power of discrimination (CPD = 0.9999999999998110) and the low combined match probability (CPM = 1.89 × 10⁻¹³) indicate that these markers are informative and can be successfully used for human identification in forensic and paternity casework. Genetic distances and relatedness were displayed through multidimensional plotting and a phylogenetic tree using various populations in the region. Our study showed that the Bahraini population clustered with neighboring countries such as Kuwait and the Emirates, which indicates that these geographically close regions share similar allele frequencies and are more genetically related to each other than to the other reference populations studied.
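Independence between loci is what justifies multiplying per-locus match probabilities (the product rule), and the combined power of discrimination is then one minus that product. The sketch below checks that the two figures reported in the abstract agree under this relation. The per-locus values in the toy example are hypothetical; `Decimal` is used only to keep the arithmetic exact at this many digits.

```python
from decimal import Decimal, getcontext
from math import prod

getcontext().prec = 30  # enough digits to represent values this close to 1 exactly


def combined_match_probability(per_locus_pm):
    """Product rule: valid only if the loci are independent (no linkage disequilibrium)."""
    return prod(per_locus_pm)


# Consistency check on the reported figures: CPD should equal 1 - CPM.
cpm_reported = Decimal("1.89e-13")
cpd_from_cpm = Decimal(1) - cpm_reported   # 0.999999999999811

# Hypothetical per-locus match probabilities for three loci.
toy_cpm = combined_match_probability([Decimal("0.1"), Decimal("0.2"), Decimal("0.05")])
```

The reported CPD of 0.9999999999998110 matches 1 − CPM exactly.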


2021 ◽  
Author(s):  
Eric Agola Lelo ◽  
Johnson Kinyua ◽  
Eva Kalamera Aluvaala ◽  
William Chege Kiarie ◽  
Carlo Wamaitha Chege

Abstract Samples from 180 unrelated persons of Kenyan descent collected at a DNA testing facility in Nairobi were genotyped using the PowerPlex21® STR kit to generate the first indigenous 20 autosomal STR allele frequency table for use in forensic analysis of human DNA in Kenya. Informed consent for use of the samples for this study was obtained, with de-identification procedures employed in accordance with recommendations from the Scientific and Ethics Review Unit at the Kenya Medical Research Institute (KEMRI). The markers amplified for the generation of the allele frequency table were D3S1358, D13S317, Penta E, D16S539, D18S51, D2S1338, CSF1PO, Penta D, TH01, vWA, D21S11, D7S820, TPOX, D8S1179, FGA, D2S1338, D5S818, D6S1043, D12S391, and D19S433. A high degree of gene diversity was observed in this population, with average PIC and heterozygosity values of 0.799 and 0.831 respectively across the 20 loci. Cumulatively, 182 alleles were detected in the Kenyan population analysed across the 20 STR loci. The lowest allele frequency was 0.003, corresponding to a single occurrence of an allele, while the highest allele frequency was 0.36 for allele 16 at marker D3S1358. Polymorphism information content (PIC) results ranged from 0.69 to 0.90, with Penta E returning the highest score. The high PIC scores show that the additional markers add informative value to this data set. The power of discrimination ranged from 89% to 97%, with a combined power of discrimination of 99.99%.
The combined match probability, a population-genetics measure of the chance that an unrelated person picked at random from the general population has a genotype identical to that derived from the reference sample or the evidence, was 4.34 × 10⁻²⁶. The dataset generated in the present study has been demonstrated to be highly valuable in discriminating between two individual genotypes and greatly amplifies the power of discrimination available to Kenyan forensic DNA testing facilities. The loci included in this dataset comprise the commonly used loci in the US, Europe and Asia, and the development of this allele frequency table increases the data sharing possibilities between local and international forensic DNA testing facilities.
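Two of the figures above can be reproduced with short calculations. The sketch below assumes the standard polymorphism information content formula (Botstein et al.) and diploid samples, so 180 individuals contribute 360 allele copies per locus. The four-allele locus used for the PIC example is hypothetical.

```python
from itertools import combinations


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    sum_squares = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_squares - cross_term


def singleton_allele_frequency(n_individuals):
    """Frequency of an allele seen exactly once among 2n allele copies (diploid samples)."""
    return 1 / (2 * n_individuals)


# One occurrence among 180 people: 1/360, which rounds to the reported minimum of 0.003.
min_freq = singleton_allele_frequency(180)

# Hypothetical locus with four equally frequent alleles.
pic_equal = polymorphism_information_content([0.25] * 4)   # 0.703125
```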


2021 ◽  
pp. injuryprev-2020-044101
Author(s):  
Allison E Curry ◽  
Melissa R Pfeiffer ◽  
Kristina B Metzger ◽  
Meghan E Carey ◽  
Lawrence J Cook

Objective: Our objective was to describe the development of the New Jersey Safety and Health Outcomes (NJ-SHO) data warehouse, a unique and comprehensive data source that integrates state-wide administrative databases in NJ to enable the field of injury prevention to address critical, high-priority research questions. Methods: We undertook an iterative process to link data from six state-wide administrative databases from NJ for the period of 2004 through 2018: (1) driver licensing histories, (2) traffic-related citations and suspensions, (3) police-reported crashes, (4) birth certificates, (5) death certificates and (6) hospital discharges (emergency department, inpatient and outpatient). We also linked to electronic health records of all NJ patients of the Children’s Hospital of Philadelphia network, census tract-level indicators (using geocoded residential addresses) and state-wide Medicaid/Medicare data. We used several metrics to evaluate the quality of the linkage process. Results: After the linkage process was complete, the NJ-SHO data warehouse included linked records for 22.3 million distinct individuals. Our evaluation of this linkage suggests that the linkage was of high quality: (1) the median match probability (the likelihood of a match being true) among all accepted pairs was 0.9999 (IQR: 0.9999–1.0000); and (2) the false match rate (the proportion of accepted pairs that were false matches) was 0.0063. Conclusions: The resulting NJ-SHO warehouse is one of the most comprehensive and rich longitudinal sources of injury data to date. The warehouse has already been used to support numerous studies and is primed to support a host of rigorous studies in the field of injury prevention.
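The abstract does not say how these two linkage metrics were computed. As a minimal sketch, assume each accepted record pair carries a model-estimated match probability p, and estimate the false match rate as the mean of 1 − p over accepted pairs. This is one common approach and not necessarily the authors' method. The function name and toy probabilities are hypothetical.

```python
from statistics import mean, median, quantiles


def linkage_quality(match_probs):
    """Summarise accepted pairs: median and IQR of match probability, expected false match rate."""
    q1, _, q3 = quantiles(match_probs, n=4, method="inclusive")
    return {
        "median": median(match_probs),
        "iqr": (q1, q3),
        # Each pair is a false match with probability 1 - p, so the mean is the expected rate.
        "expected_false_match_rate": mean([1 - p for p in match_probs]),
    }


# Hypothetical accepted pairs: mostly near-certain matches plus a small uncertain tail.
accepted = [0.9999] * 8 + [0.97, 0.95]
summary = linkage_quality(accepted)
```

The toy data show why a median near 1 and a false match rate well above 1 − median are compatible: the rate is driven by the low-probability tail, which the median ignores.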


Genes ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 221
Author(s):  
Michele Ragazzo ◽  
Giulio Puleri ◽  
Valeria Errichiello ◽  
Laura Manzo ◽  
Laura Luzzi ◽  
...  

A custom plate of OpenArray™ technology was evaluated to test 60 single-nucleotide polymorphisms (SNPs) validated for the prediction of eye color, hair color, and skin pigmentation, and for personal identification. The SNPs were selected from already validated subsets (HIrisPlex-S, Precision ID Identity SNP Panel, and ForenSeq DNA Signature Prep Kit). The concordance rate and call rate for every SNP were calculated by analyzing 314 sequenced DNA samples. The sensitivity of the assay was assessed by preparing a dilution series of 10.0, 5.0, 1.0, and 0.5 ng. The OpenArray™ platform obtained an average call rate of 96.9% and a concordance rate near 99.8%. Sensitivity testing performed on serial dilutions demonstrated that a sample with 0.5 ng of total input DNA can be correctly typed. The profiles of the 19 SNPs selected for human identification reached a random match probability (RMP) of, on average, 10⁻⁸. An analysis of 21 examples of biological evidence from 8 individuals, which generated single short tandem repeat profiles during the routine workflow, demonstrated the applicability of this technology in real cases. Seventeen samples were correctly typed, revealing a call rate higher than 90%. Accordingly, the phenotype prediction revealed the same accuracy described in the corresponding validation data. Despite the reduced discrimination power of this system compared to STR-based kits, the OpenArray™ System can be used to exclude suspects and prioritize samples for downstream analyses, providing well-established information about the prediction of eye color, hair color, and skin pigmentation. More studies will be needed for further validation of this technology and to consider the opportunity to implement this custom array with more SNPs to obtain a lower RMP and to include markers for studies of ancestry and lineage.
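The order of magnitude of the reported RMP can be sanity-checked with a small calculation. The sketch below assumes biallelic SNPs in Hardy-Weinberg equilibrium that are mutually independent, and uses a hypothetical allele frequency of 0.5 at every SNP, which is the most discriminating case for a biallelic marker. The panel's actual frequencies are not given in the abstract.

```python
def snp_match_probability(p):
    """Per-SNP match probability under HWE: sum of squared genotype frequencies."""
    q = 1.0 - p
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2


def random_match_probability(allele_freqs):
    """Product rule across SNPs, assuming the markers are independent."""
    rmp = 1.0
    for p in allele_freqs:
        rmp *= snp_match_probability(p)
    return rmp


# Hypothetical best case: 19 SNPs, each with allele frequency 0.5 (0.375 per SNP).
best_case_rmp = random_match_probability([0.5] * 19)
```

With 19 perfectly balanced SNPs the RMP is on the order of 10⁻⁸, in line with the average reported above, which also shows why adding SNPs is the route to a lower RMP.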


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mahdi Haidar ◽  
Fatimah A. Abbas ◽  
Hussain Alsaleh ◽  
Penelope R. Haddrill

Abstract This study evaluates the forensic utility of 23 autosomal short tandem repeat markers in 400 samples from the Kuwaiti population, of which four markers (D10S1248, D22S1045, D2S441 and SE33) are reported for the first time for Kuwait. All the markers were shown to exhibit no deviation from Hardy–Weinberg equilibrium, nor any linkage disequilibrium between and within loci, indicating that these loci are inherited independently and that their allele frequencies can be used to estimate match probabilities in the Kuwaiti population. The low combined match probability of 7.37 × 10⁻³⁰ and the high paternity indices generated by these loci demonstrate the usefulness of the PowerPlex Fusion 6C kit for human identification in this population and for strengthening the power of paternity testing. Off-ladder alleles were seen at several loci, and these were identified by examining their underlying nucleotide sequences. Principal component analysis (PCA) and STRUCTURE showed no genetic structure within the Kuwaiti population. However, PCA revealed a correlation between geographic and genetic distance. Finally, phylogenetic trees demonstrated a close relationship between Kuwaitis and Middle Easterners at a global level, and a recent common ancestry for Kuwait with its northern neighbours of Iraq and Iran at a regional level.


2020 ◽  
pp. 15-33
Author(s):  
Henry Erlich

Chapter 1 reviews the history of DNA analysis for individual identification in criminal cases. The principles underlying Restriction Fragment Length Polymorphism (RFLP) and Polymerase Chain Reaction (PCR) and their application in the first cases in the US and the UK in the mid-1980s are discussed. The differences between these two DNA technologies are described, and the evolution of new PCR-based genotyping methods for analyzing length and sequence polymorphisms is reviewed. The first DNA exoneration, which used the PCR-based HLA-DQ alpha test, is discussed in the context of exclusionary and inclusionary DNA results. The statistical issues involved in interpreting a match (inclusion) between the genetic profile of the evidence and the reference samples by calculating the Random Match Probability metric are discussed. Finally, the contentious history of the debate about the admissibility of DNA results in the courtroom, known as the “DNA Wars”, is reviewed.

