Linkage disequilibrium connects genetic records of relatives typed with disjoint genomic marker sets

Mapping Intimacies ◽

10.1101/345322 ◽

2018 ◽

Author(s):

Jaehee Kim ◽

Michael D. Edge ◽

Bridget F. B. Algee-Hewitt ◽

Jun Z. Li ◽

Noah A. Rosenberg

Keyword(s):

Linkage Disequilibrium ◽

Genetic Markers ◽

Forensic Genetics ◽

The Other ◽

Dna Profile ◽

Privacy Concerns ◽

Genome Wide ◽

Genomic Marker ◽

Close Relatives ◽

Sib Pairs

Abstract In familial searching in forensic genetics, a query DNA profile is tested against a database to determine whether it represents a relative of a database entrant. We examine the potential for using linkage disequilibrium to identify pairs of profiles as belonging to relatives when the query and database rely on nonoverlapping genetic markers. Considering data on individuals genotyped with both microsatellites used in forensic applications and genome-wide SNPs, we find that ~30-32% of parent–offspring pairs and ~35-36% of sib pairs can be identified from the SNPs of one member of the pair and the microsatellites of the other. The method suggests the possibility of performing familial searches of microsatellite databases using query SNP profiles, or vice versa. It also reveals that privacy concerns arising from computations across multiple databases that share no genetic markers in common entail risks not only for database entrants, but for their close relatives as well.

Linkage disequilibrium matches forensic genetic records to disjoint genomic marker sets

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1619944114 ◽

2017 ◽

Vol 114 (22) ◽

pp. 5671-5676 ◽

Cited By ~ 18

Author(s):

Michael D. Edge ◽

Bridget F. B. Algee-Hewitt ◽

Trevor J. Pemberton ◽

Jun Z. Li ◽

Noah A. Rosenberg

Keyword(s):

Linkage Disequilibrium ◽

Data Aggregation ◽

Short Tandem Repeats ◽

Tandem Repeats ◽

Genome Wide ◽

Genomic Marker ◽

Record Matching ◽

Privacy Risks ◽

Forensic Genetic ◽

Short Tandem

Combining genotypes across datasets is central in facilitating advances in genetics. Data aggregation efforts often face the challenge of record matching—the identification of dataset entries that represent the same individual. We show that records can be matched across genotype datasets that have no shared markers based on linkage disequilibrium between loci appearing in different datasets. Using two datasets for the same 872 people—one with 642,563 genome-wide SNPs and the other with 13 short tandem repeats (STRs) used in forensic applications—we find that 90–98% of forensic STR records can be connected to corresponding SNP records and vice versa. Accuracy increases to 99–100% when ∼30 STRs are used. Our method expands the potential of data aggregation, but it also suggests privacy risks intrinsic in maintenance of databases containing even small numbers of markers—including databases of forensic significance.

Genome-Wide Patterns of Homozygosity and Relevant Characterizations on the Population Structure in Piétrain Pigs

Genes ◽

10.3390/genes11050577 ◽

2020 ◽

Vol 11 (5) ◽

pp. 577

Author(s):

Huiwen Zhan ◽

Saixian Zhang ◽

Kaili Zhang ◽

Xia Peng ◽

Shengsong Xie ◽

...

Keyword(s):

Linkage Disequilibrium ◽

Population Size ◽

Effective Population Size ◽

Average Length ◽

The Other ◽

Runs Of Homozygosity ◽

Effective Population ◽

Inbreeding Coefficients ◽

Genome Wide ◽

Genomic Regions

Investigating the patterns of homozygosity, linkage disequilibrium, effective population size and inbreeding coefficients in livestock contributes to our understanding of their genetic diversity and evolutionary history. Here we used the Illumina PorcineSNP50 BeadChip to identify runs of homozygosity (ROH) and estimate linkage disequilibrium (LD) across the whole genome, and then to predict the effective population size. In addition, we calculated inbreeding coefficients based on ROH in 305 Piétrain pigs and compared their effect with that of two other types of inbreeding coefficients obtained by different calculation methods. A total of 23,434 ROHs were detected, and the average length of ROH per individual was about 507.27 Mb. The ROHs showed no regular pattern in their distribution across the genome. Comparisons among the different categories suggested that the formation of long ROH was probably related to recent inbreeding events. Although the density of genes located in ROH core regions is lower than that in other genomic regions, most of these genes are related to Piétrain commercial traits such as meat quality. Overall, the results provide insight into the way in which ROH is produced, and the identified ROH core regions can be used to map genes associated with commercial traits in domestic animals.
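The ROH-based inbreeding coefficient mentioned in this abstract is commonly defined as the fraction of the autosomal genome covered by ROH. The sketch below illustrates that common definition only; it is not taken from the paper, and the function name, the segment lengths and the autosome length are hypothetical placeholder values.

```python
def f_roh(roh_lengths_mb, autosome_length_mb):
    """Return the fraction of the autosomal genome covered by ROH.

    roh_lengths_mb: lengths (in Mb) of one individual's ROH segments.
    autosome_length_mb: autosomal genome length (in Mb) covered by the SNP panel.
    """
    if autosome_length_mb <= 0:
        raise ValueError("autosome_length_mb must be positive")
    total_roh_mb = sum(roh_lengths_mb)
    if total_roh_mb > autosome_length_mb:
        raise ValueError("total ROH length exceeds the autosome length")
    return total_roh_mb / autosome_length_mb


# Hypothetical individual: three ROH segments on a 2000 Mb autosome.
example_segments_mb = [12.5, 3.0, 4.5]
example_autosome_mb = 2000.0
print(f_roh(example_segments_mb, example_autosome_mb))
```

In practice the denominator depends on the reference assembly and on how much of it the SNP array covers, so it should be taken from the study's own marker map rather than from the placeholder used here.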

Forensic Genetics and Genotyping

Serbian Journal of Experimental and Clinical Research ◽

10.1515/sjecr-2016-0074 ◽

2019 ◽

Vol 20 (2) ◽

pp. 75-86

Author(s):

Katarina Vitoševic ◽

Danijela Todorovic ◽

Zivana Slovic ◽

Radica Zivkovic-Zaric ◽

Milos Todorovic

Keyword(s):

Genetic Markers ◽

Short Tandem Repeats ◽

Tandem Repeats ◽

Forensic Genetics ◽

Personal Identification ◽

Hair Color ◽

Forensic Dna ◽

Kinship Analysis ◽

Dna Profile ◽

Short Tandem

Abstract Forensic genetics represents a combination of molecular and population genetics. Personal identification and kinship analysis (e.g. paternity testing) are the two main subjects of forensic DNA analysis. Biological specimens from which DNA is isolated include blood, semen, saliva, tissues, bones, teeth, and hairs. Genotyping has become a basis for the characterization of forensic biological evidence. It is performed using a variety of genetic markers, which are divided into two large groups: bi-allelic polymorphisms (single-nucleotide polymorphisms, SNPs) and multi-allelic polymorphisms (variable number of tandem repeats, VNTRs, and short tandem repeats, STRs). This review describes the purpose of genetic markers in forensic investigation and their limitations. STR loci are currently the most informative genetic markers for identity testing, but in cases without a suspect, SNPs can predict the offender’s ancestry and phenotypic traits such as skin, eye, and hair color. Nowadays, many countries worldwide have established forensic DNA databases based on autosomal short tandem repeats and other markers. For a DNA profile database to be useful at a national or international level, it is essential to standardize the genetic markers used in laboratories.

Linkage of genetic markers on human chromosomes 20 and 12 to NIDDM in Caucasian sib pairs with a history of diabetic nephropathy

Diabetes ◽

10.2337/diabetes.46.5.882 ◽

1997 ◽

Vol 46 (5) ◽

pp. 882-886 ◽

Cited By ~ 42

Author(s):

D. W. Bowden ◽

M. Sale ◽

T. D. Howard ◽

A. Qadri ◽

B. J. Spray ◽

...

Keyword(s):

Diabetic Nephropathy ◽

Genetic Markers ◽

Human Chromosomes ◽

History Of ◽

Sib Pairs

Inbreeding of Bottlenecked Butterfly Populations: Estimation Using the Likelihood of Changes in Marker Allele Frequencies

Genetics ◽

10.1093/genetics/151.3.1053 ◽

1999 ◽

Vol 151 (3) ◽

pp. 1053-1063 ◽

Cited By ~ 3

Author(s):

Ilik J Saccheri ◽

Ian J Wilson ◽

Richard A Nichols ◽

Michael W Bruford ◽

Paul M Brakefield

Keyword(s):

Linkage Disequilibrium ◽

Probability Distribution ◽

Reproductive Success ◽

Genetic Markers ◽

Allele Frequencies ◽

Demographic Parameters ◽

Bicyclus Anynana ◽

Marker Allele ◽

Wide Applicability ◽

Per Gene

Abstract Polymorphic enzyme and minisatellite loci were used to estimate the degree of inbreeding in experimentally bottlenecked populations of the butterfly, Bicyclus anynana (Satyridae), three generations after founding events of 2, 6, 20, or 300 individuals, each bottleneck size being replicated at least four times. Heterozygosity fell more than expected, though not significantly so, but this traditional measure of the degree of inbreeding did not make full use of the information from genetic markers. It proved more informative to estimate directly the probability distribution of a measure of inbreeding, σ², the variance in the number of descendants left per gene. In all bottlenecked lines, σ² was significantly larger than in control lines (300 founders). We demonstrate that this excess inbreeding was brought about not only by an increase in the variance of reproductive success of individuals but also by another process. We argue that in bottlenecked lines linkage disequilibrium generated by the small number of haplotypes passing through the bottleneck resulted in hitchhiking of particular marker alleles with those haplotypes favored by selection. In control lines, linkage disequilibrium was minimal. Our result, indicating more inbreeding than expected from demographic parameters, contrasts with the findings of previous (Drosophila) experiments in which the decline in observed heterozygosity was slower than expected and attributed to associative overdominance. The different outcomes may both be explained as a consequence of linkage disequilibrium under different regimes of inbreeding. The likelihood-based method to estimate inbreeding should be of wide applicability. It was, for example, able to resolve small differences in σ² among replicate lines within bottleneck-size treatments, which could be related to the observed variation in reproductive viability.

Genome-wide linkage disequilibrium in two Japanese beef cattle breeds

Animal Genetics ◽

10.1111/j.1365-2052.2005.01400.x ◽

2006 ◽

Vol 37 (2) ◽

pp. 139-144 ◽

Cited By ~ 17

Author(s):

M. Odani ◽

A. Narita ◽

T. Watanabe ◽

K. Yokouchi ◽

Y. Sugimoto ◽

...

Keyword(s):

Linkage Disequilibrium ◽

Beef Cattle ◽

Cattle Breeds ◽

Genome Wide ◽

Beef Cattle Breeds ◽

Genome Wide Linkage Disequilibrium

The Impact of Incomplete Linkage Disequilibrium and Genetic Model Choice on the Analysis and Interpretation of Genome-wide Association Studies

Annals of Human Genetics ◽

10.1111/j.1469-1809.2010.00579.x ◽

2010 ◽

Vol 74 (4) ◽

pp. 375-379 ◽

Cited By ~ 6

Author(s):

Mark M. Iles

Keyword(s):

Linkage Disequilibrium ◽

Genetic Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Model Choice ◽

Genome Wide ◽

The Impact

Genome-wide evaluation of genetic diversity and linkage disequilibrium in winter and spring triticale (x Triticosecale Wittmack)

BMC Genomics ◽

10.1186/1471-2164-13-235 ◽

2012 ◽

Vol 13 (1) ◽

pp. 235 ◽

Cited By ~ 22

Author(s):

Katharina V Alheit ◽

Hans Maurer ◽

Jochen C Reif ◽

Matthew R Tucker ◽

Volker Hahn ◽

...

Keyword(s):

Genetic Diversity ◽

Linkage Disequilibrium ◽

X Triticosecale ◽

Genome Wide ◽

Spring Triticale ◽

Triticosecale Wittmack ◽

X Triticosecale Wittmack

mixIndependR: a R package for statistical independence testing of loci in database of multi-locus genotypes

BMC Bioinformatics ◽

10.1186/s12859-020-03945-0 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Bing Song ◽

August E. Woerner ◽

John Planz

Keyword(s):

Population Genetics ◽

Linkage Disequilibrium ◽

Genetic Markers ◽

Software Package ◽

Tandem Repeats ◽

Population Data ◽

Real Data ◽

R Package ◽

Nucleotide Polymorphisms ◽

Mutual Independence

Abstract Background: Multi-locus genotype data are widely used in population genetics and disease studies. In evaluating the utility of multi-locus data, the independence of markers is commonly considered in many genomic assessments. Generally, pairwise non-random associations are tested by linkage disequilibrium; however, the dependence within a panel might involve triplets, quartets, or larger sets of markers. Therefore, compatible and user-friendly software is necessary for testing and assessing the global linkage disequilibrium among mixed genetic data. Results: This study describes a software package for testing the mutual independence of mixed genetic datasets. Mutual independence is defined as no non-random associations among all subsets of the tested panel. The new R package “mixIndependR” calculates basic genetic parameters like allele frequency, genotype frequency, heterozygosity, Hardy–Weinberg equilibrium, and linkage disequilibrium (LD) by mutual independence from population data, regardless of the type of markers, such as simple nucleotide polymorphisms, short tandem repeats, insertions and deletions, and any other genetic markers. A novel method of assessing the dependence of mixed genetic panels is developed in this study and functionally analyzed in the software package. By comparing the observed distribution of two common summary statistics (the number of heterozygous loci [K] and the number of shared alleles [X]) with their expected distributions under the assumption of mutual independence, the overall independence is tested. Conclusion: The package “mixIndependR” is compatible with all categories of genetic markers and detects the overall non-random associations. Compared to pairwise disequilibrium, the approach described herein tends to have higher power, especially when the number of markers is large. With this package, more multi-functional or stronger genetic panels can be developed, like mixed panels with different kinds of markers. In population genetics, the package “mixIndependR” makes it possible to discover more about admixture of populations, natural selection, genetic drift, and population demographics, as a more powerful method of detecting LD. Moreover, this new approach can optimize variant selection in disease studies and contribute to panel combination for treatments in multimorbidity. Application of this approach to real data is expected in the future, and this might bring a leap in the field of genetic technology. Availability: The R package mixIndependR is available on the Comprehensive R Archive Network (CRAN) at: https://cran.r-project.org/web/packages/mixIndependR/index.html.
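To make the comparison of K with its expected distribution concrete: if loci are mutually independent, the number of heterozygous loci K is a sum of independent per-locus indicators, so its expected distribution is Poisson-binomial. The Python sketch below computes that distribution by convolution. It is not the mixIndependR API or its implementation; the function names and the per-locus heterozygosity probabilities are hypothetical and chosen only for illustration.

```python
def heterozygous_count_pmf(het_probs):
    """Return P(K = k) for k = 0..n under mutual independence of loci.

    het_probs: per-locus probabilities that an individual is heterozygous.
    The result is the Poisson-binomial pmf, built one locus at a time.
    """
    pmf = [1.0]  # with zero loci, K = 0 with probability 1
    for h in het_probs:
        if not 0.0 <= h <= 1.0:
            raise ValueError("heterozygosity probabilities must lie in [0, 1]")
        nxt = [0.0] * (len(pmf) + 1)
        for k, p in enumerate(pmf):
            nxt[k] += p * (1.0 - h)   # this locus is homozygous
            nxt[k + 1] += p * h       # this locus is heterozygous
        pmf = nxt
    return pmf


def count_heterozygous(genotype):
    """Count the loci at which the two alleles differ (the observed K)."""
    return sum(1 for a, b in genotype if a != b)


# Hypothetical three-locus panel and one hypothetical genotype.
expected = heterozygous_count_pmf([0.70, 0.80, 0.75])
observed_k = count_heterozygous([(12, 14), (9, 9), (7, 8)])
print(expected, observed_k)
```

An observed histogram of K across individuals could then be compared with this expected distribution, for example with a goodness-of-fit test; which test the package itself applies is not stated in the abstract.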

Challenges of Adjusting Single-Nucleotide Polymorphism Effect Sizes for Linkage Disequilibrium

Human Heredity ◽

10.1159/000513303 ◽

2021 ◽

pp. 1-11

Author(s):

Valentina Escott-Price ◽

Karl Michael Schmidt

Keyword(s):

Linkage Disequilibrium ◽

Association Studies ◽

Statistical Significance ◽

Ordinary Least Squares ◽

Effect Sizes ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Single Nucleotide ◽

Genome Wide ◽

Tikhonov Regularisation

Background: Genome-wide association studies (GWAS) were successful in identifying SNPs showing association with disease, but their individual effect sizes are small and require large sample sizes to achieve statistical significance. Methods of post-GWAS analysis, including gene-based, gene-set and polygenic risk scores, combine the SNP effect sizes in an attempt to boost the power of the analyses. To avoid giving undue weight to SNPs in linkage disequilibrium (LD), the LD needs to be taken into account in these analyses. Objectives: We review methods that attempt to adjust the effect sizes (β-coefficients) of summary statistics, instead of simple LD pruning. Methods: We subject LD adjustment approaches to a mathematical analysis, recognising Tikhonov regularisation as a framework for comparison. Results: Observing the similarity of the processes involved with the more straightforward Tikhonov-regularised ordinary least squares estimate for multivariate regression coefficients, we note that current methods based on a Bayesian model for the effect sizes effectively provide an implicit choice of the regularisation parameter, which is convenient, but at the price of reduced transparency and, especially in smaller LD blocks, a risk of incomplete LD correction. Conclusions: There is no simple answer to the question which method is best, but where interpretability of the LD adjustment is essential, as in research aiming at identifying the genomic aetiology of disorders, our study suggests that a more direct choice of mild regularisation in the correction of effect sizes may be preferable.
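For reference, the Tikhonov-regularised ordinary least squares estimate named in this abstract has the standard form below. The second line is a sketch under the assumption of standardized genotype columns and phenotype, so that the LD correlation matrix is R = XᵀX/n and the marginal effect sizes are b̂ = Xᵀy/n; the symbols R, b̂, λ and λ′ are illustrative notation and are not taken from the paper.

```latex
% Tikhonov-regularised (ridge-type) OLS estimate for design matrix X and response y
\hat{\beta}_{\lambda} = \left( X^{\top} X + \lambda I \right)^{-1} X^{\top} y,
\qquad \lambda \ge 0

% Equivalent summary-statistic form under the standardization assumption above
\hat{\beta}_{\lambda} = \left( R + \lambda' I \right)^{-1} \hat{b},
\qquad R = \tfrac{1}{n} X^{\top} X, \quad
\hat{b} = \tfrac{1}{n} X^{\top} y, \quad
\lambda' = \tfrac{\lambda}{n}
```

With λ′ = 0 and R invertible, this reduces to the full LD correction R⁻¹b̂, while larger λ′ gives a milder correction, which is the sense in which the regularisation parameter controls the strength of the adjustment.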