Variance in estimated pairwise genetic distance under high versus low coverage sequencing: The contribution of linkage disequilibrium

2017 ◽  
Vol 117 ◽  
pp. 51-63
Author(s):  
Max Shpak ◽  
Yang Ni ◽  
Jie Lu ◽  
Peter Müller

Abstract: The mean pairwise genetic distance among haplotypes is an estimator of the population mutation rate θ and a standard measure of variation in a population. With the advent of next-generation sequencing (NGS) methods, this and other population parameters can be estimated under different modes of sampling. One approach is to sequence individual genomes with high coverage, and to calculate genetic distance over all sample pairs. The second approach, typically used for microbial samples or for tumor cells, is sequencing a large number of pooled genomes with very low individual coverage. With low coverage, pairwise genetic distances are calculated across independently sampled sites rather than across individual genomes. In this study, we show that the variance in genetic distance estimates is reduced with low coverage sampling if the mean pairwise linkage disequilibrium weighted by allele frequencies is positive. Practically, this means that if on average the most frequent alleles over pairs of loci are in positive linkage disequilibrium, low coverage sequencing results in improved estimates of θ, assuming similar per-site read depths. We show that this result holds under the expected distribution of allele frequencies and linkage disequilibria for an infinite sites model at mutation-drift equilibrium. From simulations, we find that the conditions for reduced variance only fail to hold in cases where variant alleles are few and at very low frequency. These results are applied to haplotype frequencies from a lung cancer tumor to compute the weighted linkage disequilibria and the expected error in estimated genetic distance using high versus low coverage.
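The two ways of computing distance described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy haplotypes, and the 0/1 string encoding of alleles are all assumptions made for illustration. The first function counts differences genome by genome over all sample pairs; the second sums, site by site, the number of sample pairs that differ at that site. With complete data the two give the same mean, so any difference between the sampling schemes lies in the variance of the estimate, not in its expected value.

```python
from collections import Counter
from itertools import combinations


def mean_pairwise_distance(haplotypes):
    """Mean number of differing sites, averaged over all pairs of haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    n_pairs = n * (n - 1) / 2
    total = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    return total / n_pairs


def per_site_distance(haplotypes):
    """Same quantity, accumulated one site at a time from allele counts."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    n_pairs = n * (n - 1) / 2
    total = 0
    for column in zip(*haplotypes):
        counts = Counter(column)
        # Pairs carrying the same allele at this site do not contribute.
        same = sum(c * (c - 1) // 2 for c in counts.values())
        total += n_pairs - same
    return total / n_pairs


# Hypothetical toy sample: four haplotypes, four biallelic sites.
sample = ["0011", "0101", "0000", "1111"]
print(mean_pairwise_distance(sample))
print(per_site_distance(sample))
```

For this toy sample both calls report the same value, since 14 pairwise differences are spread over 6 pairs in either calculation.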


2022 ◽  
Vol 12 ◽  
Author(s):  
Susanna Kar Pui Lau ◽  
Kenneth Sze Ming Li ◽  
Xin Li ◽  
Ka-Yan Tsang ◽  
Siddharth Sridhar ◽  
...  

Since its first discovery in 1967, human coronavirus OC43 (HCoV-OC43) has been associated with mild self-limiting upper respiratory infections worldwide. Fatal primary pneumonia due to HCoV-OC43 is not frequently described. This study describes a case of fatal primary pneumonia associated with HCoV-OC43 in a 75-year-old patient with good past health. The viral loads of the respiratory tract specimens (bronchoalveolar lavage and endotracheal aspirate) from diagnosis to death were persistently high (3.49 × 10^6 to 1.10 × 10^10 copies/ml). HCoV-OC43 at a level of 6.46 × 10^3 copies/ml was also detected in his pleural fluid 2 days before his death. Complete genome sequencing and phylogenetic analysis showed that the present HCoV-OC43 forms a distinct cluster with three other HCoV-OC43 strains from the United States, with a bootstrap value of 100% and sharing 99.9% nucleotide identity. Pairwise genetic distance between this cluster and other HCoV-OC43 genotypes ranged from 0.27 ± 0.02% to 1.25 ± 0.01%. In contrast, the lowest pairwise genetic distance between existing HCoV-OC43 genotypes was 0.26 ± 0.02%, suggesting that this cluster constitutes a novel HCoV-OC43 genotype, which we named genotype I. Unlike genotypes D, E, F, G, and H, no recombination event was observed for this novel genotype. Structural modeling revealed that the loop with the S1/S2 cleavage site was four amino acids longer than in other HCoV-OC43, making it more exposed and accessible to protease, which may have resulted in its possible hypervirulence.


Genetika ◽  
2018 ◽  
Vol 50 (2) ◽  
pp. 395-402
Author(s):  
Hüseyin Uysal ◽  
Zeki Acar ◽  
İlknur Ayan ◽  
Orhan Kurt

Fifty-one Lathyrus sativus L. landraces and one L. clymenum L. landrace collected from Turkey, and one L. sativus cultivar, Gürbüz, were evaluated with ISSR markers in this study for molecular characterization. Three ISSR primers were used and 45 DNA fragments were evaluated, of which 44 bands were polymorphic. The frequencies of scored bands ranged from 0.009 to 0.888 and averaged 0.363. The genotypes MLT04 and NEV02 had the highest pairwise similarity, at 0.825. The most distant pair was Gürbüz and MLT02, at 0.244. The nearest genotype to Gürbüz was CNR03, at 0.577. The pairwise genetic distance between L. clymenum and L. sativus ranged from 0.353 (accession NEV01) to 0.637 (accession DEN04), and the pairwise genetic distance to the cultivar Gürbüz was 0.375. Assessment of genetic relationships among the Lathyrus genotypes yielded two main groups. One contained only the Gürbüz variety and the other contained the 52 Lathyrus landraces. L. sativus and L. clymenum separated prominently in the second group.


2010 ◽  
Vol 34 (2) ◽  
pp. 178-184
Author(s):  
Dong-dong XU ◽  
Feng YOU ◽  
Bao LOU ◽  
Jun LI ◽  
Jian-he XU ◽  
...  

Genetics ◽  
2018 ◽  
Vol 209 (2) ◽  
pp. 389-400 ◽  
Author(s):  
Timothy P. Bilton ◽  
John C. McEwan ◽  
Shannon M. Clarke ◽  
Rudiger Brauning ◽  
Tracey C. van Stijn ◽  
...  

1982 ◽  
Vol 39 (1) ◽  
pp. 63-77 ◽  
Author(s):  
Naoyuki Takahata

Summary: A general model of linked genes or a part of a genome is proposed which enables us to study various problems in molecular population genetics in a unified way. Several formulae with special reference to the linkage disequilibrium and genetic distance are derived for neutral mutations in finite populations, based on the method of diffusion equations. It is argued that the model and formulae are useful particularly when observations are made in terms of DNA sequence.


2014 ◽  
Author(s):  
Lin Huang ◽  
Bo Wang ◽  
Ruitang Chen ◽  
Sivan Bercovici ◽  
Serafim Batzoglou

Population low-coverage whole-genome sequencing is rapidly emerging as a prominent approach for discovering genomic variation and genotyping a cohort. This approach combines substantially lower cost than full-coverage sequencing with whole-genome discovery of low-allele-frequency variants, to an extent that is not possible with array genotyping or exome sequencing. However, a challenging computational problem arises when attempting to discover variants and genotype the entire cohort. Variant discovery and genotyping are relatively straightforward on a single individual that has been sequenced at high coverage, because the inference decomposes into the independent genotyping of each genomic position for which a sufficient number of confidently mapped reads are available. However, in cases where low-coverage population data are given, the joint inference requires leveraging the complex linkage disequilibrium patterns in the cohort to compensate for sparse and missing data in each individual. The potentially massive computation time for such inference, as well as the missing data that confound low-frequency allele discovery, need to be overcome for this approach to become practical. Here, we present Reveel, a novel method for single nucleotide variant calling and genotyping of large cohorts that have been sequenced at low coverage. Reveel introduces a novel technique for leveraging linkage disequilibrium that deviates from previous Markov-based models. We evaluate Reveel's performance through extensive simulations as well as real data from the 1000 Genomes Project, and show that it achieves higher accuracy in low-frequency allele discovery and substantially lower computation cost than previous state-of-the-art methods.


Zootaxa ◽  
2017 ◽  
Vol 4362 (3) ◽  
pp. 385 ◽  
Author(s):  
VINH QUANG LUU ◽  
TRAN VAN DUNG ◽  
TRUONG QUANG NGUYEN ◽  
MINH DUC LE ◽  
THOMAS ZIEGLER

We describe a new species of the genus Cyrtodactylus from Gia Lai Province, Central Highlands of Vietnam based on morphological and molecular differences. Cyrtodactylus gialaiensis sp. nov. is differentiated from other congeners by a unique combination of the following characters: Size small, maximum known SVL reaching 62.8 mm; dorsal pattern consisting of six or seven dark transverse bands between limb insertions; intersupranasals two or three; dorsal tubercles at midbody in 16–21 irregular rows, strongly developed on flanks; lateral folds poorly defined with interspersed tubercles; ventral scales between ventrolateral folds 38–45; precloacal pores nine or 10 in males, eight pitted scales in the adult female, in a continuous row; femoral pores absent; enlarged femoral scales present; postcloacal tubercles two or three; dorsal tubercles present to half of tail; subcaudal scales not enlarged. In molecular analyses, the new species is weakly supported as a member of the Cyrtodactylus irregularis species group with a minimum pairwise genetic distance of 13.7% from others within the group. 

