High-throughput inference of pairwise coalescence times identifies signals of selection and enriched disease heritability

2018 ◽  
Author(s):  
Pier Francesco Palamara ◽  
Jonathan Terhorst ◽  
Yun S. Song ◽  
Alkes L. Price

Abstract Interest in reconstructing demographic histories has motivated the development of methods to estimate locus-specific pairwise coalescence times from whole-genome sequence data. We developed a new method, ASMC, that can estimate coalescence times using only SNP array data, and is 2-4 orders of magnitude faster than previous methods when sequencing data are available. We were thus able to apply ASMC to 113,851 phased British samples from the UK Biobank, aiming to detect recent positive selection by identifying loci with unusually high density of very recent coalescence times. We detected 12 genome-wide significant signals, including 6 loci with previous evidence of positive selection and 6 novel loci, consistent with coalescent simulations showing that our approach is well-powered to detect recent positive selection. We also applied ASMC to sequencing data from 498 Dutch individuals (Genome of the Netherlands data set) to detect background selection at deeper time scales. We observed highly significant correlations between average coalescence time inferred by ASMC and other measures of background selection. We investigated whether this signal translated into an enrichment in disease and complex trait heritability by analyzing summary association statistics from 20 independent diseases and complex traits (average N=86k) using stratified LD score regression. Our background selection annotation based on average coalescence time was strongly enriched for heritability (p = 7×10⁻¹⁵³) in a joint analysis conditioned on a broad set of functional annotations (including other background selection annotations), meta-analyzed across traits; SNPs in the top 20% of our annotation were 3.8x enriched for heritability compared to the bottom 20%. These results underscore the widespread effects of background selection on disease and complex trait heritability.

Animals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1803
Author(s):  
Valentino Palombo ◽  
Elena De Zio ◽  
Giovanna Salvatore ◽  
Stefano Esposito ◽  
Nicolaia Iaffaldano ◽  
...  

Mediterranean trout is a freshwater fish of particular interest, with economic significance for fishery management, aquaculture and conservation biology. Unfortunately, the abundance of native trout populations is significantly threatened by anthropogenic disturbance. The introduction of commercial hatchery strains for recreational activities has compromised the genetic integrity of native populations. This work assessed the fine-scale genetic structure of Mediterranean trout in the two main rivers of the Molise region (Italy) to support conservation actions. In total, 288 specimens were caught at 28 different sites (14 per basin) and genotyped using the Affymetrix 57 K rainbow-trout-derived SNP array. Population differentiation was analyzed using pairwise weighted FST and overall F-statistics estimated by locus-by-locus analysis of molecular variance. Furthermore, the SNP data set was processed through principal coordinates analysis, discriminant analysis of principal components and admixture Bayesian clustering analysis. Firstly, our results demonstrated that the rainbow trout SNP array can be successfully used for Mediterranean trout genotyping. Although the overwhelming majority of loci were monomorphic in our populations, the resulting number of polymorphic loci (~900 SNPs) was sufficient to reveal a fine-scale genetic structure in the investigated populations, which is useful in supporting conservation and management actions. In particular, our findings allowed us to select candidate sites for the collection of adults, needed for the production of genetically pure juvenile trout, and sites at which to carry out the eradication of alien trout and the subsequent re-introduction of native trout.
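As an illustration of what a pairwise FST computation over SNP allele frequencies can look like, the sketch below uses Hudson's estimator in its ratio-of-averages form. This is a swapped-in technique chosen for brevity and is not claimed to be the weighted estimator used in the study. The function name `hudson_fst` and its arguments are hypothetical, and `n1` and `n2` are assumed to be the numbers of sampled allele copies in each population.

```python
import numpy as np

def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST between two populations (illustrative sketch).

    p1, p2: per-locus allele frequencies in population 1 and 2.
    n1, n2: number of sampled allele copies in each population (assumed > 1).
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    # Per-locus numerator: squared frequency difference minus sampling corrections
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    # Per-locus denominator: probability that two alleles, one per population, differ
    den = p1 * (1 - p2) + p2 * (1 - p1)
    # Ratio of sums across loci rather than the mean of per-locus ratios
    return num.sum() / den.sum()

# Two loci fixed for different alleles give complete differentiation
fst_fixed = hudson_fst([1.0, 1.0], [0.0, 0.0], n1=10, n2=10)
```

The estimate is undefined when every locus is monomorphic for the same allele in both populations (the denominator sums to zero), so a real analysis would restrict the input to polymorphic loci first.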


2008 ◽  
Vol 20 (5) ◽  
pp. 1211-1238 ◽  
Author(s):  
Gaby Schneider

Oscillatory correlograms are widely used to study neuronal activity that shows a joint periodic rhythm. In most cases, the statistical analysis of cross-correlation histograms (CCH) features is based on the null model of independent processes, and the resulting conclusions about the underlying processes remain qualitative. Therefore, we propose a spike train model for synchronous oscillatory firing activity that directly links characteristics of the CCH to parameters of the underlying processes. The model focuses particularly on asymmetric central peaks, which differ in slope and width on the two sides. Asymmetric peaks can be associated with phase offsets in the (sub-) millisecond range. These spatiotemporal firing patterns can be highly consistent across units yet invisible in the underlying processes. The proposed model includes a single temporal parameter that accounts for this peak asymmetry. The model provides approaches for the analysis of oscillatory correlograms, taking into account dependencies and nonstationarities in the underlying processes. In particular, the auto- and the cross-correlogram can be investigated in a joint analysis because they depend on the same spike train parameters. Particular temporal interactions such as the degree to which different units synchronize in a common oscillatory rhythm can also be investigated. The analysis is demonstrated by application to a simulated data set.
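For readers unfamiliar with the object being modeled, a raw cross-correlation histogram can be computed as in the sketch below. It only builds the histogram of pairwise spike-time differences and does not implement the spike train model proposed in the paper. The function name, the parameters and the toy spike times are assumptions made for illustration.

```python
import numpy as np

def cross_correlation_histogram(spikes_a, spikes_b, max_lag, bin_width):
    """Histogram of spike-time differences (t_b - t_a) within +/- max_lag."""
    a = np.asarray(spikes_a, dtype=float)
    b = np.asarray(spikes_b, dtype=float)
    # All pairwise differences between spikes of unit B and spikes of unit A
    diffs = (b[None, :] - a[:, None]).ravel()
    # Keep only the lags inside the analysis window
    diffs = diffs[np.abs(diffs) <= max_lag]
    n_bins = int(round(2 * max_lag / bin_width))
    edges = np.linspace(-max_lag, max_lag, n_bins + 1)
    counts, _ = np.histogram(diffs, bins=edges)
    return counts, edges

# Toy data: unit B fires 1.5 time units after every spike of unit A
counts, edges = cross_correlation_histogram(
    [0.0, 10.0, 20.0], [1.5, 11.5, 21.5], max_lag=5.0, bin_width=1.0
)
```

In this toy example all retained lags fall in the bin covering 1 to 2, so the central peak sits to one side of zero, which is the kind of phase offset the abstract associates with asymmetric peaks.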


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Fan Li ◽  
Yunyun Lv ◽  
Zhengyong Wen ◽  
Chao Bian ◽  
Xinhui Zhang ◽  
...  

Abstract Background Although almost all extant spider species live in terrestrial environments, a few species live fully submerged in freshwater or seawater. Intertidal spiders (genus Desis), which build silk nests within coral crevices, can survive submerged during high tides. The diving bell spider, Argyroneta aquatica, resides in a similarly dynamic environment but exclusively in freshwater. Given the pivotal role played by mitochondria in supplying most energy for physiological activity via oxidative phosphorylation and the environment, herein we sequenced the complete mitogenome of Desis jiaxiangi to investigate the adaptive evolution of the aquatic spider mitogenomes and the evolution of spiders. Results We assembled a complete mitogenome of the intertidal spider Desis jiaxiangi and performed comparative mitochondrial analyses of a data set comprising Desis jiaxiangi and 45 other previously published spider mitogenome sequences, including that of Argyroneta aquatica. We found a unique transposition of the trnL2 and trnN genes in Desis jiaxiangi. Our robust phylogenetic topology clearly deciphered the evolutionary relationships between Desis jiaxiangi and Argyroneta aquatica as well as other spiders. We dated the divergence of Desis jiaxiangi and Argyroneta aquatica to the late Cretaceous at ~98 Ma. Our selection analyses detected a positive selection signal in the nd4 gene of the aquatic branch comprising both Desis jiaxiangi and Argyroneta aquatica. Surprisingly, Pirata subpiraticus, Hypochilus thorelli, and Argyroneta aquatica each had a higher Ka/Ks value in the 13 PCGs dataset among 46 taxa with complete mitogenomes, and these three species also showed a positive selection signal in the nd6 gene. Conclusions Our finding of the unique transposition of the trnL2 and trnN genes indicates that these genes may have experienced rearrangements in the history of intertidal spider evolution. The positive selection signals in the nd4 and nd6 genes might enable a better understanding of spider metabolic adaptations in relation to different environments. Our construction of a novel mitogenome for the intertidal spider thus sheds light on the evolutionary history of spiders and their mitogenomes.


Genetics ◽  
1996 ◽  
Vol 142 (4) ◽  
pp. 1357-1362
Author(s):  
François Rousset

Abstract Expected values of Wright's F-statistics are functions of probabilities of identity in state. These values may be quite different under an infinite allele model and under stepwise mutation processes such as those occurring at microsatellite loci. However, a relationship between the probability of identity in state in stepwise mutation models and the distribution of coalescence times can be deduced from the relationship between probabilities of identity by descent and the distribution of coalescence times. The values of FIS and FST can be computed using this property. Examination of the conditional probability of identity in state given some coalescence time, and of the distribution of coalescence times, is also useful for explaining the properties of FIS and FST at high mutation rate loci, as shown here in an island model of population structure.
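The relationship described in this abstract can be written schematically as follows. The notation is an assumption introduced here and is not taken from the paper: c_t is the probability that two genes coalesce t generations in the past, π(t) is the conditional probability of identity in state given coalescence time t, u is the per-generation mutation rate, and Q_0, Q_w and Q_b are identity probabilities for two genes within an individual, within a deme and between demes. The F-statistic definitions shown are one common parameterization.

```latex
% Identity in state as an average over the distribution of coalescence times
Q = \sum_{t \ge 1} c_t \, \pi(t)

% Infinite allele model: identity requires no mutation on either lineage,
% so identity in state coincides with identity by descent
\pi_{\mathrm{IAM}}(t) = (1 - u)^{2t}

% F-statistics as functions of identity probabilities
F_{IS} = \frac{Q_0 - Q_w}{1 - Q_w}, \qquad
F_{ST} = \frac{Q_w - Q_b}{1 - Q_b}
```

Under a stepwise mutation model, π(t) takes a different form because two alleles can mutate and still end up in the same state. That is why the same distribution of coalescence times can yield different FIS and FST values under the two mutation models.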


2017 ◽  
Vol 34 (8) ◽  
pp. 1936-1946 ◽  
Author(s):  
Kazuhiro Nakayama ◽  
Jun Ohashi ◽  
Kazuhisa Watanabe ◽  
Lkagvasuren Munkhtulga ◽  
Sadahiko Iwamoto

2017 ◽  
Author(s):  
Zilu Zhou ◽  
Weixin Wang ◽  
Li-San Wang ◽  
Nancy Ruonan Zhang

Abstract Motivation: Copy number variations (CNVs) are gains and losses of DNA segments and have been associated with disease. Many large-scale genetic association studies are performing CNV analysis using whole exome sequencing (WES) and whole genome sequencing (WGS). In many of these studies, previous SNP-array data are available. An integrated cross-platform analysis is expected to improve resolution and accuracy, yet there is no tool for effectively combining data from sequencing and array platforms. The detection of CNVs using sequencing data alone can also be further improved by the utilization of allele-specific reads. Results: We propose a statistical framework, integrated Copy Number Variation detection algorithm (iCNV), which can be applied to multiple study designs: WES only, WGS only, SNP array only, or any combination of SNP and sequencing data. iCNV applies platform-specific normalization, utilizes allele-specific reads from sequencing and integrates matched NGS and SNP-array data by a Hidden Markov Model (HMM). We compare integrated two-platform CNV detection using iCNV to naive intersection or union of platforms and show that iCNV increases sensitivity and robustness. We also assess the accuracy of iCNV on WGS data only, and show that the utilization of allele-specific reads improves CNV detection accuracy compared to existing methods. Availability: https://github.com/zhouzilu/[email protected], [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Daniel P Cooke ◽  
David C Wedge ◽  
Gerton Lunter

Haplotype-based variant callers, which consider physical linkage between variant sites, are currently among the best tools for germline variation discovery and genotyping from short-read sequencing data. However, almost all such tools were designed specifically for detecting common germline variation in diploid populations, and give sub-optimal results in other scenarios. Here we present Octopus, a versatile haplotype-based variant caller that uses a polymorphic Bayesian genotyping model capable of modeling sequencing data from a range of experimental designs within a unified haplotype-aware framework. We show that Octopus accurately calls de novo mutations in parent-offspring trios and germline variants in individuals, including SNVs, indels, and small complex replacements such as microinversions. In addition, using a carefully designed synthetic-tumour data set derived from clean sequencing data from a sample with known germline haplotypes, and observed mutations in a large cohort of tumour samples, we show that Octopus accurately characterizes germline and somatic variation in tumours, both with and without a paired normal sample. Sequencing reads and prior information are combined to phase called genotypes of arbitrary ploidy, including those with somatic mutations. Octopus also outputs realigned evidence BAMs to aid validation and interpretation.

