Allele frequency divergence reveals ubiquitous influence of positive selection in Drosophila

2021 ◽  
Author(s):  
Jason Bertram

Resolving the role of natural selection is a basic objective of evolutionary biology. It is generally difficult to detect the influence of selection because ubiquitous non-selective stochastic change in allele frequencies (genetic drift) degrades evidence of selection. As a result, selection scans typically only identify genomic regions that have undergone episodes of intense selection. Yet it seems likely such episodes are the exception; the norm is more likely to involve subtle, concurrent selective changes at a large number of loci. We develop a new theoretical approach that uncovers a previously undocumented genome-wide signature of selection in the collective divergence of allele frequencies over time. Applying our approach to temporally resolved allele frequency measurements from laboratory and wild Drosophila populations, we quantify the selective contribution to allele frequency divergence and find that selection has substantial effects on much of the genome. We further quantify the magnitude of the total selection coefficient (a measure of the combined effects of direct and linked selection) at a typical polymorphic locus, and find this to be large (of order 1%) even though most mutations are not directly under selection. We find that selective allele frequency divergence is substantial at intermediate allele frequencies, which we argue is most parsimoniously explained by positive (not purifying) selection. Thus, in these populations most mutations are far from evolving neutrally in the short term (tens of generations), including mutations with neutral fitness effects, and the result cannot be explained simply as a purging of deleterious mutations.

PLoS Genetics ◽  
2021 ◽  
Vol 17 (9) ◽  
pp. e1009833
Author(s):  
Jason Bertram

Resolving the role of natural selection is a basic objective of evolutionary biology. It is generally difficult to detect the influence of selection because ubiquitous non-selective stochastic change in allele frequencies (genetic drift) degrades evidence of selection. As a result, selection scans typically only identify genomic regions that have undergone episodes of intense selection. Yet it seems likely such episodes are the exception; the norm is more likely to involve subtle, concurrent selective changes at a large number of loci. We develop a new theoretical approach that uncovers a previously undocumented genome-wide signature of selection in the collective divergence of allele frequencies over time. Applying our approach to temporally resolved allele frequency measurements from laboratory and wild Drosophila populations, we quantify the selective contribution to allele frequency divergence and find that selection has substantial effects on much of the genome. We further quantify the magnitude of the total selection coefficient (a measure of the combined effects of direct and linked selection) at a typical polymorphic locus, and find this to be large (of order 1%) even though most mutations are not directly under selection. We find that selective allele frequency divergence is substantially elevated at intermediate allele frequencies, which we argue is most parsimoniously explained by positive—not negative—selection. Thus, in these populations most mutations are far from evolving neutrally in the short term (tens of generations), including mutations with neutral fitness effects, and the result cannot be explained simply as an ongoing purging of deleterious mutations.
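
To give a rough sense of scale for the comparison between drift and a total selection coefficient of order 1%, the following is a minimal textbook sketch, not the theoretical approach developed in the paper. It assumes a neutral Wright-Fisher model for drift and a simple genic selection model (relative fitnesses 1+s and 1); the function names, the population of 1000 gene copies, and the starting frequency of 0.5 are illustrative assumptions.

```python
import math


def drift_variance(p0, n_copies, generations):
    """Variance of the allele frequency after `generations` of neutral
    Wright-Fisher drift, starting from frequency p0 with n_copies gene copies."""
    return p0 * (1.0 - p0) * (1.0 - (1.0 - 1.0 / n_copies) ** generations)


def selected_frequency(p0, s, generations):
    """Deterministic allele frequency under genic selection, where the
    focal allele has relative fitness 1 + s and the alternative has 1."""
    p = p0
    for _ in range(generations):
        p = p * (1.0 + s) / (1.0 + s * p)
    return p


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    p0, n_copies, s, t = 0.5, 1000, 0.01, 20
    drift_sd = math.sqrt(drift_variance(p0, n_copies, t))
    shift = selected_frequency(p0, s, t) - p0
    print(f"drift standard deviation after {t} generations: {drift_sd:.4f}")
    print(f"deterministic shift from selection (s={s}): {shift:.4f}")
```

With these assumed values, the deterministic shift from selection over 20 generations is about 0.05, while the drift standard deviation is about 0.07. This illustrates why the signal at any single locus is easily masked by drift and why a collective, genome-wide signature is needed.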


2020 ◽  
Vol 61 (1) ◽  
pp. 17-23
Author(s):  
Michelle M. Nay ◽  
Stephen L. Byrne ◽  
Eduardo A. Pérez ◽  
Achim Walter ◽  
Bruno Studer

Genomics-assisted breeding of buckwheat (Fagopyrum esculentum Moench) depends on robust genotyping methods. Genotyping by sequencing (GBS) has evolved as a flexible and cost-effective technique frequently used in plant breeding. Several GBS pipelines are available to genetically characterize single genotypes, but these are not able to represent the genetic diversity of buckwheat accessions that are maintained as genetically heterogeneous, open-pollinating populations. Here we report the development of a GBS pipeline which, rather than reporting the state of bi-allelic single nucleotide polymorphisms (SNPs), resolves allele frequencies within populations on a genome-wide scale. These genome-wide allele frequency fingerprints (GWAFFs) from 100 pooled individual plants per accession were found to be highly reproducible and revealed the genetic similarity of 20 different buckwheat accessions analysed in our study. The GWAFFs can not only be used as an efficient tool to precisely describe buckwheat breeding material, but also offer new opportunities to investigate the genetic diversity between different buckwheat accessions and establish variant databases for key material. Furthermore, GWAFFs provide the opportunity to associate allele frequencies with phenotypic traits and quality parameters that are most reliably described at the population level. This is the key to practically implementing powerful genomics-assisted breeding concepts such as marker-assisted selection and genomic selection in future breeding schemes of allogamous buckwheat.

Key words: buckwheat (Fagopyrum esculentum Moench), genotyping by sequencing (GBS), population genomics, genome-wide allele frequency fingerprints (GWAFFs)


2020 ◽  
Author(s):  
Wouter J. Peyrot ◽  
Alkes L. Price

Psychiatric disorders are highly genetically correlated, and many studies have focused on their shared genetic components. However, little research has been conducted on the genetic differences between psychiatric disorders, because case-case comparisons of allele frequencies among cases currently require individual-level data from cases of both disorders. We developed a new method (CC-GWAS) to test for differences in allele frequency among cases of two different disorders using summary statistics from the respective case-control GWAS; CC-GWAS relies on analytical assessments of the genetic distance between cases and controls of each disorder. Simulations and analytical computations confirm that CC-GWAS is well-powered and attains effective control of type I error. In particular, CC-GWAS identifies and discards false positive associations that can arise due to differential tagging of a shared causal SNP (with the same allele frequency in cases of both disorders), e.g. due to subtle differences in ancestry between the input case-control studies. We applied CC-GWAS to publicly available summary statistics for schizophrenia, bipolar disorder and major depressive disorder, and identified 116 independent genome-wide significant loci distinguishing these three disorders, including 21 CC-GWAS-specific loci that were not genome-wide significant in the input case-control summary statistics. Two of the CC-GWAS-specific loci implicate the genes KLF6 and KLF16 from the Kruppel-like family of transcription factors; these genes have been linked to neurite outgrowth and axon regeneration. We performed a broader set of case-case comparisons by additionally analyzing ADHD, anorexia nervosa, autism, obsessive-compulsive disorder and Tourette’s Syndrome, yielding a total of 196 independent loci distinguishing eight psychiatric disorders, including 72 CC-GWAS-specific loci. We confirmed that loci identified by CC-GWAS replicated convincingly in applications to data sets for which independent replication data were available. In conclusion, CC-GWAS robustly identifies loci with different allele frequencies among cases of different disorders using results from the respective case-control GWAS, providing new insights into the genetic differences between eight psychiatric disorders.


2019 ◽  
Author(s):  
Vince Buffalo ◽  
Graham Coop

Rapid phenotypic adaptation is often observed in natural populations and selection experiments. However, detecting the genome-wide impact of this selection is difficult, since adaptation often proceeds from standing variation and selection on polygenic traits, both of which may leave faint genomic signals indistinguishable from a noisy background of genetic drift. One promising signal comes from the genome-wide covariance between allele frequency changes observable from temporal genomic data, e.g. evolve-and-resequence studies. These temporal covariances reflect how heritable fitness variation in the population leads changes in allele frequencies at one timepoint to be predictive of the changes at later timepoints, as alleles are indirectly selected due to remaining associations with selected alleles. Since genetic drift does not lead to temporal covariance, we can use these covariances to estimate what fraction of the variation in allele frequency change through time is driven by linked selection. Here, we reanalyze three selection experiments to quantify the effects of linked selection over short timescales using covariance among time-points and across replicates. We estimate that at least 17% to 37% of allele frequency change is driven by selection in these experiments. Against this background of positive genome-wide temporal covariances we also identify signals of negative temporal covariance corresponding to reversals in the direction of selection for a reasonable proportion of loci over the time course of a selection experiment. Overall, we find that in the three studies we analyzed, linked selection has a large impact on short-term allele frequency dynamics that is readily distinguishable from genetic drift.

Significance Statement

A long-standing problem in evolutionary biology is to understand the processes that shape the genetic composition of populations. In a population without migration, the two processes that change allele frequencies are selection, which increases beneficial alleles and removes deleterious ones, and genetic drift, which randomly changes frequencies as some parents contribute more or fewer alleles to the next generation. Previous efforts to disentangle these processes have used genomic samples from a single timepoint and models of how selection affects neighboring sites (linked selection). Here, we use genomic data taken through time to quantify the contributions of selection and drift to genome-wide frequency changes. We show selection acts over short timescales in three evolve-and-resequence studies and has a sizable genome-wide impact.
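
The statement that genetic drift does not lead to temporal covariance can be checked with a small simulation. The sketch below is only an illustration under assumed settings (independent loci, a neutral Wright-Fisher model, 100 gene copies, starting frequency 0.5) and is not the estimation procedure used in the study; all function names and parameter values are hypothetical.

```python
import random


def binomial(rng, n, p):
    """Draw a Binomial(n, p) count using only the standard library."""
    return sum(rng.random() < p for _ in range(n))


def drift_changes(n_loci=3000, n_copies=100, p0=0.5, seed=1):
    """Simulate two generations of neutral Wright-Fisher drift at
    independent loci and return the frequency changes in each generation."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_loci):
        p1 = binomial(rng, n_copies, p0) / n_copies
        p2 = binomial(rng, n_copies, p1) / n_copies
        first.append(p1 - p0)
        second.append(p2 - p1)
    return first, second


def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


if __name__ == "__main__":
    d1, d2 = drift_changes()
    print("variance of first-generation change:", covariance(d1, d1))
    print("covariance between successive changes:", covariance(d1, d2))
```

Under pure drift, the covariance between successive changes should be close to zero relative to the variance of a single change, so a consistently positive genome-wide covariance points to something other than drift, such as linked selection.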


2020 ◽  
Author(s):  
Murillo F. Rodrigues ◽  
Maria D. Vibranovski ◽  
Rodrigo Cogni

Spatial and seasonal variation in the environment are ubiquitous. Environmental heterogeneity can affect natural populations and lead to covariation between environment and allele frequencies. Drosophila melanogaster is known to harbor polymorphisms that change both with latitude and seasons. Identifying the role of selection in driving these changes is not trivial, because non-adaptive processes can cause similar patterns. Given that the environment changes in similar ways across seasons and along the latitudinal gradient, one promising approach may be to look for parallelism between clinal and seasonal change. Here, we test whether there is a genome-wide relationship between clinal and seasonal variation, and whether the pattern is consistent with selection. We investigate the role of natural selection in driving these allele frequency changes. Allele frequency estimates were obtained from pooled samples from seven different locations along the east coast of the US, and across seasons within Pennsylvania. We show that there is a genome-wide pattern of clinal variation mirroring seasonal variation, which cannot be explained by linked selection alone. This pattern is stronger for coding than intergenic regions, consistent with natural selection. We find that the genome-wide relationship between clinal and seasonal variation could be explained by about 4% of the common autosomal variants being under selection. Our results highlight the contribution of natural selection in driving fluctuations in allele frequencies in D. melanogaster.


2011 ◽  
Vol 279 (1732) ◽  
pp. 1277-1286 ◽  
Author(s):  
Bruce E. Deagle ◽  
Felicity C. Jones ◽  
Yingguang F. Chan ◽  
Devin M. Absher ◽  
David M. Kingsley ◽  
...  

Understanding the genetics of adaptation is a central focus in evolutionary biology. Here, we use a population genomics approach to examine striking parallel morphological divergences of parapatric stream–lake ecotypes of threespine stickleback fish in three watersheds on the Haida Gwaii archipelago, western Canada. Genome-wide variation at greater than 1000 single nucleotide polymorphism loci indicates separate origin of giant lake and small-bodied stream fish within each watershed (mean FST between watersheds = 0.244 and within = 0.114). Genome scans within watersheds identified a total of 21 genomic regions that are highly differentiated between ecotypes and are probably subject to directional selection. Most outliers were watershed-specific, but genomic regions undergoing parallel genetic changes in multiple watersheds were also identified. Interestingly, several of the stream–lake outlier regions match those previously identified in marine–freshwater and benthic–limnetic genome scans, indicating reuse of the same genetic loci in different adaptive scenarios. We also identified multiple new outlier loci, which may contribute to unique aspects of differentiation in stream–lake environments. Overall, our data emphasize the important role of ecological boundaries in driving both local and broadly occurring parallel genetic changes during adaptation.
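
For readers unfamiliar with the differentiation statistic reported above, the following is a minimal sketch of one textbook definition of FST at a single biallelic locus, based on expected heterozygosity in two equally weighted populations. The abstract does not say which estimator the study used, so this is an assumption for illustration only; the function name and the allele frequencies in the example are hypothetical.

```python
def fst_biallelic(p1, p2):
    """FST = (H_T - H_S) / H_T for one biallelic locus in two equally
    weighted populations with allele frequencies p1 and p2."""
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # The locus is monomorphic overall, so there is no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies in a lake and a stream population.
    print(fst_biallelic(0.2, 0.8))
```

Identical frequencies give a value of 0 and fixed alternative alleles give a value of 1, so intermediate values such as those reported in the abstract indicate partial differentiation.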


2017 ◽  
Author(s):  
Antoine Guérin ◽  
Gaspard Kerner ◽  
Nico Marr ◽  
Janet G. Markle ◽  
Florence Fenollar ◽  
...  

The pathogenesis of Whipple’s disease (WD) remains largely unknown, as WD strikes only a very small minority of the individuals infected with Tropheryma whipplei (Tw). Asymptomatic carriage of Tw is less rare. We studied a large multiplex French kindred, containing four otherwise healthy WD patients (mean age: 76.7 years) and five healthy carriers of Tw (mean age: 55 years). We used a strategy combining genome-wide linkage analysis and whole-exome sequencing to test the hypothesis that WD is inherited in an autosomal dominant (AD) manner, with age-dependent incomplete penetrance. WD was linked to 12 genomic regions covering 27 megabases in the four patients. These regions contained only one very rare non-synonymous variation: the R98W variant of IRF4. The five Tw carriers were heterozygous for R98W. Interferon regulatory factor 4 (IRF4) is a transcription factor with pleiotropic roles in immunity. We showed that R98W was a loss-of-function allele, like only five other exceedingly rare IRF4 alleles of a total of 39 rare and common non-synonymous alleles tested. Furthermore, heterozygosity for R98W led to a distinctive pattern of transcription in leukocytes following stimulation with BCG or Tw. Finally, we found that IRF4 had evolved under purifying selection and that R98W was not dominant-negative, suggesting that the IRF4 deficiency in this kindred was due to haploinsufficiency. Overall, haploinsufficiency at the IRF4 locus selectively underlies WD in this multiplex kindred. This deficiency displays AD inheritance with incomplete penetrance, and chronic carriage probably precedes WD by several decades in Tw-infected heterozygotes.


2020 ◽  
Author(s):  
Tristan J. Hayeck ◽  
Nicholas Stong ◽  
Evan Baugh ◽  
Ryan Dhindsa ◽  
Tychele N. Turner ◽  
...  

Genomic regions subject to purifying selection are more likely to carry disease-causing mutations. Cross-species conservation is often used to identify such regions but has limited resolution to detect selection on short evolutionary timescales such as that occurring in only one species. In contrast, intolerance looks for depletion of variation relative to expectation within a species, allowing species-specific features to be identified. When estimating the intolerance of noncoding sequence, methods strongly leverage variant frequency distributions. As the expected distributions depend on demography, if not properly controlled for, ancestral population source may obfuscate signals of selection. We demonstrate that properly incorporating demography in intolerance estimation greatly improved variant classification (13% increase in AUC relative to comparison constraint test, CDTS; and 9% relative to conservation). We provide a genome-wide intolerance map that is conditional on demographic history that is likely to be particularly valuable for variant prioritization.


Author(s):  
Ricardo Pereira ◽  
Thiago Lima ◽  
N Pierce ◽  
Lin Chao ◽  
Ronald Burton

Reproductive isolation is often achieved when genes that are neutral or beneficial in their genomic background become functionally incompatible in a foreign genome, causing inviability, sterility or low fitness in hybrids. Recent studies suggest that mitonuclear interactions are among the initial incompatibilities to evolve at early stages of population divergence across taxa. Yet, it is unclear whether mitonuclear incompatibilities involve few or many regions in the nuclear genome. We employ an experimental evolution approach starting with unfit F2 interpopulation hybrids of the copepod Tigriopus californicus, in which compatible and incompatible nuclear alleles compete in a fixed mitochondrial background. After about nine generations, we observe a generalized increase in population size and in survivorship, suggesting efficiency of selection against maladaptive phenotypes. Whole genome sequencing of evolved populations showed some consistent allele frequency changes across the three replicates of each reciprocal cross, but markedly different patterns between mitochondrial backgrounds. In only a few regions (~6.5% of the genome), the same parental allele was overrepresented irrespective of the mitochondrial background. About 33% of the genome shows allele frequency changes consistent with divergent selection, with the location of these genomic regions strongly differing between mitochondrial backgrounds. The dominant allele matches the mitochondrial background in 87 and 89% of these genomic regions, consistent with mitonuclear coadaptation. These results suggest that mitonuclear incompatibilities have a complex polygenic architecture that differs between populations, potentially generating genome-wide barriers to gene flow between closely related taxa.

