Dissecting supergene: evidence for recombination suppression among multiple functional loci within inversions.

2021 ◽  
Author(s):  
Paul Jay ◽  
Manon Leroy ◽  
Yann Le Poul ◽  
Annabel Whibley ◽  
Monica Arias ◽  
...  

Supergenes are genetic architectures associated with discrete and concerted variation in multiple traits. It has long been suggested that supergenes control these complex polymorphisms by suppressing recombination between sets of coadapted genes. However, because recombination suppression hinders the dissociation of the individual effects of genes within supergenes, there is still little evidence that supergenes evolve by tightening linkage between coadapted genes. Here, combining a landmark-free phenotyping algorithm with multivariate genome-wide association studies, we dissected the genetic basis of wing pattern variation in the butterfly Heliconius numata. We showed that the supergene controlling the striking wing-pattern polymorphism displayed by this species contains many independent loci associated with different features of wing patterns. The three chromosomal inversions of this supergene suppress recombination between these loci, supporting the hypothesis that they may have evolved because they captured beneficial combinations of alleles. Some of these loci are associated with colour variations only in morphs controlled by inversions, indicating that they were recruited after the formation of these inversions. Our study shows that supergenes and clusters of adaptive loci in general may form via the evolution of chromosomal rearrangements suppressing recombination between co-adapted loci but also via the subsequent recruitment of linked adaptive mutations.

2019 ◽  
Author(s):  
Paul Jay ◽  
Mathieu Chouteau ◽  
Annabel Whibley ◽  
Héloïse Bastide ◽  
Violaine Llaurens ◽  
...  

While natural selection favours the fittest genotype, polymorphisms are maintained over evolutionary timescales in numerous species. Why these long-lived polymorphisms are often associated with chromosomal rearrangements remains obscure. Combining genome assemblies, population genomic analyses, and fitness assays, we studied the factors maintaining multiple mimetic morphs in the butterfly Heliconius numata. We show that the polymorphism is maintained because three chromosomal inversions controlling wing patterns express a recessive mutational load, which prevents their fixation despite their ecological advantage. Since inversions suppress recombination and hamper genetic purging, their formation fostered the capture and accumulation of deleterious variants. This suggests that many complex polymorphisms, instead of representing adaptations to the existence of alternative ecological optima, could be maintained primarily because chromosomal rearrangements are prone to carrying recessive harmful mutations.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Camilo Broc ◽  
Therese Truong ◽  
Benoit Liquet

Abstract
Background: The increasing number of genome-wide association studies (GWAS) has revealed several loci that are associated with multiple distinct phenotypes, suggesting the existence of pleiotropic effects. Highlighting these cross-phenotype genetic associations could help to identify and understand common biological mechanisms underlying some diseases. Common approaches test the association between genetic variants and multiple traits at the SNP level. In this paper, we propose a novel gene- and pathway-level approach for the case where several independent GWAS on independent traits are available. The method is based on a generalization of the sparse group Partial Least Squares (sgPLS) to take into account groups of variables, and a Lasso penalization that links all independent data sets. This method, called joint-sgPLS, is able to convincingly detect signal at the variable level and at the group level.
Results: Our method has the advantage of offering a global, readable model while coping with the architecture of the data. It can outperform traditional methods and provides a wider insight in terms of a priori information. We compared the performance of the proposed method to other benchmark methods on simulated data and gave an example of application on real data with the aim of highlighting common susceptibility variants to breast and thyroid cancers.
Conclusion: The joint-sgPLS shows interesting properties for detecting a signal. As an extension of the PLS, the method is suited for data with a large number of variables. The choice of Lasso penalization copes with architectures of groups of variables and observation sets. Furthermore, although the method has been applied to a genetic study, its formulation is adapted to any data with a high number of variables and an explicit a priori architecture in other application fields.
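As an illustration of the Lasso-type penalization mentioned in this abstract, the following is a minimal sketch of the two shrinkage operators commonly associated with Lasso and group-Lasso penalties. It is not the joint-sgPLS algorithm itself; the function names, the example weights, and the penalty value are assumptions made for illustration only.

```python
import math

def soft_threshold(value, lam):
    """Lasso-type shrinkage: pull one weight toward zero by lam, clipping at zero."""
    magnitude = abs(value) - lam
    if magnitude <= 0.0:
        return 0.0
    return math.copysign(magnitude, value)

def group_soft_threshold(weights, lam):
    """Group-Lasso-type shrinkage: shrink a whole group of weights together."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= lam:
        # The whole group (e.g. all SNPs of one gene) is dropped at once.
        return [0.0 for _ in weights]
    scale = 1.0 - lam / norm
    return [scale * w for w in weights]

# Hypothetical loading weights for two gene-level groups of SNPs.
strong_gene = [3.0, 4.0]   # Euclidean norm 5.0, survives a penalty of 1.0
weak_gene = [0.3, -0.4]    # Euclidean norm 0.5, removed by a penalty of 1.0

print(group_soft_threshold(strong_gene, 1.0))
print(group_soft_threshold(weak_gene, 1.0))
print([soft_threshold(w, 1.0) for w in strong_gene])
```

The group operator removes or keeps all variables of a group together, which is one way a penalty can act at the gene or pathway level, while the element-wise operator selects individual variables.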


2020 ◽  
Author(s):  
Shelly Lazar ◽  
Manas Ranjan Prusty ◽  
Khaled Bishara ◽  
Amir Sherman ◽  
Eyal Fridman

Abstract: Genetic loci underlying variation in traits with agronomic importance or genetic risk factors in human diseases have been identified by linkage analysis and genome-wide association studies. However, narrowing down the mapping to the individual causal genes and variations within these is much more challenging, and so is the ability to break linkage drag between beneficial and unfavourable loci in crop breeding. We developed RECAS9 as a transgene-free approach for precisely targeting recombination events by delivering CRISPR/Cas9 ribonucleoprotein (RNP) complex into heterozygous mitotic cells for the barley (Hordeum vulgare) Heat3.1 locus. A wild species (H. spontaneum) introgression in this region carries the agronomically unfavourable tough rachis phenotype (non-brittle) allele linked with a circadian clock accelerating QTL near the GIGANTEA gene. We delivered RNP, which was targeted between two single nucleotide polymorphisms (SNPs), to mitotic calli cells by particle bombardment. We estimated recombination events by next generation sequencing (NGS) and droplet digital PCR (ddPCR). While NGS analysis suffered from confounding effects of PCR recombination, ddPCR analysis allowed us to associate RNP treatment on heterozygous individuals with a significant increase in homology-directed repair (HDR) between cultivated and wild alleles, with recombination rates ranging from zero to 57%. These results show for the first time in plants a directed and transgene-free mitotic recombination driven by Cas9 RNP, and provide a starting point for precise breeding and fine-scale mapping of beneficial alleles from crop wild relatives.


Author(s):  
Fadhaa Ali ◽  
Jian Zhang

Abstract: Multilocus haplotype analysis of candidate variants with genome-wide association studies (GWAS) data may provide evidence of association with disease, even when the individual loci themselves do not. Unfortunately, when a large number of candidate variants are investigated, identifying risk haplotypes can be very difficult. To meet the challenge, a number of approaches have been put forward in recent years. However, most of them are not directly linked to the disease penetrances of haplotypes and thus may not be efficient. To fill this gap, we propose a mixture model-based approach for detecting risk haplotypes. Under the mixture model, haplotypes are clustered directly according to their estimated disease penetrances. A theoretical justification of the above model is provided. Furthermore, we introduce a hypothesis test for the haplotype inheritance patterns which underpin this model. The performance of the proposed approach is evaluated by simulations and real data analysis. The results show that the proposed approach outperforms an existing multiple testing method.


2015 ◽  
Author(s):  
Oriol Canela-Xandri ◽  
Konrad Rawlik ◽  
John A. Woolliams ◽  
Albert Tenesa

Genome-wide association studies (GWAS) promised to translate their findings into clinically beneficial improvements of patient management by tailoring disease management to the individual through the prediction of disease risk. However, the ability to translate genetic findings from GWAS into predictive tools that are of clinical utility and which may inform clinical practice has, so far, been encouraging but limited. Here we propose to use a more powerful statistical approach that enables the prediction of multiple medically relevant phenotypes without the costs associated with developing a genetic test for each of them. As a proof of principle, we used a common panel of 319,038 SNPs to train the prediction models in 114,264 unrelated White-British individuals for height and four obesity-related traits (body mass index, basal metabolic rate, body fat percentage, and waist-to-hip ratio). We obtained prediction accuracies that ranged between 46% and 75% of the maximum achievable given their explained heritable component. This represents an improvement of up to 75% over the phenotypic variance explained by the predictors developed through large collaborations, which used more than twice as many training samples. Across-population predictions in White non-British individuals were similar to those of White-British individuals, whilst those in Asian and Black individuals were informative but less accurate. The genotyping of circa 500,000 UK Biobank participants will yield predictions ranging between 66% and 83% of the maximum. We anticipate that our models and a common panel of genetic markers, which can be used across multiple traits and diseases, will be the starting point to tailor disease management to the individual. Ultimately, we will be able to capitalise on whole-genome sequence and environmental risk factors to realise the full potential of genomic medicine.


2018 ◽  
Author(s):  
Mashaal Sohail ◽  
Robert M. Maier ◽  
Andrea Ganna ◽  
Alex Bloemendal ◽  
Alicia R. Martin ◽  
...  

Abstract: Genetic predictions of height differ among human populations and these differences are too large to be explained by genetic drift. This observation has been interpreted as evidence of polygenic adaptation. Differences across populations were detected using SNPs genome-wide significantly associated with height, and many studies also found that the signals grew stronger when large numbers of subsignificant SNPs were analyzed. This has led to excitement about the prospect of analyzing large fractions of the genome to detect subtle signals of selection and claims of polygenic adaptation for multiple traits. Polygenic adaptation studies of height have been based on SNP effect size measurements in the GIANT Consortium meta-analysis. Here we repeat the height analyses in the UK Biobank, a much more homogeneously designed study. Our results show that polygenic adaptation signals based on large numbers of SNPs below genome-wide significance are extremely sensitive to biases due to uncorrected population structure.


2019 ◽  
Author(s):  
Michael C. Turchin ◽  
Matthew Stephens

Abstract: Genome-wide association studies (GWAS) have now been conducted for hundreds of phenotypes of relevance to human health. Many such GWAS involve multiple closely-related phenotypes collected on the same samples. However, the vast majority of these GWAS have been analyzed using simple univariate analyses, which consider one phenotype at a time. This is despite the fact that, at least in simulation experiments, multivariate analyses have been shown to be more powerful at detecting associations. Here, we conduct multivariate association analyses on 13 different publicly-available GWAS datasets that involve multiple closely-related phenotypes. These data include large studies of anthropometric traits (GIANT), plasma lipid traits (GlobalLipids), and red blood cell traits (HaemgenRBC). Our analyses identify many new associations (433 in total across the 13 studies), many of which replicate when follow-up samples are available. Overall, our results demonstrate that multivariate analyses can help make more effective use of data from both existing and future GWAS.

Author Summary: Genome-wide association studies (GWAS) have become a common and powerful tool for identifying significant correlations between markers of genetic variation and physical traits of interest. Often these studies are conducted by comparing genetic variation against single traits one at a time (‘univariate’); however, it has previously been shown that it is possible to increase the power to detect significant associations by comparing genetic variation against multiple traits simultaneously (‘multivariate’). Despite this apparent increase in power, researchers still rarely conduct multivariate GWAS, even when studies have multiple traits readily available. Here, we reanalyze 13 previously published GWAS using a multivariate method and find >400 additional associations. Our method makes use of univariate GWAS summary statistics and is available as a software package, thus making it accessible to other researchers interested in conducting the same analyses. We also show, using studies that have multiple releases, that our new associations have high rates of replication. Overall, we argue that multivariate approaches in GWAS should no longer be overlooked and that there is often low-hanging fruit, in the form of new associations, to be gained by running these methods on data already collected.
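To make the univariate versus multivariate contrast concrete, here is a minimal sketch of a textbook joint test on two univariate z-scores for the same SNP. It is not the method or software package described in this abstract; the z-scores and the null correlation `r` between the two test statistics are hypothetical values, and `r` is assumed to be known.

```python
import math

def univariate_p(z):
    """Two-sided p-value for a single z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def bivariate_test(z1, z2, r):
    """Joint 2-df chi-square test of two z-scores whose null correlation is r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    # Quadratic form z' R^{-1} z for the 2x2 correlation matrix R = [[1, r], [r, 1]].
    stat = (z1 * z1 - 2.0 * r * z1 * z2 + z2 * z2) / (1.0 - r * r)
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical summary statistics: modest, opposite-signed effects on two
# positively correlated traits.
z_trait_a, z_trait_b, r = 2.5, -2.0, 0.5

stat, p_joint = bivariate_test(z_trait_a, z_trait_b, r)
print("univariate p-values:", univariate_p(z_trait_a), univariate_p(z_trait_b))
print("joint statistic and p-value:", stat, p_joint)
```

In this made-up case neither trait alone gives a small p-value, but the joint test does, because effects that run against the correlation between the traits are unlikely under the null.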


2018 ◽  
Author(s):  
Ping Zeng ◽  
Xinjie Hao ◽  
Xiang Zhou

Abstract
Motivation: Genome-wide association studies (GWASs) have identified many genetic loci associated with complex traits. A substantial fraction of these identified loci are associated with multiple traits, a phenomenon known as pleiotropy. Identification of pleiotropic associations can help characterize the genetic relationship among complex traits and can facilitate our understanding of disease etiology. Effective pleiotropic association mapping requires the development of statistical methods that can jointly model multiple traits with genome-wide SNPs together.
Results: We develop a joint modeling method, which we refer to as the integrative MApping of Pleiotropic association (iMAP). iMAP models summary statistics from GWASs, uses a multivariate Gaussian distribution to account for phenotypic correlation, simultaneously infers genome-wide SNP association patterns using mixture modeling, and has the potential to reveal causal relationships between traits. Importantly, iMAP integrates a large number of SNP functional annotations to substantially improve association mapping power, and, with a sparsity-inducing penalty, is capable of selecting informative annotations from a large, potentially noninformative set. To enable scalable inference of iMAP to association studies with hundreds of thousands of individuals and millions of SNPs, we develop an efficient expectation maximization algorithm based on an approximate penalized regression algorithm. With simulations and comparisons to existing methods, we illustrate the benefits of iMAP both in terms of high association mapping power and in terms of accurate estimation of genome-wide SNP association patterns. Finally, we apply iMAP to perform a joint analysis of 48 traits from 31 GWAS consortia together with 40 tissue-specific SNP annotations generated from the Roadmap Project. iMAP is freely available at www.xzlab.org/software.html.


2016 ◽  
Author(s):  
Francesco Paolo Casale ◽  
Danilo Horta ◽  
Barbara Rakitsch ◽  
Oliver Stegle

Abstract: Joint genetic models for multiple traits have helped to enhance association analyses. Most existing multi-trait models have been designed to increase power for detecting associations, whereas the analysis of interactions has received considerably less attention. Here, we propose iSet, a method based on linear mixed models to test for interactions between sets of variants and environmental states or other contexts. Our model generalizes previous interaction tests and in particular provides a test for local differences in the genetic architecture between contexts. We first use simulations to validate iSet before applying the model to the analysis of genotype-environment interactions in an eQTL study. Our model retrieves a larger number of interactions than alternative methods and reveals that up to 20% of cases show context-specific configurations of causal variants. Finally, we apply iSet to test for sub-group specific genetic effects in human lipid levels in a large human cohort, where we identify a gene-sex interaction for C-reactive protein that is missed by alternative methods.

Author summary: Genetic effects on phenotypes can depend on external contexts, including environment. Statistical tests for identifying such interactions are important to understand how individual genetic variants may act in different contexts. Interaction effects can either be studied using measurements of a given phenotype in different contexts, under the same genetic backgrounds, or by stratifying a population into subgroups. Here, we derive a method based on linear mixed models that can be applied to both of these designs. iSet enables testing for interactions between context and sets of variants, and accounts for polygenic effects. We validate our model using simulations, before applying it to the genetic analysis of gene expression studies and genome-wide association studies of human blood lipid levels. We find that modeling interactions with variant sets offers increased power, thereby uncovering interactions that cannot be detected by alternative methods.

