Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data

2016
Author(s): Yi-Fei Huang, Brad Gulko, Adam Siepel

Abstract: Across many species, a large fraction of genetic variants that influence phenotypes of interest is located outside of protein-coding genes, yet existing methods for identifying such variants have poor predictive power. Here, we introduce a new computational method, called LINSIGHT, that substantially improves the prediction of noncoding nucleotide sites at which mutations are likely to have deleterious fitness consequences, and which therefore are likely to be phenotypically important. LINSIGHT combines a simple neural network for functional genomic data with a probabilistic model of molecular evolution. The method is fast and highly scalable, enabling it to exploit the “Big Data” available in modern genomics. We show that LINSIGHT outperforms the best available methods in identifying human noncoding variants associated with inherited diseases. In addition, we apply LINSIGHT to an atlas of human enhancers and show that the fitness consequences at enhancers depend on cell-type, tissue specificity, and constraints at associated promoters.

2021
Author(s): Troy M LaPolice, Yi-Fei Huang

Being able to predict essential genes intolerant to loss-of-function (LOF) mutations can dramatically improve our ability to identify genes associated with genetic disorders. Numerous computational methods have recently been developed to predict human essential genes from population genomic data; however, the existing methods have limited power in pinpointing short essential genes due to the sparsity of polymorphisms in the human genome. Here we present an evolution-based deep learning model, DeepLOF, which integrates population and functional genomic data to improve gene essentiality prediction. Compared to previous methods, DeepLOF shows unmatched performance in predicting ClinGen haploinsufficient genes, mouse essential genes, and essential genes in human cell lines. Furthermore, DeepLOF discovers 109 potentially essential genes that are too short to be identified by previous methods. Altogether, DeepLOF is a powerful computational method to aid in the discovery of essential genes.


2020, Vol 13 (10), pp. 2821-2835
Author(s): Lei Chen, Jing‐Tao Sun, Peng‐Yu Jin, Ary A. Hoffmann, Xiao‐Li Bing, ...

2008, Vol 40 (7), pp. 854-861
Author(s): Jun Zhu, Bin Zhang, Erin N Smith, Becky Drees, Rachel B Brem, ...

Author(s): Jesper Svedberg, Vladimir Shchur, Solomon Reinman, Rasmus Nielsen, Russell Corbett-Detig

Abstract: Adaptive introgression, the flow of adaptive genetic variation between species or populations, has attracted significant interest in recent years and has been implicated in a number of cases of adaptation, from pesticide resistance and immunity to local adaptation. Despite this, methods for identifying adaptive introgression from population genomic data are lacking. Here, we present Ancestry_HMM-S, a hidden Markov model-based method for identifying genes undergoing adaptive introgression and quantifying the strength of selection acting on them. Through extensive validation, we show that this method performs well on moderately sized datasets for realistic population and selection parameters. We apply Ancestry_HMM-S to a dataset of an admixed Drosophila melanogaster population from South Africa and identify 17 loci that show signatures of adaptive introgression, four of which have previously been shown to confer resistance to insecticides. Ancestry_HMM-S provides a powerful method for inferring adaptive introgression in datasets that are typically collected when studying admixed populations. This method will enable powerful insights into the genetic consequences of admixture across diverse populations. Ancestry_HMM-S can be downloaded from https://github.com/jesvedberg/Ancestry_HMM-S/.


Genetics, 2017, Vol 206 (1), pp. 105-118
Author(s): Matthew S. Ackerman, Parul Johri, Ken Spitze, Sen Xu, Thomas G. Doak, ...

2020, Vol 107 (2), pp. 175-182
Author(s): Simon Easteal, Ruth M. Arkell, Renzo F. Balboa, Shayne A. Bellingham, Alex D. Brown, ...

2011, Vol 7 (6), pp. e1002073
Author(s): Nathan L. Nehrt, Wyatt T. Clark, Predrag Radivojac, Matthew W. Hahn

2018, Vol 19 (4), pp. 995-1005
Author(s): Violeta Muñoz-Fuentes, Pilar Cacheiro, Terrence F. Meehan, Juan Antonio Aguilar-Pimentel, ...

Abstract: The International Mouse Phenotyping Consortium (IMPC) is building a catalogue of mammalian gene function by producing and phenotyping a knockout mouse line for every protein-coding gene. To date, the IMPC has generated and characterised 5186 mutant lines. One-third of the lines have been found to be non-viable, and over 300 new mouse models of human disease have been identified thus far. While current bioinformatics efforts are focused on translating results to better understand human disease processes, IMPC data also aid in understanding genetic function and processes in other species. Here we show, using gorilla genomic data, how genes essential to development in mice can be used to help assess the potentially deleterious impact of gene variants in other species. This type of analysis could be used to select optimal breeders in endangered species to maintain or increase fitness and avoid variants associated with impaired-health phenotypes or loss-of-function mutations in genes of critical importance. We also show, using selected examples from various mammal species, how IMPC data can aid in the identification of candidate genes for studying a condition of interest, deliver information about the mechanisms involved, or support predictions for the function of genes that may play a role in adaptation. With genotyping costs decreasing and bioinformatics tools continuing to improve, the analyses we demonstrate can be routinely applied.

