gene prioritization
Recently Published Documents


TOTAL DOCUMENTS

179
(FIVE YEARS 41)

H-INDEX

25
(FIVE YEARS 4)

2021 ◽  
Vol 22 (S9) ◽  
Author(s):  
Yan Wang ◽  
Zuheng Xia ◽  
Jingjing Deng ◽  
Xianghua Xie ◽  
Maoguo Gong ◽  
...  

Abstract Background Gene prioritization (gene ranking) aims to obtain the centrality of genes, which is critical for cancer diagnosis and therapy since key genes correspond to biomarkers or drug targets. Great efforts have been devoted to the gene ranking problem by exploring the similarity between candidate and known disease-causing genes. However, when the number of known disease-causing genes is limited, these methods are largely inapplicable because of their low accuracy. In practice, the number of known disease-causing genes for cancers, particularly rare cancers, is very limited. Therefore, there is a critical need to design effective and efficient algorithms for gene ranking with limited prior disease-causing genes. Results In this study, we propose a transfer learning based algorithm for gene prioritization (called TLGP) in a cancer without known disease-causing genes (target domain) by transferring knowledge from other cancers (source domain). The underlying assumption is that knowledge shared by similar cancers improves the accuracy of gene prioritization. Specifically, TLGP first quantifies the similarity between the target and source domains by calculating the affinity matrix for genes. Then, TLGP automatically learns a fusion network for the target cancer by fusing the affinity matrix, pathogenic genes and genomic data of source cancers. Finally, genes in the target cancer are prioritized. The experimental results indicate that the learnt fusion network is more reliable than the gene co-expression network, implying that transferring knowledge from other cancers improves the accuracy of network construction. Moreover, TLGP outperforms state-of-the-art approaches, improving accuracy by at least 5%. Conclusion The proposed model and method provide an effective and efficient strategy for gene ranking by integrating genomic data from various cancers.
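The general idea of ranking genes over an affinity network, as mentioned in the abstract above, can be illustrated with a generic random walk with restart. This is a minimal sketch and not the TLGP algorithm: the function name `rank_genes`, the toy genes, and the weights are all hypothetical, and the transfer-learning and network-fusion steps are not modeled.

```python
def rank_genes(affinity, seeds, restart=0.3, iters=200):
    """Rank genes by random walk with restart over a symmetric affinity map.

    affinity: dict mapping each gene to a dict of neighbor -> weight
              (every gene must appear as a key; hypothetical toy format).
    seeds:    set of genes the walk restarts from.
    """
    genes = sorted(affinity)
    # Restart distribution: uniform over the seed genes.
    p0 = {g: (1.0 / len(seeds) if g in seeds else 0.0) for g in genes}
    # Total outgoing weight per gene, used to normalize affinities
    # into transition probabilities.
    out_total = {g: sum(affinity[g].values()) for g in genes}
    p = dict(p0)
    for _ in range(iters):
        # Each step: restart with probability `restart`, otherwise
        # move to a neighbor in proportion to the edge weight.
        nxt = {g: restart * p0[g] for g in genes}
        for src in genes:
            if out_total[src] == 0:
                continue  # isolated gene: its walk mass is dropped
            for dst, w in affinity[src].items():
                nxt[dst] += (1 - restart) * p[src] * w / out_total[src]
        p = nxt
    ranking = sorted(genes, key=lambda g: p[g], reverse=True)
    return ranking, p


# Toy symmetric affinity network (made-up genes and weights).
affinity = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"A": 0.9, "C": 0.2},
    "C": {"A": 0.1, "B": 0.2, "D": 0.8},
    "D": {"C": 0.8},
}
ranking, scores = rank_genes(affinity, seeds={"A"})
print(ranking)
```

Genes closer to the seed in the weighted network receive higher scores, which is the intuition behind similarity-based ranking; a method such as TLGP would replace the hand-written affinity map with a learned network.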


2021 ◽  
Vol 3 (3) ◽  
Author(s):  
Chengyao Peng ◽  
Simon Dieck ◽  
Alexander Schmid ◽  
Ashar Ahmad ◽  
Alexej Knaus ◽  
...  

Abstract Many rare syndromes can be well described and delineated from other disorders by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which are increasingly used in electronic health records (EHRs), too. Many algorithms that perform HPO-based gene prioritization have also been developed; however, the performance of many such tools suffers from an over-representation of atypical cases in the medical literature. This is certainly the case if the algorithm cannot handle features that occur with reduced frequency in a disorder. With Cada, we built a knowledge graph based on both case annotations and disorder annotations. Using network representation learning, we achieve gene prioritization by link prediction. Our results suggest that Cada exhibits superior performance particularly for patients that present with the pathognomonic findings of a disease. Additionally, information about the frequency of occurrence of a feature can readily be incorporated, when available. Crucial in the design of our approach is the use of the growing amount of phenotype–genotype information that diagnostic labs deposit in databases such as ClinVar. By this means, Cada is an ideal reference tool for differential diagnostics in rare disorders that can also be updated regularly.


PLoS Genetics ◽  
2021 ◽  
Vol 17 (4) ◽  
pp. e1009464
Author(s):  
Binglan Li ◽  
Yogasudha Veturi ◽  
Anurag Verma ◽  
Yuki Bradford ◽  
Eric S. Daar ◽  
...  

As a relatively new methodology, the transcriptome-wide association study (TWAS) has gained interest due to its capacity for gene-level association testing. However, the development of TWAS has outpaced statistical evaluation of TWAS gene prioritization performance. Current TWAS methods vary in underlying biological assumptions about tissue specificity of transcriptional regulatory mechanisms. In a previous study from our group, this may have affected whether TWAS methods better identified associations in single tissues versus multiple tissues. We therefore designed simulation analyses to examine how the interplay between particular TWAS methods and tissue specificity of gene expression affects power and type I error rates for gene prioritization. We found that cross-tissue identification of expression quantitative trait loci (eQTLs) improved TWAS power. Single-tissue TWAS (i.e., PrediXcan) had robust power to identify genes expressed in single tissues, but often found significant associations in the wrong tissues as well (and therefore had high false-positive rates). Cross-tissue TWAS (i.e., UTMOST) had overall equal or greater power and controlled type I error rates for genes expressed in multiple tissues. Based on these simulation results, we applied a tissue specificity-aware TWAS (TSA-TWAS) analytic framework to look for gene-based associations with pre-treatment laboratory values from AIDS Clinical Trial Group (ACTG) studies. We replicated several proof-of-concept transcriptionally regulated gene-trait associations, including UGT1A1 (encoding bilirubin uridine diphosphate glucuronosyltransferase enzyme) with total bilirubin levels (p = 3.59 × 10⁻¹²), and CETP (cholesteryl ester transfer protein) with high-density lipoprotein cholesterol (p = 4.49 × 10⁻¹²). We also identified several novel genes associated with metabolic and virologic traits, as well as pleiotropic genes that linked plasma viral load, absolute basophil count, and/or triglyceride levels.
By highlighting the advantages of different TWAS methods, our simulation study promotes a tissue specificity-aware TWAS analytic framework that revealed novel aspects of HIV-related traits.
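The two quantities evaluated in the simulations above, power and type I error rate, can be estimated from simulated p-values as simple proportions. The sketch below is only an illustration of those definitions, not the authors' simulation pipeline: the function name `power_and_type1`, the significance threshold, and the p-values are hypothetical.

```python
def power_and_type1(p_causal, p_null, alpha=0.05):
    """Estimate power and type I error rate from simulated p-values.

    p_causal: p-values for simulated gene-tissue pairs with a true effect.
    p_null:   p-values for simulated pairs with no effect.
    """
    # Power: fraction of truly associated pairs that reach significance.
    power = sum(p < alpha for p in p_causal) / len(p_causal)
    # Type I error: fraction of null pairs that are (falsely) significant.
    type1 = sum(p < alpha for p in p_null) / len(p_null)
    return power, type1


# Made-up p-values for illustration only.
p_causal = [0.001, 0.02, 0.2, 0.04]
p_null = [0.5, 0.03, 0.9, 0.7, 0.6]
power, type1 = power_and_type1(p_causal, p_null)
print(power, type1)
```

Under these definitions, a method that often finds significant associations in the wrong tissues would show up as a high type I error rate even when its power is good.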


2021 ◽  
Author(s):  
Chengyao Peng ◽  
Simon Dieck ◽  
Alexander Schmid ◽  
Ashar Ahmad ◽  
Alexej Knaus ◽  
...  

Abstract Many rare syndromes can be well described and delineated from other disorders by a combination of characteristic symptoms. These phenotypic features are best documented with terms of the Human Phenotype Ontology (HPO), which is increasingly used in electronic health records (EHRs), too. Many algorithms that perform HPO-based gene prioritization have also been developed; however, the performance of many such tools suffers from an overrepresentation of atypical cases in the medical literature. This is certainly the case if the algorithm cannot handle features that occur with reduced frequency in a disorder. With CADA, we built a knowledge graph that is based on case annotations and disorder annotations, and we show that CADA exhibits superior performance, particularly for patients who present with the pathognomonic findings of a disease. Crucial in the design of our approach is the use of the growing amount of phenotypic information that diagnostic labs deposit in databases such as ClinVar. By this means, CADA is an ideal reference tool for differential diagnostics in rare disorders that can also be updated regularly.


Author(s):  
Jesús Jaime Solano Noriega ◽  
Juan Carlos Leyva López ◽  
Fiona Browne ◽  
Jun Liu

2020 ◽  
Author(s):  
Kyoko Watanabe ◽  
Philip R. Jansen ◽  
Jeanne E. Savage ◽  
Priyanka Nandakumar ◽  
Xin Wang ◽  
...  

Abstract Insomnia is a heritable, highly prevalent sleep disorder, for which no sufficient treatment currently exists. Previous genome-wide association studies (GWASs) with up to 1.3 million subjects identified over 200 associated loci. This extreme polygenicity suggested that many more loci remain to be discovered. The current study almost doubled the sample size to over 2.3 million individuals, thereby increasing statistical power. We identified 554 risk loci (confirming 190 previously associated loci and detecting 364 novel loci), and capitalizing on this large number of loci, we propose a novel strategy to prioritize genes using external biological resources and information on functional interactions between genes across risk loci. Of all 3,898 genes naively implicated from the risk loci, we prioritize 289. For these, we find brain-tissue expression specificity and enrichment in specific gene sets of synaptic signaling functions and neuronal differentiation. We show that the novel gene prioritization strategy yields specific hypotheses on causal mechanisms underlying insomnia, which would not have been fully detected using traditional approaches.


Author(s):  
Luigi Donato ◽  
Simona Alibrandi ◽  
Rosalia D’Angelo ◽  
Concetta Scimone ◽  
Antonina Sidoti ◽  
...  
