Evolutionary conservation and disease gene association of the human genes composing pseudogenes

Gene ◽  
2012 ◽  
Vol 501 (2) ◽  
pp. 164-170 ◽  
Author(s):  
Kamalika Sen ◽  
Tapash Chandra Ghosh

2014 ◽  
Author(s):  
Sune Pletscher-Frankild ◽  
Albert Pallejà ◽  
Kalliopi Tsafou ◽  
Janos X Binder ◽  
Lars Juhl Jensen

Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease–gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource, which integrates the results from text mining with manually curated disease–gene associations, cancer mutation data, and genome-wide association studies from existing databases. The DISEASES resource is accessible through a user-friendly web interface at http://diseases.jensenlab.org/, where the text-mining software and all associations are also freely available for download.
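The co-occurrence idea described in this abstract can be illustrated with a minimal Python sketch. It is not the DISEASES scoring scheme: the toy dictionaries, the weights `W_SENTENCE` and `W_ABSTRACT`, and the function names are all assumptions made for illustration. The sketch only shows the general shape of dictionary-based tagging followed by a score that counts a pair more heavily when both names share a sentence than when they merely share an abstract.

```python
import re
from collections import defaultdict

# Hypothetical toy dictionaries; a real tagger would use large curated name lists.
GENES = {"BRCA1", "TP53"}
DISEASES = {"breast cancer", "leukemia"}

# Assumed weights, chosen only for illustration.
W_SENTENCE = 1.0   # gene and disease named in the same sentence
W_ABSTRACT = 0.25  # gene and disease named in the same abstract, different sentences


def tag(text, names):
    """Return the dictionary names that occur in text (case-insensitive, whole words)."""
    found = set()
    for name in names:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(name)
    return found


def score_abstracts(abstracts):
    """Sum weighted co-occurrence counts per (gene, disease) pair over all abstracts."""
    scores = defaultdict(float)
    for abstract in abstracts:
        # Naive sentence splitter: break after '.', '!' or '?' followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", abstract) if s]
        same_sentence = set()
        for sentence in sentences:
            for gene in tag(sentence, GENES):
                for disease in tag(sentence, DISEASES):
                    same_sentence.add((gene, disease))
        all_pairs = {
            (gene, disease)
            for gene in tag(abstract, GENES)
            for disease in tag(abstract, DISEASES)
        }
        # Each pair is counted once per abstract, at the stronger level that applies.
        for pair in all_pairs:
            scores[pair] += W_SENTENCE if pair in same_sentence else W_ABSTRACT
    return dict(scores)
```

Calling `score_abstracts` on a list of abstract strings returns a dictionary keyed by `(gene, disease)` tuples. A production system would also need normalization against how often each name occurs overall, which this sketch omits.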


2020 ◽  
Author(s):  
Helena B. Cooper ◽  
Paul P. Gardner

Abstract Proteins and non-coding RNAs are functional products of the genome that carry out the bulk of crucial cellular processes. With recent technological advances, researchers can sequence genomes in the thousands as well as probe for specific genomic activities of multiple species and conditions. These studies have identified thousands of potential proteins, RNAs and associated activities. However, there are conflicting conclusions on the functional implications depending upon the burden of evidence researchers use, leading to diverse interpretations of which regions of the genome are “functional”. Here we investigate the association between gene functionality and genomic features, by comparing established functional protein-coding and non-coding genes to non-genic regions of the genome. We find that the strongest and most consistent associations between functional genes and any genomic feature are with evolutionary conservation and transcriptional activity. Other strongly associated features include sequence alignment statistics, such as maximum between-site covariation. We have also identified some concerns with 1,000 Genomes Project and Genome Aggregation Database SNP densities, as short non-coding RNAs tend to have greater than expected SNP densities. Our results demonstrate the importance of evolutionary conservation and transcription for sequence functionality, which should both be taken into consideration when differentiating between functional sequences and noise.


Author(s):  
Sezin Kircali Ata ◽  
Min Wu ◽  
Yuan Fang ◽  
Le Ou-Yang ◽  
Chee Keong Kwoh ◽  
...  

Abstract Disease–gene association through genome-wide association study (GWAS) is an arduous task for researchers. Investigating single nucleotide polymorphisms that correlate with specific diseases requires statistical analysis of associations. Considering the huge number of possible mutations, in addition to its high cost, another important drawback of GWAS analysis is the large number of false positives. Thus, researchers search for more evidence to cross-check their results through different sources. To provide the researchers with alternative and complementary low-cost disease–gene association evidence, computational approaches come into play. Since molecular networks are able to capture complex interplay among molecules in diseases, they have become one of the most extensively used data sources for disease–gene association prediction. In this survey, we aim to provide a comprehensive and up-to-date review of network-based methods for disease gene prediction. We also conduct an empirical analysis on 14 state-of-the-art methods. To summarize, we first elucidate the task definition for disease gene prediction. Secondly, we categorize existing network-based efforts into network diffusion methods, traditional machine learning methods with handcrafted graph features and graph representation learning methods. Thirdly, an empirical analysis is conducted to evaluate the performance of the selected methods across seven diseases. We also provide distinguishing findings about the discussed methods based on our empirical analysis. Finally, we highlight potential research directions for future studies on disease gene prediction.
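Of the three method families named in this abstract, network diffusion is the simplest to illustrate. The sketch below shows random walk with restart, one common diffusion technique, on a toy gene network. It is not taken from the survey: the function name, the restart probability of 0.3, the iteration count, and the example graph are all assumptions, and the code assumes every node has at least one neighbour.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, iterations=100):
    """Score every node by its diffusion proximity to the seed (known disease) genes.

    adjacency: dict mapping each node to a list of its neighbours
               (assumed: undirected, every node has at least one neighbour).
    seeds:     set of nodes known to be associated with the disease.
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    nodes = sorted(adjacency)
    # Start distribution: probability mass spread evenly over the seed nodes.
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    current = dict(start)
    for _ in range(iterations):
        # Restart term: a fixed share of the mass returns to the seeds.
        updated = {n: restart * start[n] for n in nodes}
        # Walk term: the remaining mass flows evenly to each node's neighbours.
        for n in nodes:
            share = (1.0 - restart) * current[n] / len(adjacency[n])
            for neighbour in adjacency[n]:
                updated[neighbour] += share
        current = updated
    return current


# Toy path network A - B - C - D with A as the only known disease gene.
toy_network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}
toy_scores = random_walk_with_restart(toy_network, seeds={"A"})
```

Candidate genes are then ranked by score, so genes close to the seeds in the network rank above distant ones. In this toy path, B ranks above C, and C above D.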


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 160616-160626
Author(s):  
Misba Sikandar ◽  
Rafia Sohail ◽  
Yousaf Saeed ◽  
Asim Zeb ◽  
Mahdi Zareei ◽  
...  

Biosystems ◽  
2020 ◽  
Vol 193-194 ◽  
pp. 104133
Author(s):  
Tyler K. Collins ◽  
Sheridan Houghten
