Genome-wide inferring gene–phenotype relationship by walking on the heterogeneous network

2010 ◽  
Vol 26 (9) ◽  
pp. 1219-1224 ◽  
Author(s):  
Yongjin Li ◽  
Jagdish C. Patra

Abstract Motivation: Clinical diseases are characterized by distinct phenotypes. To identify disease genes is to elucidate the gene–phenotype relationships. Mutations in functionally related genes may result in similar phenotypes, so it is reasonable to predict disease-causing genes by integrating phenotypic data and genomic data. Some genetic diseases are genetically or phenotypically similar and may share common pathogenetic mechanisms. Identifying the relationships between diseases will facilitate a better understanding of their pathogenetic mechanisms. Results: In this article, we constructed a heterogeneous network by connecting the gene network and phenotype network using the phenotype–gene relationship information from the OMIM database. We extended the random walk with restart algorithm to the heterogeneous network. The algorithm prioritizes the genes and phenotypes simultaneously. We used leave-one-out cross-validation to evaluate its ability to recover gene–phenotype relationships. The results showed improved performance over previous work. We also used the algorithm to disclose hidden disease associations that cannot be found by the gene network or the phenotype network alone. We identified 18 hidden disease associations, most of which were supported by literature evidence. Availability: The MATLAB code of the program is available at http://www3.ntu.edu.sg/home/aspatra/research/Yongjin_BI2010.zip Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
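
The random walk with restart on a merged gene–phenotype network can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: it assumes the merged adjacency matrix is simply column-normalized, and the function names, the toy network, and the restart probability of 0.7 are all hypothetical choices made for the example.

```python
def build_heterogeneous_adjacency(gene_adj, pheno_adj, gene_pheno):
    """Stack gene, phenotype, and bipartite blocks into one symmetric matrix."""
    n_g, n_p = len(gene_adj), len(pheno_adj)
    size = n_g + n_p
    adj = [[0.0] * size for _ in range(size)]
    for i in range(n_g):
        for j in range(n_g):
            adj[i][j] = gene_adj[i][j]
    for i in range(n_p):
        for j in range(n_p):
            adj[n_g + i][n_g + j] = pheno_adj[i][j]
    for g in range(n_g):
        for p in range(n_p):
            adj[g][n_g + p] = adj[n_g + p][g] = gene_pheno[g][p]
    return adj


def column_normalize(matrix):
    """Scale each column to sum to 1 (all-zero columns stay zero)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stops changing."""
    n = len(adjacency)
    w = column_normalize(adjacency)
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:
            break
    return p


# Toy data (hypothetical): genes g0-g1-g2 form a chain, phenotypes p0 and p1
# are similar, and only g0 has a known association (with p0).
gene_adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
pheno_adj = [[0, 1], [1, 0]]
gene_pheno = [[1, 0], [0, 0], [0, 0]]

adjacency = build_heterogeneous_adjacency(gene_adj, pheno_adj, gene_pheno)
scores = random_walk_with_restart(adjacency, seeds=[4])  # index 4 = phenotype p1
gene_ranking = sorted(range(3), key=lambda g: scores[g], reverse=True)
```

In this toy case the query phenotype p1 has no known gene, yet the walk still ranks g0 first because p1 is similar to p0, which is associated with g0. Because the walk runs over one merged matrix, genes and phenotypes receive scores in the same pass.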

2018 ◽  
Vol 19 (11) ◽  
pp. 3410 ◽  
Author(s):  
Xiujuan Lei ◽  
Zengqiang Fang ◽  
Luonan Chen ◽  
Fang-Xiang Wu

CircRNAs have a particular biological structure and have proven to play important roles in diseases. It is time-consuming and costly to identify circRNA-disease associations by biological experiments. Therefore, it is appealing to develop computational methods for predicting circRNA-disease associations. In this study, we propose a new computational path weighted method for predicting circRNA-disease associations. Firstly, we calculate the functional similarity scores of diseases based on disease-related gene annotations and the semantic similarity scores of circRNAs based on circRNA-related gene ontology, respectively. To address missing similarity scores of diseases and circRNAs, we calculate the Gaussian Interaction Profile (GIP) kernel similarity scores for diseases and circRNAs, respectively, based on the circRNA-disease associations downloaded from the circR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). Then, we integrate disease functional similarity scores and circRNA semantic similarity scores with their related GIP kernel similarity scores to construct a heterogeneous network made up of three sub-networks: a disease similarity network, a circRNA similarity network and a circRNA-disease association network. Finally, we compute an association score for each circRNA-disease pair based on the paths connecting them in the heterogeneous network to determine whether this circRNA-disease pair is associated. We adopt leave-one-out cross-validation (LOOCV) and five-fold cross-validation to evaluate the performance of our proposed method. In addition, three common diseases, Breast Cancer, Gastric Cancer and Colorectal Cancer, are used for case studies. Experimental results illustrate the reliability and usefulness of our computational method in terms of different validation measures, which indicates that the proposed method, PWCDA, can effectively predict potential circRNA-disease associations.
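
The GIP kernel step can be illustrated with a short sketch. It uses the commonly cited definition, K(i, j) = exp(-γ‖IP(i) − IP(j)‖²), with γ set to a bandwidth divided by the mean squared norm of the interaction profiles. The abstract does not state the bandwidth, so the value 1.0 is an assumption, and the association matrix is made-up toy data.

```python
import math


def gip_kernel(profiles, bandwidth=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Each row is the binary association profile of one entity. The kernel
    width is normalized by the mean squared norm of the profiles.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = bandwidth / mean_sq_norm
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(profiles[i], profiles[j])))
             for j in range(n)]
            for i in range(n)]


# Toy association matrix (hypothetical): rows are circRNAs, columns are diseases.
assoc = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]

circ_sim = gip_kernel(assoc)                     # circRNA-by-circRNA similarity
disease_sim = gip_kernel(list(zip(*assoc)))      # disease-by-disease similarity
```

Here circRNAs 0 and 1 share a disease and therefore score as more similar than circRNAs 0 and 2, which share none. Because the kernel depends only on the known associations, it can stand in where a functional or semantic similarity score is missing.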


2010 ◽  
Vol 7 (2) ◽  
Author(s):  
Jeffrey Q. Jiang ◽  
Andreas W. M. Dress ◽  
Ming Chen

Summary: Empirical clinical studies on the human interactome and phenome not only illustrate prevalent phenotypic and genetic overlap between diseases, but also reveal a modular organization of the genetic landscape of human disease, providing new opportunities to reduce the complexity of dissecting phenotype-genotype associations. Here we introduce a network-module based method for phenotype-genotype association inference and disease gene identification. This approach incorporates a protein-protein interaction network, a phenotype similarity network and known phenotype-genotype associations into an assembled network. We then decompose the resulting network into modules (or communities), within which we identify and prioritize the disease genes from the candidates within the loci associated with the query disease, using a linear regression model and a concordance score. For the known phenotype-gene associations in the OMIM database, we used leave-one-out validation to evaluate the feasibility of our method, and successfully ranked the known disease gene first in 887 out of 1807 cases. Moreover, applying this approach to 850 OMIM loci characterized by an unknown molecular basis, we propose high-probability candidates for 81 genetic diseases.


2020 ◽  
Vol 36 (14) ◽  
pp. 4214-4216
Author(s):  
Mario Failli ◽  
Jussi Paananen ◽  
Vittorio Fortino

Abstract Summary: Estimating the efficacy of gene–target-disease associations is a fundamental step in drug discovery. An important data source for this laborious task is RNA expression, which can provide gene–disease associations on the basis of expression fold change and statistical significance. However, simply using the log-fold change can lead to numerous false-positive associations. On the other hand, more sophisticated methods that utilize gene co-expression networks do not consider tissue specificity. Here, we introduce Transcriptome-driven Efficacy estimates for gene-based TArget discovery (ThETA), an R package that enables non-expert users to use novel efficacy scoring methods for drug–target discovery. In particular, ThETA allows users to search for gene perturbations (therapeutics) that reverse disease-gene expression and for genes that are closely related to disease-genes in tissue-specific networks. ThETA also provides functions to integrate efficacy evaluations obtained with different approaches and to build an overall efficacy score, which can be used to identify and prioritize gene(target)–disease associations. Finally, ThETA implements visualizations to show tissue-specific interconnections between target and disease-genes, and to indicate biological annotations associated with the top selected genes. Availability and implementation: ThETA is freely available for academic use at https://github.com/vittoriofortino84/ThETA. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (22) ◽  
pp. 4730-4738 ◽  
Author(s):  
Yan Zhao ◽  
Xing Chen ◽  
Jun Yin

Abstract Motivation: Recent studies have shown that microRNAs (miRNAs) play a critical part in several biological processes and that dysregulation of miRNAs is related to numerous complex human diseases. Thus, in-depth research on miRNAs and their association with human diseases can help us to solve many problems. Results: Due to the high cost of traditional experimental methods, revealing disease-related miRNAs through computational models is a more economical and efficient way. Considering the disadvantages of previous models, in this paper, we developed adaptive boosting for miRNA-disease association prediction (ABMDA) to predict potential associations between diseases and miRNAs. We balanced the positive and negative samples by performing random sampling based on k-means clustering on the negative samples, a process that was quick and easy, and our model had higher efficiency and scalability for large datasets than previous methods. As a boosting technique, ABMDA was able to improve the accuracy of a given learning algorithm by integrating weak classifiers that could score samples to form a strong classifier based on corresponding weights. Here, we used a decision tree as our weak classifier. As a result, the area under the curve (AUC) of global and local leave-one-out cross validation reached 0.9170 and 0.8220, respectively. Moreover, the mean and the standard deviation of the AUCs were 0.9023 and 0.0016, respectively, in 5-fold cross validation. In addition, in the case studies of three important human cancers, 49, 50 and 50 out of the top 50 predicted miRNAs for colon neoplasms, hepatocellular carcinoma and breast neoplasms were confirmed by the databases and experimental literature. Availability and implementation: The code and dataset of ABMDA are freely available at https://github.com/githubcode007/ABMDA. Supplementary information: Supplementary data are available at Bioinformatics online.
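
The idea of combining weighted weak classifiers into a strong one can be sketched with a small, self-contained AdaBoost. This is an illustration only, not the ABMDA code: it uses depth-1 decision stumps as the weak learners, omits the k-means-based negative sampling, and runs on a single hypothetical feature where +1 stands for a known association and -1 for a sampled negative.

```python
import math


def fit_stump(X, y, weights):
    """Find the single-feature threshold rule with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[feature] <= threshold else -polarity
                         for row in X]
                error = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                if best is None or error < best[0]:
                    best = (error, feature, threshold, polarity)
    return best


def stump_predict(stump, row):
    _, feature, threshold, polarity = stump
    return polarity if row[feature] <= threshold else -polarity


def fit_adaboost(X, y, rounds=3):
    """Discrete AdaBoost: reweight samples so later stumps focus on mistakes."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, weights)
        error = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - error) / error)   # vote of this stump
        ensemble.append((alpha, stump))
        weights = [w * math.exp(-alpha * t * stump_predict(stump, row))
                   for w, row, t in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def ensemble_score(ensemble, row):
    """Weighted vote of all stumps; the sign gives the predicted label."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)


# Toy data (hypothetical): no single threshold separates the two classes.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1, -1, -1, 1]

model = fit_adaboost(X, y, rounds=3)
predictions = [1 if ensemble_score(model, row) > 0 else -1 for row in X]
```

No single stump can classify this toy set, but the weighted vote of three stumps does, which is the effect the abstract attributes to boosting. The continuous value returned by `ensemble_score` can also be used to rank candidate pairs rather than only to label them.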


Author(s):  
Pei-Quan WANG ◽  
Jing LI ◽  
Li-Li ZHANG ◽  
Hong-Chun LV ◽  
Su-Hua ZHANG

Background: The study aimed to detect critical metabolites in acute lung injury (ALI). Methods: Microarray profiles of patients with sepsis-induced ALI were compared with those of sepsis patients using bioinformatic tools, by constructing a multi-omics network. Multi-omics composite networks (gene network, metabolite network, phenotype network, gene-metabolite association network, phenotype-gene association network, and phenotype-metabolite association network) were constructed, followed by integration of these composite networks to establish a heterogeneous network. Next, seed genes and the ALI phenotype were mapped into the heterogeneous network to obtain a weighted composite network. Random walk with restart (RWR) was applied to the weighted composite network to extract and prioritize the metabolites. On the basis of the distance proximity among metabolites, the top 50 metabolites with the highest proximity were identified, and the top 100 co-expressed genes that interacted with the top 50 metabolites were also screened out. Results: In total, there were 9363 nodes and 10,226,148 edges in the integrated composite network. There were 4 metabolites with scores > 0.009, including CHITIN, Tretinoin, sodium ion, and Celebrex. Adenosine 5'-diphosphate, triphosadenine, and tretinoin had higher degrees in the composite network and the co-expressed network. Conclusion: Adenosine 5'-diphosphate, triphosadenine, and tretinoin may be potential biomarkers for diagnosis and treatment of ALI.


Cells ◽  
2019 ◽  
Vol 8 (9) ◽  
pp. 1012 ◽  
Author(s):  
Xuan ◽  
Pan ◽  
Zhang ◽  
Liu ◽  
Sun

Aberrant expression of long non-coding RNAs (lncRNAs) is often associated with diseases, and identification of disease-related lncRNAs is helpful for elucidating complex pathogenesis. Recent methods for predicting associations between lncRNAs and diseases integrate their pertinent heterogeneous data. However, they fail to deeply integrate the topological information of the heterogeneous network comprising lncRNAs, diseases, and miRNAs. We proposed a novel method based on the graph convolutional network and convolutional neural network, referred to as GCNLDA, to infer disease-related lncRNA candidates. First, the heterogeneous network containing the lncRNA, disease, and miRNA nodes is constructed. The embedding matrix of a lncRNA-disease node pair was constructed according to various biological premises about lncRNAs, diseases, and miRNAs. A new framework based on a graph convolutional network and a convolutional neural network was developed to learn network and local representations of the lncRNA-disease pair. On the left side of the framework, the autoencoder based on graph convolution deeply integrated topological information within the heterogeneous lncRNA-disease-miRNA network. Moreover, as different node features have discriminative contributions to the association prediction, an attention mechanism at the node feature level is constructed. The left side learnt the network representation of the lncRNA-disease pair. The convolutional neural networks on the right side of the framework learnt the local representation of the lncRNA-disease pair by focusing on the similarities, associations, and interactions that are only related to the pair. Compared to several state-of-the-art prediction methods, GCNLDA had superior performance. Case studies on stomach cancer, osteosarcoma, and lung cancer confirmed that GCNLDA effectively discovers the potential lncRNA-disease associations.
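
A single graph convolution step, the building block that a graph-convolutional autoencoder of this kind stacks, can be sketched as below. This is not the GCNLDA architecture: it shows only the widely used propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W), with no attention mechanism or decoder. The three-node network, the feature matrix, and the fixed weights are hypothetical; in a real model the weights would be learned.

```python
import math


def normalized_adjacency(adj):
    """Add self-loops, then apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One propagation step: mix each node with its neighbours, then ReLU."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Toy heterogeneous network (hypothetical): node 0 is a lncRNA linked to
# node 1 (a disease) and node 2 (a miRNA).
adj = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # one-hot inputs
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]                  # fixed, not learned

embeddings = gcn_layer(adj, features, weights)  # one 2-dimensional row per node
```

After one step, each node's embedding already blends its own features with those of its direct neighbours. Stacking further layers would extend the blending to more distant nodes.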


2011 ◽  
Vol 19 (04) ◽  
pp. 607-616
Author(s):  
YUANYUAN ZHANG ◽  
SHUDONG WANG ◽  
MEIXI YANG ◽  
DASHUN XU ◽  
DAZHI MENG

With the rapid growth of microarray data, revealing the complex behaviors and functions of living systems by studying the relationships among genes has become a hot topic. In reverse network modeling, relationships with low relevance are generally discarded by applying a threshold when the relationships among genes are mined. However, there has so far been no effective method for determining this threshold. It is worth noting that the phenotypes of genetic diseases are generally regarded as the external representation of the functions of genes. Therefore, two types of phenotype networks are constructed from the gene and disease views, respectively, and by comparing these two types of phenotype networks, the threshold of the gene network corresponding to a certain disease can be determined as the value at which their similarity reaches its maximum. Because the gene network is determined based on the relationships among phenotypes, and phenotypes are the external representation of the functions of genes, relationships in the gene network may reflect functional relationships among genes in the biological system. In this work, the thresholds 0.47 and 0.48 of the gene network are determined based on Parkinson disease phenotypes. Furthermore, the validity of these thresholds is verified by the specificity and susceptibility of the phenotype networks. Also, by comparing the structural parameters of gene networks for the normal and disease stages at different thresholds, a significant difference between the two gene networks at threshold 0.47 or 0.48 is found. This significant difference in structural parameters further verifies the efficiency of the method.
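
Choosing a threshold by maximizing the similarity between two networks can be sketched as a simple scan. The abstract does not say how the similarity is measured or how the phenotype networks are derived, so the Jaccard index over edge sets, the candidate thresholds, and all weights below are assumptions made purely for illustration.

```python
def jaccard(edges_a, edges_b):
    """Share of edges common to both networks among all edges in either."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0


def edges_above(weights, threshold):
    """Keep only the pairs whose relevance reaches the threshold."""
    return {pair for pair, weight in weights.items() if weight >= threshold}


def best_threshold(weights, reference_edges, candidates):
    """Return the candidate whose thresholded network best matches the reference."""
    return max(candidates,
               key=lambda t: jaccard(edges_above(weights, t), reference_edges))


# Toy data (hypothetical): relevance weights for a gene-view network and the
# edge set of a disease-view reference network over the same nodes.
gene_view_weights = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.6,
    ("B", "C"): 0.3,
    ("C", "D"): 0.5,
}
disease_view_edges = {("A", "B"), ("A", "C"), ("C", "D")}

chosen = best_threshold(gene_view_weights, disease_view_edges,
                        candidates=[0.2, 0.4, 0.55, 0.8])
```

With these numbers the scan selects 0.4, the only candidate at which the thresholded network reproduces the reference edges exactly. A lower threshold admits a spurious edge and a higher one drops true edges.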


Genes ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 1713
Author(s):  
Manuela Petti ◽  
Lorenzo Farina ◽  
Federico Francone ◽  
Stefano Lucidi ◽  
Amalia Macali ◽  
...  

Disease gene prediction is to date one of the main computational challenges of precision medicine. It is still uncertain whether disease genes have unique functional properties that distinguish them from non-disease genes or, from a network perspective, whether they are located randomly in the interactome or show specific patterns in the network topology. In this study, we propose a new method for disease gene prediction based on the use of biological knowledge-bases (gene-disease associations, gene functional annotations, etc.) and interactome network topology. The proposed algorithm, called MOSES, is based on the definition of two somewhat opposing sets of genes, both disease-specific from different perspectives: warm seeds (i.e., disease genes obtained from databases) and cold seeds (genes far from the disease genes on the interactome and not involved in their biological functions). The application of MOSES to a set of 40 diseases showed that the suggested putative disease genes are significantly enriched in their reference disease. Reassuringly, known and predicted disease genes together tend to form a connected network module on the human interactome, mitigating the scattered distribution of disease genes, which is probably due to both the paucity of disease-gene associations and the incompleteness of the interactome.
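
The distance-based half of the cold-seed definition can be sketched with a breadth-first search from the warm seeds. This is not the MOSES implementation: the functional-annotation criterion is left out, and the interactome, the gene names, and the minimum distance of 3 are hypothetical.

```python
from collections import deque


def distances_from_seeds(graph, seeds):
    """Shortest hop count from any seed to each reachable node (multi-source BFS)."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def cold_seeds(graph, warm_seeds, min_distance=3):
    """Reachable genes at least `min_distance` hops away from every warm seed."""
    dist = distances_from_seeds(graph, warm_seeds)
    return sorted(gene for gene, d in dist.items() if d >= min_distance)


# Toy interactome (hypothetical): a chain G1-G2-G3-G4-G5 with G6 attached to G1.
interactome = {
    "G1": ["G2", "G6"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3", "G5"],
    "G5": ["G4"],
    "G6": ["G1"],
}

warm = ["G1"]                           # known disease gene (warm seed)
cold = cold_seeds(interactome, warm)    # genes far from the warm seed
```

In this toy network only G4 and G5 qualify as cold seeds. Genes that cannot be reached from any warm seed are excluded here, which is a design choice of the sketch and not something the abstract specifies.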

