HPO2GO: prediction of human phenotype ontology term associations for proteins using cross ontology annotation co-occurrences

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5298 ◽  
Author(s):  
Tunca Doğan

Analysing the relationships between biomolecules and genetic diseases is a highly active area of research, where the aim is to identify the genes and gene products that cause a particular disease through functional changes originating from mutations. Biological ontologies are frequently employed in these studies, providing researchers with extensive opportunities for knowledge discovery through computational data analysis. In this study, a novel approach is proposed for identifying relationships between biomedical entities by automatically mapping HPO terms, which define phenotypic abnormalities, to GO terms, which define biomolecular functions; each association indicates that the abnormality occurs due to the loss of the biomolecular function expressed by the corresponding GO term. The proposed HPO2GO mappings were extracted by calculating the frequency of co-annotations of the terms on the same genes/proteins, using existing curated HPO and GO annotation sets. Mappings that could have been observed by chance were then filtered out as unreliable by statistical resampling of the co-occurrence similarity distributions. Furthermore, the biological relevance of the finalized mappings was discussed for selected cases, using the literature. The resulting HPO2GO mappings can be employed in different settings to predict and analyse novel gene/protein - ontology term - disease relations. As an application of the proposed approach, HPO term - protein associations (i.e., HPO2protein) were predicted. To test the predictive performance of the method quantitatively and to compare it with the state of the art, the CAFA2 challenge HPO prediction target protein set was employed. The results of the benchmark indicated the potential of the proposed approach, as HPO2GO performance was among the best (Fmax = 0.35).
The automated cross ontology mapping approach developed in this work may be extended to other ontologies as well, to identify unexplored relation patterns at the systemic level. The datasets, results and the source code of HPO2GO are available for download at: https://github.com/cansyl/HPO2GO.
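
The co-annotation counting and chance-filtering steps described in this abstract can be illustrated with a minimal sketch. It is not the HPO2GO implementation (that is in the repository linked above). The Dice-style similarity score, the permutation scheme, the default cut-offs and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import random
from collections import defaultdict
from itertools import product


def cooccurrence_counts(hpo_ann, go_ann):
    """Count the genes on which each (HPO term, GO term) pair is co-annotated."""
    counts = defaultdict(int)
    for gene in hpo_ann.keys() & go_ann.keys():
        for pair in product(hpo_ann[gene], go_ann[gene]):
            counts[pair] += 1
    return counts


def term_frequencies(ann):
    """Number of genes annotated with each term."""
    freq = defaultdict(int)
    for terms in ann.values():
        for term in terms:
            freq[term] += 1
    return freq


def similarity_scores(hpo_ann, go_ann):
    """Assumed Dice-style score: 2 * co-annotated genes / (HPO genes + GO genes)."""
    co = cooccurrence_counts(hpo_ann, go_ann)
    hf, gf = term_frequencies(hpo_ann), term_frequencies(go_ann)
    return {(h, g): 2 * n / (hf[h] + gf[g]) for (h, g), n in co.items()}


def null_threshold(hpo_ann, go_ann, n_perm=200, alpha=0.05, seed=0):
    """Score exceeded by roughly a fraction alpha of pairs when the HPO
    annotation sets are randomly reassigned to genes (assumed null model)."""
    rng = random.Random(seed)
    genes = sorted(hpo_ann)
    null = []
    for _ in range(n_perm):
        shuffled = genes[:]
        rng.shuffle(shuffled)
        permuted = {new: hpo_ann[old] for old, new in zip(genes, shuffled)}
        null.extend(similarity_scores(permuted, go_ann).values())
    if not null:
        return 0.0
    null.sort()
    return null[min(len(null) - 1, int((1 - alpha) * len(null)))]


def hpo2go_style_mappings(hpo_ann, go_ann, min_cooccurrence=2, **kwargs):
    """Keep pairs that score above the null threshold and co-occur often enough."""
    threshold = null_threshold(hpo_ann, go_ann, **kwargs)
    counts = cooccurrence_counts(hpo_ann, go_ann)
    scores = similarity_scores(hpo_ann, go_ann)
    return {
        pair: score
        for pair, score in scores.items()
        if score > threshold and counts[pair] >= min_cooccurrence
    }


# Toy annotation sets (hypothetical gene and term identifiers).
toy_hpo = {"G1": {"HP:A"}, "G2": {"HP:A"}, "G3": {"HP:B"}, "G4": {"HP:A"}}
toy_go = {"G1": {"GO:X"}, "G2": {"GO:X"}, "G3": {"GO:Y"}, "G4": {"GO:X", "GO:Y"}}
print(similarity_scores(toy_hpo, toy_go))
print(hpo2go_style_mappings(toy_hpo, toy_go, n_perm=50))
```

On real data the inputs would be gene-to-term dictionaries built from the curated HPO and GO annotation files. With only four toy genes the permutation threshold is not meaningful, so the final filtered output here is only a smoke test.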

2018 ◽  
Author(s):  
Tunca Doğan

Analysing the relationships between biomolecules and genetic diseases is a highly active area of research, where the aim is to identify the genes and gene products that cause a particular disease through functional changes originating from mutations. Biological ontologies are frequently employed in these studies, providing researchers with extensive opportunities for knowledge discovery through computational data analysis. In this study, a novel approach is proposed for identifying relationships between biomedical entities by automatically mapping HPO terms, which define phenotypic abnormalities, to GO terms, which define biomolecular functions; each association indicates that the abnormality occurs due to the loss of the biomolecular function expressed by the corresponding GO term. The proposed HPO2GO mappings were extracted by calculating the frequency of co-annotations of the terms on the same genes/proteins, using existing curated HPO and GO annotation sets. Mappings that could have been observed by chance were then filtered out as unreliable by statistical resampling of the co-occurrence similarity distributions. Furthermore, the biological relevance of the finalized mappings was discussed for selected cases, using the literature. The resulting HPO2GO mappings can be employed in different settings to predict and analyse novel gene/protein - ontology term - disease relations. As an application of the proposed approach, HPO term - protein associations (i.e., HPO2protein) were predicted. To test the predictive performance of the method quantitatively and to compare it with the state of the art, the CAFA2 challenge HPO prediction target protein set was employed. The results of the benchmark indicated the potential of the proposed approach, as HPO2GO outperformed all models from the 38 participating groups (Fmax = 0.402), by a margin of 12.6% over the top performer.
It is important to note that HPO2GO was not proposed to replace, but to complement, the conventional approaches used in the field of biomedical relation discovery. The automated cross ontology mapping approach developed in this work can easily be extended to other ontologies as well, to identify unexplored relation patterns at the systemic level. The proposed approach will be more effective when combined with powerful techniques such as text/literature mining. The datasets, results and the source code of HPO2GO are available for download at: https://github.com/cansyl/HPO2GO.
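
For readers unfamiliar with the Fmax figure quoted in this abstract, the sketch below computes a protein-centric Fmax in the way it is commonly defined for CAFA-style evaluation. At each score threshold, precision is averaged over proteins with at least one prediction, recall is averaged over all benchmark proteins, and the best resulting F-score is reported. The function name, the threshold grid and the toy data are illustrative assumptions, not the official CAFA2 evaluation code, which also handles ontology propagation.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric Fmax.

    predictions: {protein: {term: score in [0, 1]}}
    truth:       {protein: non-empty set of true terms}
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            tp = len(predicted & true_terms)
            # Precision is only defined for proteins with a prediction at t.
            if predicted:
                precisions.append(tp / len(predicted))
            # Recall is averaged over every benchmark protein.
            recalls.append(tp / len(true_terms))
        if not precisions:
            continue
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best


# Toy example with hypothetical protein and term identifiers.
toy_predictions = {
    "P1": {"HP:1": 0.9, "HP:2": 0.4},
    "P2": {"HP:3": 0.8},
}
toy_truth = {"P1": {"HP:1"}, "P2": {"HP:4"}}
print(round(fmax(toy_predictions, toy_truth), 3))
```

Because precision ignores proteins without predictions while recall does not, a method that makes a few confident calls can reach high precision but is penalised on recall. Fmax reports the best trade-off over the threshold grid.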


Author(s):  
David Lewis-Smith ◽  
Shiva Ganesan ◽  
Peter D. Galer ◽  
Katherine L. Helbig ◽  
Sarah E. McKeown ◽  
...  

Abstract: While genetic studies of epilepsies can be performed in thousands of individuals, phenotyping remains a manual, non-scalable task. A particular challenge is capturing the evolution of complex phenotypes with age. Here, we present a novel approach, applying phenotypic similarity analysis to a total of 3251 patient-years of longitudinal electronic medical record data from a previously reported cohort of 658 individuals with genetic epilepsies. After mapping clinical data to the Human Phenotype Ontology, we determined the phenotypic similarity of individuals sharing each genetic etiology within each 3-month age interval from birth up to a maximum age of 25 years. Across all 27 genes, phenotypic similarity was significantly higher than expected by chance in 140 of 600 (23%) gene and 3-month age interval combinations with sufficient data for its calculation. Eleven of the 27 genetic etiologies had significant overall phenotypic similarity trajectories. These do not simply reflect strong statistical associations with single phenotypic features but appear to emerge from complex clinical constellations of features that may not be strongly associated individually. As an attempt to reconstruct the cognitive framework of syndrome recognition in clinical practice, longitudinal phenotypic similarity analysis extends the traditional phenotyping approach by utilizing data from electronic medical records at a scale that is far beyond the capabilities of manual phenotyping. Delineation of how the phenotypic homogeneity of genetic epilepsies varies with age could improve the phenotypic classification of these disorders, the accuracy of prognostic counseling, and, by providing historical control data, the design and interpretation of precision clinical trials in rare diseases.
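
The idea of testing whether individuals who share a genetic etiology are more phenotypically similar than expected by chance can be sketched as a permutation test. The abstract does not state the similarity measure or the resampling scheme, so this sketch assumes Jaccard similarity between sets of HPO terms and compares the median pairwise similarity of a group against randomly drawn groups of the same size. All names and the toy data are hypothetical.

```python
import random
from itertools import combinations
from statistics import median


def jaccard(a, b):
    """Jaccard similarity between two sets of HPO terms (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_similarity(ids, phenotypes):
    """Median pairwise similarity within a group of at least two individuals."""
    return median(
        jaccard(phenotypes[i], phenotypes[j]) for i, j in combinations(ids, 2)
    )


def permutation_p_value(group, phenotypes, n_perm=1000, seed=0):
    """Compare the group's similarity with random groups of the same size."""
    rng = random.Random(seed)
    observed = group_similarity(group, phenotypes)
    everyone = sorted(phenotypes)
    hits = sum(
        group_similarity(rng.sample(everyone, len(group)), phenotypes) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy cohort within one age interval: three individuals share a phenotype.
toy_phenotypes = {
    "I1": {"HP:seizure", "HP:delay"},
    "I2": {"HP:seizure", "HP:delay"},
    "I3": {"HP:seizure", "HP:delay"},
    "I4": {"HP:ataxia"},
    "I5": {"HP:hypotonia"},
    "I6": {"HP:microcephaly"},
}
observed, p_value = permutation_p_value(["I1", "I2", "I3"], toy_phenotypes)
print(observed, p_value)
```

In a longitudinal setting this test would be repeated for each gene and each age interval, with the phenotype sets restricted to terms recorded in that interval.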


2021 ◽  
Author(s):  
Xiangnan Xu ◽  
Michal Lubomski ◽  
Andrew J Holmes ◽  
Carolyn M Sue ◽  
Ryan L Davis ◽  
...  

The microbiome plays a fundamental role in human health, and diet is one of the strongest modulators of the gut microbiome. However, interactions between microbiota and host health are complex and diverse. Understanding the interplay between diet, the microbiome and health state could enable the design of personalized intervention strategies and improve the health and wellbeing of affected individuals. A common approach to this is to divide the study population into smaller cohorts based on dietary preferences in the hope of identifying specific microbial signatures. However, classification of patients based solely on diet is unlikely to reflect the microbiome-host health relationship or the taxonomic microbiome makeup. To this end, we present a novel approach, the Nutrition-Ecotype Mixture of Experts (NEMoE) model, for establishing associations between gut microbiota and health state that accounts for diet-specific cohort variability using a regularized mixture of experts model framework with an integrated parameter sharing strategy to ensure data-driven diet-cohort identification consistency across taxonomic levels. The success of our approach was demonstrated through a series of simulation studies, in which NEMoE showed robustness with regard to parameter selection and varying degrees of data heterogeneity. Further application to real-world microbiome data from a Parkinson's disease cohort revealed that NEMoE is capable not only of improving predictive performance for Parkinson's disease but also of identifying diet-specific microbiome markers of disease. Our results indicate that NEMoE can be used to uncover diet-specific relationships between nutritional-ecotype and patient health and to contextualize precision nutrition for different diseases.


2018 ◽  
Vol 50 (6) ◽  
pp. 440-447 ◽  
Author(s):  
Louise C. Evans ◽  
Alex Dayton ◽  
Chun Yang ◽  
Pengyuan Liu ◽  
Theresa Kurth ◽  
...  

Studies exploring the development of hypertension have traditionally been unable to distinguish observed changes that are underlying causes from those that are a consequence of elevated blood pressure. In this study, a custom-designed servo-control system was utilized to precisely and continuously control renal perfusion pressure to the left kidney during the development of hypertension in Dahl salt-sensitive rats. In this way, we maintained the left kidney at control blood pressure while the right kidney was exposed to hypertensive pressures. As each kidney was exposed to the same circulating factors, differences between them represent changes induced by pressure alone. RNA sequencing analysis identified 1,613 differently expressed genes affected by renal perfusion pressure. Three pathway analysis methods were applied, one of which was a novel approach incorporating arterial pressure as an input variable, allowing a more direct connection between the expression of genes and pressure. The statistical analysis proposed several novel pathways by which pressure affects renal physiology. We confirmed the effects of pressure on p-Jnk regulation, in which the hypertensive medullas show increased p-Jnk/Jnk ratios relative to the left (0.79 ± 0.11 vs. 0.53 ± 0.10, P < 0.01, n = 8). We also confirmed pathway predictions of mitochondrial function, in which the respiratory control ratio of hypertensive vs. control mitochondria is significantly reduced (7.9 ± 1.2 vs. 10.4 ± 1.8, P < 0.01, n = 6), and of metabolomic profile, in which 14 metabolites differed significantly between hypertensive and control medullas (P < 0.05, n = 5). These findings demonstrate that subtle differences in the transcriptome can be used to predict functional changes of the kidney as a consequence of pressure elevation.


Sensors ◽  
2019 ◽  
Vol 19 (21) ◽  
pp. 4610 ◽  
Author(s):  
Adolfo Molada-Tebar ◽  
Gabriel Riutort-Mayol ◽  
Ángel Marqués-Mateu ◽  
José Luis Lerma

In this paper, we propose a novel approach to undertake the colorimetric camera characterization procedure based on a Gaussian process (GP). GPs are powerful and flexible nonparametric models for multivariate nonlinear functions. To validate the GP model, we compare the results achieved with a second-order polynomial model, which is the most widely used regression model for characterization purposes. We applied the methodology on a set of raw images of rock art scenes collected with two different Single Lens Reflex (SLR) cameras. A leave-one-out cross-validation (LOOCV) procedure was used to assess the predictive performance of the models in terms of CIE XYZ residuals and ΔE*ab color differences. Values of less than 3 CIELAB units were achieved for ΔE*ab. The output sRGB characterized images show that both regression models are suitable for practical applications in cultural heritage documentation. However, the results show that colorimetric characterization based on the Gaussian process provides significantly better results, with lower values for residuals and ΔE*ab. We also analyzed the noise induced in the output image after applying the camera characterization. As the noise depends on the specific camera, proper camera selection is essential for the photogrammetric work.


2021 ◽  
pp. 00876-2020
Author(s):  
Mathew Suji Eapen ◽  
Wenying Lu ◽  
Tillie L. Hackett ◽  
Gurpreet Kaur Singhera ◽  
Malik Q. Mahmood ◽  
...  

Introduction: Previous reports showed epithelial mesenchymal transition (EMT) as an active process that contributes to small airway (SA) fibrotic pathology. Myofibroblasts are highly active pro-fibrotic cells that secrete excessive and altered extracellular matrix (ECM). Here we relate SA myofibroblast presence with airway remodelling, physiology and EMT activity in smokers and COPD patients. Methods: Lung resections from non-smoker controls (NC), normal lung function smokers (NLFS), COPD current smokers (CS) and ex-smokers (ES) were stained with anti-human αSMA, collagen 1, and fibronectin. αSMA+ive cells were computed in reticular basement membrane (Rbm), lamina propria (LP), and adventitia and presented per mm of Rbm and mm² of LP. Collagen-1 and fibronectin are presented as a percentage change from normal. All analyses, including airway thickness, were performed using Image-pro-plus 7.0. Results: We found an increase in sub-epithelial LP (especially) and adventitia thickness in all pathological groups compared to NC. Increases in αSMA+ive myofibroblasts were observed in sub-epithelial Rbm, LP, and adventitia in both the smoker and COPD groups compared to NCs. Further, the increase in the myofibroblast population in the LP was strongly associated with a decrease in lung function, LP thickening, an increase in ECM protein deposition, and finally EMT activity in epithelial cells. Conclusions: This is the first systematic characterisation of small airway myofibroblasts in COPD based on their localisation, with statistically significant correlations between them and other pan-airway structural, lung function, and ECM protein changes. Finally, we suggest that EMT may be involved in such changes.


Author(s):  
NIKOS KATZOURIS ◽  
GEORGIOS PALIOURAS ◽  
ALEXANDER ARTIKIS

Abstract: Complex Event Recognition (CER) systems detect event occurrences in streaming time-stamped input using predefined event patterns. Logic-based approaches are of special interest in CER, since, via Statistical Relational AI, they combine uncertainty-resilient reasoning with time and change, with machine learning, thus alleviating the cost of manual event pattern authoring. We present a system based on Answer Set Programming (ASP), capable of probabilistic reasoning with complex event patterns in the form of weighted rules in the Event Calculus, whose structure and weights are learnt online. We compare our ASP-based implementation with a Markov Logic-based one and with a number of state-of-the-art batch learning algorithms on CER data sets for activity recognition, maritime surveillance and fleet management. Our results demonstrate the superiority of our novel approach, both in terms of efficiency and predictive performance. This paper is under consideration for publication in Theory and Practice of Logic Programming (TPLP).

