Machine learning-based predictions of dietary restriction associations across ageing-related genes

2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Gustavo Daniel Vega Magdaleno ◽  
Vladislav Bespalov ◽  
Yalin Zheng ◽  
Alex A. Freitas ◽  
Joao Pedro de Magalhaes

Abstract Background Dietary restriction (DR) is the most studied pro-longevity intervention; however, a complete understanding of its underlying mechanisms remains elusive, and new research directions may emerge from the identification of novel DR-related genes and DR-related genetic features. Results This work used a Machine Learning (ML) approach to classify ageing-related genes as DR-related or NotDR-related using 9 different types of predictive features: PathDIP pathways, two types of features based on KEGG pathways, two types of Protein–Protein Interactions (PPI) features, Gene Ontology (GO) terms, Genotype Tissue Expression (GTEx) expression features, GeneFriends co-expression features and protein sequence descriptors. Our findings suggested that features biased towards curated knowledge (i.e. GO terms and biological pathways) had the greatest predictive power, while unbiased features (mainly gene expression and co-expression data) had the least predictive power. Moreover, a combination of all the feature types diminished the predictive power compared to predictions based on curated knowledge. Feature importance analysis on the two most predictive classifiers mostly corroborated existing knowledge and supported recent findings linking DR to the Nuclear Factor Erythroid 2-Related Factor 2 (NRF2) signalling pathway and G protein-coupled receptors (GPCR). We then used the two strongest combinations of feature type and ML algorithm to predict DR-relatedness among ageing-related genes currently lacking DR-related annotations in the data, resulting in a set of promising candidate DR-related genes (GOT2, GOT1, TSC1, CTH, GCLM, IRS2 and SESN2) whose predicted DR-relatedness remains to be validated in future wet-lab experiments. Conclusions This work demonstrated the strong potential of ML-based techniques to identify DR-associated features, as our findings are consistent with the literature and recent discoveries. Although the inference of new DR-related mechanistic findings based solely on GO terms and biological pathways was limited due to their knowledge-driven nature, the predictive power of these two feature types remained useful, as it allowed the inference of new promising candidate DR-related genes.

2021 ◽  
Author(s):  
Gustavo Daniel Vega-Magdaleno ◽  
Vladislav Bespalov ◽  
Yalin Zheng ◽  
Alex Freitas ◽  
Joao Pedro de Magalhaes

Caloric restriction (CR) is the most studied pro-longevity intervention; however, a complete understanding of its underlying mechanisms remains elusive, and new research directions may emerge from the identification of novel CR-related genes and CR-related genetic features. This work used a Machine Learning (ML) approach to classify ageing-related genes as CR-related or NotCR-related using 9 different types of predictive features: PathDIP pathways, two types of features based on KEGG pathways, two types of Protein-Protein Interactions (PPI) features, Gene Ontology (GO) terms, Genotype-Tissue Expression (GTEx) expression features, GeneFriends co-expression features and protein sequence descriptors. Our findings suggested that features biased towards curated knowledge (i.e. GO terms and biological pathways) have the greatest predictive power, while unbiased features (mainly gene expression and co-expression data) have the least predictive power. Moreover, a combination of all the feature types diminished the predictive power compared to predictions based on curated knowledge. Feature importance analysis on the two most predictive classifiers mostly corroborated existing knowledge and supported recent findings linking CR to the Nuclear Factor Erythroid 2-Related Factor 2 (NRF2) signalling pathway and G protein-coupled receptors (GPCR). We then used the two strongest combinations of feature type and ML algorithm to predict CR-relatedness among ageing-related genes currently lacking CR-related annotations in the data, resulting in a set of promising candidate CR-related genes (GOT2, GOT1, TSC1, CTH, GCLM, IRS2 and SESN2) whose predicted CR-relatedness remains to be validated in future wet-lab experiments.


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Fei Yuan ◽  
Xiaoyong Pan ◽  
Lei Chen ◽  
Yu-Hang Zhang ◽  
Tao Huang ◽  
...  

Protein–protein interaction (PPI) plays a vital role in the growth, reproduction, and metabolism of all living organisms. A thorough investigation of PPI can uncover the mechanisms by which proteins carry out their functions. In this study, we used gene ontology (GO) terms and biological pathways to study an extended version of PPI (protein–protein functional associations) and subsequently identified some essential GO terms and pathways that can indicate the difference between protein pairs with and without functional associations. The protein–protein functional associations validated by experiments were retrieved from STRING, a well-known database of associations between proteins collected from multiple sources, and they were termed positive samples. The negative samples were constructed by randomly pairing two proteins. Each sample was represented by several features based on the GO and KEGG pathway information of the two proteins. Mutual information was then adopted to evaluate the importance of all features, yielding a set of important features from which a number of essential GO terms and KEGG pathways were identified. The final analysis of some important GO terms and one KEGG pathway can partly uncover the difference between proteins with and without functional associations.
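As a rough illustration of ranking features by mutual information with a class label, the following minimal sketch computes mutual information for discrete features. The feature names and toy data are hypothetical and are not taken from the study; the abstract does not specify the exact estimator or encoding used.

```python
import math
from collections import Counter


def mutual_information(feature, label):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, label))  # joint counts of (feature value, label)
    px = Counter(feature)                 # marginal counts of feature values
    py = Counter(label)                   # marginal counts of labels
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical toy data: 1 = positive sample (associated pair), 0 = negative sample.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "shares_GO_term": [1, 1, 1, 1, 0, 0, 0, 0],  # matches the label exactly
    "shares_pathway": [1, 1, 1, 0, 1, 0, 0, 0],  # partly informative
    "uninformative": [1, 0, 1, 0, 1, 0, 1, 0],   # independent of the label
}

# Rank features from most to least informative about the label.
scores = {name: mutual_information(values, labels) for name, values in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup, a feature that tracks the label closely receives a higher score than one that is statistically independent of it, which is the property the ranking relies on.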


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Meisam Ghasedi ◽  
Maryam Sarfjoo ◽  
Iraj Bargegol

Abstract The purpose of this study is to investigate and determine the factors affecting vehicle and pedestrian accidents on the busiest suburban highway of Guilan Province, located in the north of Iran, and to provide the most accurate prediction model. To this end, the effective principal variables and the probability of occurrence of each category of crash are analyzed and computed using factor analysis, logit, and Machine Learning approaches simultaneously. This method not only contributes to achieving the most comprehensive and efficient model for specifying the major contributing factors, but also provides officials with suggestions for taking effective, more precise measures to lessen accident impacts and improve road safety. Both the factor analysis and the logit model show the significant roles of exceeding the lawful speed, rainy weather and driver age (30–50) in the severity of vehicle accidents. On the other hand, rainy weather and lighting conditions, as the most contributing factors in pedestrian accident severity, underline the dominant role of environmental factors in the severity of all vehicle-pedestrian accidents. Moreover, considering both utilized methods, the machine-learning model has higher predictive power in all cases, especially in pedestrian accidents, with a 41.6% increase in the predictive power for fatal accidents and 12.4% for all accidents. Thus, the Artificial Neural Network model is chosen as the superior approach for predicting the number and severity of crashes. In addition, the good performance and validity of the machine learning model are demonstrated through performance and sensitivity analysis.


Author(s):  
Chen-Chih Chung ◽  
Oluwaseun Adebayo Bamodu ◽  
Chien-Tai Hong ◽  
Lung Chan ◽  
Hung-Wen Chiu

2020 ◽  
Vol 48 (10) ◽  
pp. 030006052095880
Author(s):  
Jianping Wu ◽  
Sulai Liu ◽  
Xiaoming Chen ◽  
Hongfei Xu ◽  
Yaoping Tang

Objective Colorectal cancer (CRC) is the most common cancer worldwide. Patient outcomes following recurrence of CRC are very poor. Therefore, identifying the risk of CRC recurrence at an early stage would improve patient care. Accumulating evidence shows that autophagy plays an active role in tumorigenesis, recurrence, and metastasis. Methods We used machine learning algorithms and two regression models, univariable Cox proportional hazards and least absolute shrinkage and selection operator (LASSO), to identify 26 autophagy-related genes (ARGs) related to CRC recurrence. Results By functional annotation, these ARGs were shown to be enriched in necroptosis and apoptosis pathways. Protein–protein interactions identified SQSTM1, CASP8, HSP80AB1, FADD, and MAPK9 as core genes in CRC autophagy. Of the 26 ARGs, BAX and PARP1 were regarded as having the most significant predictive ability for CRC recurrence, with a prediction accuracy of 71.1%. Conclusion These results shed light on the prediction of CRC recurrence by ARGs. Stratification of patients into recurrence risk groups by testing ARGs would be a valuable tool for early detection of CRC recurrence.


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Hudson Fernandes Golino ◽  
Liliany Souza de Brito Amaral ◽  
Stenio Fernando Pimentel Duarte ◽  
Cristiano Mauro Assis Gomes ◽  
Telma de Jesus Soares ◽  
...  

The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist circumference (WC), hip circumference (HC), and waist-hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The results show that for women, BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), the lowest misclassification (.19), and the highest pseudo-R² (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men, BMI, WC, HC, and WHR showed the best prediction, with the lowest deviance (57.25), the lowest misclassification (.16), and the highest pseudo-R² (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.
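For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how sensitivity and specificity can be computed from true and predicted binary labels. The labels below are hypothetical toy values, not the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were rejected
    return sensitivity, specificity


# Hypothetical toy labels: 1 = increased blood pressure, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Computing both metrics separately on the training and test sets, as the abstract does, is what exposes a drop in performance on unseen data.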


2017 ◽  
Vol 79 (02) ◽  
pp. 123-130 ◽  
Author(s):  
Whitney Muhlestein ◽  
Dallin Akagi ◽  
Justiss Kallos ◽  
Peter Morone ◽  
Kyle Weaver ◽  
...  

Objective Machine learning (ML) algorithms are powerful tools for predicting patient outcomes. This study pilots a novel approach to algorithm selection and model creation using prediction of discharge disposition following meningioma resection as a proof of concept. Materials and Methods A diversity of ML algorithms were trained on a single-institution database of meningioma patients to predict discharge disposition. Algorithms were ranked by predictive power and top performers were combined to create an ensemble model. The final ensemble was internally validated on never-before-seen data to demonstrate generalizability. The predictive power of the ensemble was compared with a logistic regression. Further analyses were performed to identify how important variables impact the ensemble. Results Our ensemble model predicted disposition significantly better than a logistic regression (area under the curve of 0.78 and 0.71, respectively, p = 0.01). Tumor size, presentation at the emergency department, body mass index, convexity location, and preoperative motor deficit most strongly influence the model, though the independent impact of individual variables is nuanced. Conclusion Using a novel ML technique, we built a guided ML ensemble model that predicts discharge destination following meningioma resection with greater predictive power than a logistic regression, and that provides greater clinical insight than a univariate analysis. These techniques can be extended to predict many other patient outcomes of interest.


2010 ◽  
Vol 9 ◽  
pp. CIN.S6315 ◽  
Author(s):  
Xuesong Han ◽  
Yang Li ◽  
Jian Huang ◽  
Yawei Zhang ◽  
Theodore Holford ◽  
...  

Despite decades of intensive research, NHL (non-Hodgkin lymphoma) still remains poorly understood and is largely incurable. Recent molecular studies suggest that genomic variants measured with SNPs (single nucleotide polymorphisms) in genes may have additional predictive power for NHL prognosis beyond clinical risk factors. We analyzed a genetic association study. The prognostic cohort consisted of 346 patients, among whom 138 had DLBCL (diffuse large B-cell lymphoma) and 101 had FL (follicular lymphoma). For DLBCL, we analyzed 1229 SNPs which represented 122 KEGG pathways. For FL, we analyzed 1228 SNPs which represented 122 KEGG pathways. Unlike existing studies, we aimed to identify pathways with significant additional predictive power beyond clinical factors. In addition, we accounted for the joint effects of multiple SNPs within pathways, whereas some existing studies drew pathway-level conclusions based on separate analysis of individual SNPs. For DLBCL, we identified four pathways, which, combined with the clinical factors, had medians of the prediction logrank statistics as 2.535, 2.220, 2.094, 2.453, and 2.512, respectively. As a comparison, the clinical factors had a median of the prediction logrank statistics around 0.552. For FL, we identified two pathways, which, combined with the clinical factors, had medians of the prediction logrank statistics as 4.320 and 3.532, respectively. As a comparison, the clinical factors had a median of the prediction logrank statistics around 1.212. For NHL overall, we identified three pathways, which, combined with the clinical factors, had medians of the prediction logrank statistics as 5.722, 5.314, and 5.441, respectively. As a comparison, the clinical factors had a median of the prediction logrank statistics around 4.411. The identified pathways have sound biological bases. In addition, they are different from those identified using existing approaches. They may provide further insights into the biological mechanisms underlying the prognosis of NHL.

