A machine learning classifier trained on cancer transcriptomes detects NF1 inactivation signal in glioblastoma

2016 ◽  
Author(s):  
Gregory P. Way ◽  
Robert J. Allaway ◽  
Stephanie J. Bouley ◽  
Camilo E. Fadul ◽  
Yolanda Sanchez ◽  
...  

ABSTRACT Background: We have identified molecules that exhibit synthetic lethality in cells with loss of the neurofibromin 1 (NF1) tumor suppressor gene. However, recognizing tumors that have inactivation of the NF1 tumor suppressor function is challenging because the loss may occur via mechanisms that do not involve mutation of the genomic locus. Degradation of the NF1 protein, independent of NF1 mutation status, phenocopies inactivating mutations to drive tumors in human glioma cell lines. NF1 inactivation may alter the transcriptional landscape of a tumor and allow a machine learning classifier to detect which tumors will benefit from synthetic lethal molecules. Results: We developed a strategy to predict tumors with low NF1 activity and hence tumors that may respond to treatments that target cells lacking NF1. Using RNAseq data from The Cancer Genome Atlas (TCGA), we trained an ensemble of 500 logistic regression classifiers that integrates mutation status with whole transcriptomes to predict NF1 inactivation in glioblastoma (GBM). On TCGA data, the classifier detected NF1 mutated tumors (test set area under the receiver operating characteristic curve (AUROC) mean = 0.77, 95% quantile = 0.53 – 0.95) over 50 random initializations. On RNA-Seq data transformed into the space of gene expression microarrays, this method produced a classifier with similar performance (test set AUROC mean = 0.77, 95% quantile = 0.53 – 0.96). We applied our ensemble classifier trained on the transformed TCGA data to a microarray validation set of 12 samples with matched RNA and NF1 protein-level measurements. The classifier’s NF1 score was associated with NF1 protein concentration in these samples. Conclusions: We demonstrate that TCGA can be used to train accurate predictors of NF1 inactivation in GBM. The ensemble classifier performed well for samples with very high or very low NF1 protein concentrations but had mixed performance in samples with intermediate NF1 concentrations. Nevertheless, high-performing and validated predictors have the potential to be paired with targeted therapies and personalized medicine.
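As a rough illustration of the two ideas this abstract combines (averaging the scores of many classifiers, then scoring the result with AUROC), here is a minimal sketch in plain Python. The labels, the three toy "models", and the function names `auroc` and `ensemble_score` are hypothetical; the abstract does not describe how the 500 logistic regression classifiers were combined, so simple averaging is an assumption.

```python
from statistics import mean


def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 (1 = NF1-inactivated, by assumption)
    scores: classifier scores, higher means more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_score(per_model_scores):
    """Average the scores of several classifiers, sample by sample.

    Assumption: plain averaging; the source does not state the rule used.
    """
    return [mean(sample) for sample in zip(*per_model_scores)]


# Hypothetical scores from three toy "models" on five samples.
labels = [1, 1, 0, 0, 0]
model_scores = [
    [0.9, 0.4, 0.5, 0.2, 0.1],
    [0.8, 0.6, 0.3, 0.4, 0.2],
    [0.7, 0.5, 0.6, 0.1, 0.3],
]
combined = ensemble_score(model_scores)
print(auroc(labels, model_scores[0]), auroc(labels, combined))
```

In this toy data the first model alone misranks one positive/negative pair, while the averaged score ranks every positive above every negative, which is the usual motivation for ensembling.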

Author(s):  
Dingchen Li ◽  
Yaru Wang ◽  
Wenjuan Hu ◽  
Fangyan Chen ◽  
Jingya Zhao ◽  
...  

Candida auris (C. auris) is an emerging fungus associated with high morbidity. It has a unique transmission ability and is often resistant to multiple drugs. In this study, we evaluated the ability of different machine learning models to classify drug resistance, and predicted and ranked the drug resistance mutations of C. auris. Two C. auris strains were obtained. Combined with 356 other strains collected from the European Bioinformatics Institute (EBI) databases, the whole genome sequencing (WGS) data were analyzed with bioinformatics methods. Machine learning classifiers were used to build drug resistance models, which were evaluated and compared by various evaluation methods based on AUC value. Briefly, two strains were assigned to Clade III in the phylogenetic tree, which was consistent with previous studies; nevertheless, the phylogenetic tree was not completely consistent with the conclusion of clustering according to the geographical location discovered earlier. The clustering results of C. auris were related to its drug resistance. The resistance genes of C. auris were not under additional strong selection pressure, and the performance of different models varied greatly for different drugs. For drugs such as azoles and echinocandins, the models performed relatively well. In addition, two machine learning algorithms, based on a balanced test set and an imbalanced test set, were designed and evaluated; for most drugs, the evaluation results on the balanced test set were better than on the imbalanced test set. The mutations strongly associated with drug resistance of C. auris were predicted and ranked by Recursive Feature Elimination with Cross-Validation (RFECV) combined with a machine learning classifier. In addition to known drug resistance mutations, some new resistance mutations were predicted, such as the Y501H and I466M mutations in the ERG11 gene and the R278H mutation in the ERG10 gene, which may be associated with fluconazole (FCZ), micafungin (MCF), and amphotericin B (AmB) resistance, respectively; these mutations were in the “hot spot” regions of the ergosterol pathway. To sum up, this study suggested that machine learning classifiers are a useful and cost-effective method to identify fungal drug resistance-related mutations, which is of great significance for research on the resistance mechanism of C. auris.


Author(s):  
Yae Won Park ◽  
Jihwan Eom ◽  
Sooyon Kim ◽  
Hwiyoung Kim ◽  
Sung Soo Ahn ◽  
...  

Abstract Context Early identification of the response of prolactinoma patients to dopamine agonists (DA) is crucial in treatment planning. Objective To develop a radiomics model using an ensemble machine learning classifier with conventional magnetic resonance images (MRIs) to predict the DA response in prolactinoma patients. Design Retrospective study Setting Severance Hospital Patients A total of 177 prolactinoma patients who underwent baseline MRI (109 DA responders and 68 DA non-responders) were allocated to the training (n = 141) and test (n = 36) sets. Radiomic features (n = 107) were extracted from coronal T2-weighted MRIs. After feature selection, single models (random forest, light gradient boosting machine, extra-trees, quadratic discriminant analysis, and linear discriminant analysis) with oversampling methods were trained to predict the DA response. A soft voting ensemble classifier was used to achieve the final performance. The performance of the classifier was validated in the test set. Results The ensemble classifier showed an area under the curve (AUC) of 0.81 (95 % confidence interval [CI], 0.74–0.87) in the training set. In the test set, the ensemble classifier showed an AUC, accuracy, sensitivity, and specificity of 0.81 (95 % CI, 0.67–0.96), 77.8 %, 78.6 %, and 77.3 %, respectively. The ensemble classifier achieved the highest performance among all the individual models in the test set. Conclusions Radiomic features may be useful biomarkers to predict the DA response in prolactinoma patients.
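Soft voting, as named in this abstract, usually means averaging the class probabilities predicted by each base model and thresholding the average. The sketch below shows that idea together with the sensitivity and specificity calculation; the probabilities, the 0.5 threshold, the equal model weights, and the function names are all hypothetical and are not taken from the study.

```python
def soft_vote(prob_by_model, threshold=0.5):
    """Average per-model positive-class probabilities, then threshold.

    prob_by_model: list of lists, one inner list of probabilities per model.
    Returns (averaged probabilities, hard 0/1 predictions).
    Assumption: equal weights and a 0.5 cut-off.
    """
    n_models = len(prob_by_model)
    averaged = [sum(col) / n_models for col in zip(*prob_by_model)]
    predictions = [1 if p >= threshold else 0 for p in averaged]
    return averaged, predictions


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical probabilities from three base models for four patients.
probs = [
    [0.9, 0.6, 0.4, 0.1],
    [0.7, 0.3, 0.6, 0.2],
    [0.8, 0.7, 0.2, 0.3],
]
y_true = [1, 1, 0, 0]
averaged, y_pred = soft_vote(probs)
print(y_pred, sensitivity_specificity(y_true, y_pred))
```

Here the second model alone would misclassify the second and third patients at a 0.5 threshold, but the averaged probabilities classify all four correctly.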


2020 ◽  
Vol 21 (21) ◽  
pp. 8004
Author(s):  
Yu Sakai ◽  
Chen Yang ◽  
Shingo Kihira ◽  
Nadejda Tsankova ◽  
Fahad Khan ◽  
...  

In patients with gliomas, isocitrate dehydrogenase 1 (IDH1) mutation status has been studied as a prognostic indicator. Recent advances in machine learning (ML) have demonstrated promise in utilizing radiomic features to study disease processes in the brain. We investigate whether ML analysis of multiparametric radiomic features from preoperative Magnetic Resonance Imaging (MRI) can predict IDH1 mutation status in patients with glioma. This retrospective study included patients with glioma with known IDH1 status and preoperative MRI. Radiomic features were extracted from Fluid-Attenuated Inversion Recovery (FLAIR) and Diffusion-Weighted-Imaging (DWI). The dataset was split into training, validation, and testing sets by stratified sampling. Synthetic Minority Oversampling Technique (SMOTE) was applied to the training sets. eXtreme Gradient Boosting (XGBoost) classifiers were trained, and the hyperparameters were tuned. Receiver operating characteristic curve (ROC), accuracy, and f1-scores were collected. A total of 100 patients (age: 55 ± 15, M/F 60/40) with IDH1 mutant (n = 22) and IDH1 wildtype (n = 78) tumors were included. The best performance was seen with a DWI-trained XGBoost model, which achieved ROC with Area Under the Curve (AUC) of 0.97, accuracy of 0.90, and f1-score of 0.75 on the test set. The FLAIR-trained XGBoost model achieved ROC with AUC of 0.95, accuracy of 0.90, and f1-score of 0.75 on the test set. A model that was trained on combined FLAIR-DWI radiomic features did not provide incremental accuracy. The results show that an XGBoost classifier using multiparametric radiomic features derived from preoperative MRI can predict IDH1 mutation status with > 90% accuracy.
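SMOTE addresses class imbalance such as the 22 mutant versus 78 wildtype cases reported here by synthesizing new minority samples on line segments between existing minority samples and their nearest neighbours. The following is a simplified sketch of that interpolation step only, not the reference algorithm or the authors' pipeline; the function name `smote_like`, the two-feature data, and the choice `k=2` are hypothetical.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolation.

    A simplified SMOTE-style sketch: pick a minority sample, pick one of its
    k nearest minority neighbours, and place a new point on the line segment
    between them.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours by squared Euclidean distance, excluding base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class (e.g. IDH1-mutant cases).
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 1.8)]
new_points = smote_like(minority, n_new=3)
print(new_points)
```

As in the abstract, oversampling of this kind would be applied to the training set only, so that validation and test data contain no synthetic samples.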


2021 ◽  
Vol 11 ◽  
Author(s):  
Alix de Causans ◽  
Alexandre Carré ◽  
Alexandre Roux ◽  
Arnault Tauziède-Espariat ◽  
Samy Ammari ◽  
...  

Objectives To differentiate Glioblastomas (GBM) and Brain Metastases (BM) using a radiomic features-based Machine Learning (ML) classifier trained from post-contrast three-dimensional T1-weighted (post-contrast 3DT1) MR imaging, and compare its performance in medical diagnosis versus human experts, on a testing cohort. Methods We enrolled 143 patients (71 GBM and 72 BM) in a retrospective bicentric study from January 2010 to May 2019 to train the classifier. Post-contrast 3DT1 MR images were acquired on a 3-Tesla MR unit and 100 radiomic features were extracted. Selection and optimization of the Machine Learning (ML) classifier was performed using nested cross-validation. Sensitivity, specificity, balanced accuracy, and area under the receiver operating characteristic curve (AUC) were calculated as performance metrics. The final model performance was cross-validated, then evaluated on a test set of 37 patients, and compared to human blind reading using McNemar’s test. Results The ML classifier had a mean [95% confidence interval] sensitivity of 85% [77; 94], a specificity of 87% [78; 97], a balanced accuracy of 86% [80; 92], and an AUC of 92% [87; 97] with cross-validation. Sensitivity, specificity, balanced accuracy and AUC were equal to 75, 86, 80 and 85% on the test set. The 3D sphericity radiomic index had the highest coefficient in the logistic regression model. There were no statistically significant differences observed between the performance of the classifier and the experts’ blinded examination. Conclusions The proposed diagnostic support system based on radiomic features extracted from post-contrast 3DT1 MR images helps in differentiating solitary BM from GBM with high diagnosis performance and generalizability.
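Two of the quantities in this abstract can be reproduced with short formulas. Balanced accuracy is the mean of sensitivity and specificity, and McNemar's test compares two readers on the same cases using only the discordant pairs. The sketch below checks the reported balanced accuracies and shows an exact (binomial) McNemar p-value. The discordant counts `b=5, c=3` are hypothetical, since the abstract does not report them, and the exact variant is an assumption about which form of the test was used.

```python
from math import comb


def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy: unweighted mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: cases the classifier got right and the expert got wrong
    c: cases the expert got right and the classifier got wrong
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Sensitivity and specificity as reported in the abstract, as fractions.
cv = balanced_accuracy(0.85, 0.87)    # cross-validation
test = balanced_accuracy(0.75, 0.86)  # held-out test set
print(round(cv, 3), round(test, 3))

# Hypothetical discordant counts, for illustration only.
print(mcnemar_exact(b=5, c=3))
```

The computed values, 0.86 and 0.805, agree with the reported 86% and 80% up to rounding of the published inputs.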


2021 ◽  
Author(s):  
Guo ZiYi ◽  
Jiang LiPeng ◽  
Zhu ZhiTu

Abstract OBJECTIVE. The purpose of this study is to evaluate the potential value of CT radiomics in predicting the mutation status of β-catenin in patients with hepatocellular carcinoma (HCC). MATERIALS AND METHODS. In this retrospective study, 43 patients with HCC (18 without β-catenin mutation and 15 with β-catenin mutation) were identified in The Cancer Genome Atlas Liver Hepatocellular Carcinoma database (TCGA-LIHC). To create stable models, the data were augmented to a total of 202 labeled samples (131 without β-catenin mutation and 73 with β-catenin mutation) by obtaining up to five different samples per patient. A large number of image features were extracted from portal phase contrast-enhanced CT images with an open-source software package (Pyradiomics, version 2.1.2). Reproducibility analysis (intraclass correlation coefficients, ICCs, computed in SPSS 18.0) was performed by two radiologists. The classification target was β-catenin gene mutation status. Machine learning-based classifications were performed using the Pycaret (version 2.1.2) software. The main performance metric was the AUC value. RESULTS. Of 828 extracted texture features, 759 had excellent reproducibility. Using 10 selected features, the Extra Trees Classifier algorithm correctly classified 93.4% of the HCCs in terms of β-catenin mutation status (AUC value, 0.9741); the CatBoost Classifier algorithm correctly classified 91.9% of the HCCs (AUC value, 0.9692); the Gradient Boosting Classifier algorithm correctly classified 91.1% (AUC value, 0.9722). All three algorithms exceeded 90% accuracy. CONCLUSION. Machine learning-based high-dimensional quantitative CT radiomics analysis might be a feasible method for predicting β-catenin mutation status in patients with HCC.


2020 ◽  
Author(s):  
Tamar Amitai ◽  
Yoav Kan-Tor ◽  
Naama Srebnik ◽  
Amnon Buxboim

ABSTRACT Objective Develop a machine learning classifier for predicting the risk of cleavage-stage embryos to undergo first trimester miscarriage based on time-lapse images of preimplantation development. Design Retrospective study of a 4-year multi-center cohort of women undergoing intracytoplasmic sperm injection (ICSI). The study included embryos with positive indication of clinical implantation based on gestational sac visualization either with first trimester miscarriage or live birth outcome. Miscarriage was determined based on negative fetal heartbeat indication during the first trimester. Setting Data were recorded and obtained in hospital setting and research was performed in university setting. Patient(s) Data from 391 women who underwent fresh single or double embryo transfers were included. Intervention(s) None. Main Outcome Measure(s) A minimal subset of six non-redundant morphodynamic features that maintain high prediction capacity was identified by screening. Using this feature subset, XGBoost and Random Forest models were trained following a 100-fold Monte-Carlo cross validation scheme. Feature importance was scored using the SHapley Additive exPlanations (SHAP) methodology. Miscarriage versus live-birth outcome prediction was evaluated using a non-contaminated balanced test set and quantified in terms of the area under the receiver operating characteristic (ROC) curve (AUC), precision-recall curve, positive predictive value (PPV), and confusion matrices. Result(s) Features that account for the distribution of the nucleolus precursor bodies within the small pronucleus and pronuclei dynamics were highly predictive of miscarriage outcome. AUC for miscarriage prediction of validation and test set embryos using both models was 0.68 to 0.69. Clinical utility was tested by setting two classification thresholds, one accounting for high sensitivity (0.73, with 0.6 specificity) and one for high specificity (0.93, with 0.33 sensitivity). Conclusion(s) We report the development of a decision-support tool for identifying the embryos with high risk of miscarriage. Prioritizing embryos for transfer based on their predicted risk of miscarriage in combination with their predicted implantation potential will improve live-birth rates and shorten time-to-pregnancy. Capsule The risk of first trimester miscarriage of cleavage stage embryos is predicted with AUC 68% by screening a minimal subset of six non-redundant morpho-dynamic features and training a machine-learning classifier.
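Monte-Carlo cross validation, as mentioned in this abstract, repeatedly draws a fresh random train/validation partition instead of rotating through fixed folds. A minimal sketch of generating 100 such splits follows; the sample count of 50, the 20% validation fraction, and the function name `monte_carlo_splits` are hypothetical, as the abstract does not state the split proportions.

```python
import random


def monte_carlo_splits(n_samples, n_repeats=100, test_fraction=0.2, seed=0):
    """Yield (train_indices, validation_indices) for repeated random splits.

    Unlike k-fold, each repeat draws a fresh random partition, so a sample
    may appear in the validation set of several repeats or of none.
    Assumption: a 20% validation fraction, which the source does not state.
    """
    rng = random.Random(seed)
    n_val = max(1, int(round(n_samples * test_fraction)))
    indices = list(range(n_samples))
    for _ in range(n_repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield sorted(shuffled[n_val:]), sorted(shuffled[:n_val])


splits = list(monte_carlo_splits(n_samples=50, n_repeats=100))
train, val = splits[0]
print(len(splits), len(train), len(val))
```

A model would be fitted on each `train` index set and scored on the matching `val` set, and the 100 scores summarized by their mean and spread. A held-out test set, like the one described in the abstract, would be set aside before any of these splits are drawn.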


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1436
Author(s):  
Tuoru Li ◽  
Senxiang Lu ◽  
Enjie Xu

The internal detector in a pipeline relies on ground markers to record the elapsed time for accurate positioning. Most existing ground markers use the magnetic flux leakage testing principle to detect whether the internal detector has passed. This paper instead detects vibration signals to track and locate the internal detector. The Variational Mode Decomposition (VMD) algorithm is used to extract features, which addresses the heavy noise and frequent disturbances in vibration signals. In this way, the detection range is expanded, and internal detectors that do not use magnetic flux leakage can also be located. First, the acquired vibration signals are denoised by the VMD algorithm; then kurtosis and power values are extracted from the intrinsic mode functions (IMFs) to form feature vectors; finally, the feature vectors are input into random forest and Multilayer Perceptron (MLP) classifiers. Experiments show that the method designed in this paper, which combines VMD with a machine learning classifier, can effectively use vibration signals to locate the internal detector, with high accuracy and good adaptability.
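The feature-extraction step described above (kurtosis and power per IMF, concatenated into one vector) can be sketched as follows. VMD itself is not implemented here: two synthetic signals stand in for the IMFs, and the exact definitions of "kurtosis value" and "power value" used in the paper are assumptions (non-excess kurtosis and mean squared amplitude).

```python
import math


def kurtosis(signal):
    """Fourth standardized moment (non-excess; a Gaussian gives about 3)."""
    n = len(signal)
    mu = sum(signal) / n
    var = sum((x - mu) ** 2 for x in signal) / n
    if var == 0:
        raise ValueError("kurtosis is undefined for a constant signal")
    return sum((x - mu) ** 4 for x in signal) / (n * var ** 2)


def mean_power(signal):
    """Average power: mean of the squared samples."""
    return sum(x * x for x in signal) / len(signal)


def feature_vector(modes):
    """Concatenate [kurtosis, power] for each mode into one flat vector."""
    features = []
    for mode in modes:
        features.extend([kurtosis(mode), mean_power(mode)])
    return features


# Stand-ins for IMFs: a pure sine and the same sine with one sharp impulse.
n = 200
sine = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
impulsive = sine[:]
impulsive[100] += 10.0
print(feature_vector([sine, impulsive]))
```

Kurtosis rises sharply when a mode contains an impulsive event, which is one plausible reason it is useful for detecting a passing detector. The resulting vector would then be fed to a classifier such as a random forest or MLP.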


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Joshua E. Lewis ◽  
Melissa L. Kemp

Abstract Resistance to ionizing radiation, a first-line therapy for many cancers, is a major clinical challenge. Personalized prediction of tumor radiosensitivity is not currently implemented clinically due to insufficient accuracy of existing machine learning classifiers. Despite the acknowledged role of tumor metabolism in radiation response, metabolomics data is rarely collected in large multi-omics initiatives such as The Cancer Genome Atlas (TCGA) and consequently omitted from algorithm development. In this study, we circumvent the paucity of personalized metabolomics information by characterizing 915 TCGA patient tumors with genome-scale metabolic Flux Balance Analysis models generated from transcriptomic and genomic datasets. Metabolic biomarkers differentiating radiation-sensitive and -resistant tumors are predicted and experimentally validated, enabling integration of metabolic features with other multi-omics datasets into ensemble-based machine learning classifiers for radiation response. These multi-omics classifiers show improved classification accuracy, identify clinical patient subgroups, and demonstrate the utility of personalized blood-based metabolic biomarkers for radiation sensitivity. The integration of machine learning with genome-scale metabolic modeling represents a significant methodological advancement for identifying prognostic metabolite biomarkers and predicting radiosensitivity for individual patients.


2021 ◽  
Vol 1770 (1) ◽  
pp. 012012
Author(s):  
P. Asha ◽  
A. Jesudoss ◽  
S. Prince Mary ◽  
K. V. Sai Sandeep ◽  
K. Harsha Vardhan
