Histological classification of non-small cell lung cancer with RNA-seq data using machine learning models

2021 ◽  
Author(s):  
Robert B. Eshun ◽  
Md Khurram Monir Rabby ◽  
A. K. M. Kamrul Islam ◽  
Marwan U. Bikdash
2021 ◽  
Author(s):  
Sébastien Benzekry ◽  
Mathieu Grangeon ◽  
Mélanie Karlsen ◽  
Maria Alexa ◽  
Isabella Bicalho-Frazeto ◽  
...  

Abstract

Background: Immune checkpoint inhibitors (ICIs) are now a therapeutic standard in advanced non-small cell lung cancer (NSCLC), but strong predictive markers for ICI efficacy are still lacking. We evaluated machine learning models built on simple clinical and biological data to individually predict response to ICIs.

Methods: Patients with metastatic NSCLC who received ICI in second line or later were included. We collected clinical and hematological data and studied the association of these data with disease control rate (DCR), progression-free survival (PFS) and overall survival (OS). Multiple machine learning (ML) algorithms were assessed for their ability to predict response.

Results: Overall, 298 patients were enrolled. The overall response rate and DCR were 15.3% and 53%, respectively. Median PFS and OS were 3.3 and 11.4 months, respectively. In multivariable analysis, DCR was significantly associated with performance status (PS) and hemoglobin level (OR 0.58, p<0.0001 and OR 1.8, p<0.001, respectively). These variables were also associated with PFS and OS and ranked top in random forest-based feature importance. Neutrophil-to-lymphocyte ratio was also associated with DCR, PFS and OS. The best ML algorithm was a random forest. It could predict DCR with satisfactory efficacy based on these three variables. Ten-fold cross-validated performances were: accuracy 0.68 ± 0.04; sensitivity 0.58 ± 0.08; specificity 0.78 ± 0.06; positive predictive value 0.70 ± 0.08; negative predictive value 0.68 ± 0.06; AUC 0.74 ± 0.03.

Conclusion: Combination of simple clinical and biological data could accurately predict disease control rate at the individual level.

Highlights:
- Machine learning applied to a large set of NSCLC patients could predict efficacy of immunotherapy with a 69% accuracy using simple routine data
- Hemoglobin levels and performance status were the strongest predictors and significantly associated with DCR, PFS and OS
- Neutrophil-to-lymphocyte ratio was also associated with outcome
- Benchmark of 8 machine learning models
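The accuracy, sensitivity, specificity, and predictive values reported in the abstract above are all standard functions of a binary confusion matrix. As a minimal sketch of how such figures are computed for a single held-out fold, the following uses a hypothetical helper `classification_metrics` and made-up counts; neither comes from the study, and the authors' actual pipeline is not described here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    tp/fn: patients with disease control predicted correctly/incorrectly.
    tn/fp: patients without disease control predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts for one held-out fold (assumption, not study data).
metrics = classification_metrics(tp=9, fp=3, tn=12, fn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In a ten-fold cross-validation, these values would be computed once per fold and then summarized as mean ± standard deviation, which matches the form of the figures quoted in the abstract.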


2021 ◽  
Vol 66 ◽  
pp. 102446
Author(s):  
Ewelina Bębas ◽  
Marta Borowska ◽  
Marcin Derlatka ◽  
Edward Oczeretko ◽  
Marcin Hładuński ◽  
...  

2021 ◽  
Author(s):  
Haike Lei ◽  
Chun Liu ◽  
Zheng Xu ◽  
Na Hong ◽  
Xiaosheng Li ◽  
...  

Abstract

Background: Patients with non-small cell lung cancer (NSCLC) often have a poor prognosis. Overall survival (OS) prediction through the early diagnosis of cancer has many benefits, such as allowing providers to design the best treatment plan for patients. In this study, we aimed to evaluate the prognostic factors in NSCLC patients, construct a nomogram, and develop machine learning models to predict the OS. We also conducted feature importance analysis to understand how relevant factors of NSCLC patients impact their OS.

Results: Multiple machine learning models were adopted in a retrospective cohort of patients from 2010 to 2015 in the Surveillance, Epidemiology, and End Results (SEER) database. Independent prognostic factors for NSCLC were determined using Cox proportional hazards regression analysis. We modeled OS and vital status as the outcomes and constructed and validated a nomogram to predict the OS of NSCLC. Furthermore, we applied logistic regression, random forest, XGBoost, decision tree, multilayer perceptron, and LightGBM to predict the patients' vital status. We tested the prediction ability of the models and evaluated their performances using accuracy, sensitivity, specificity, precision, and the area under the receiver operating characteristic curve. A total of 34,567 patients from the SEER database who met our criteria were included in this study. The nomogram visualized the OS prediction results of the Cox regression model. Among the classifiers, XGBoost had the best prediction performance, with an area under the curve of 0.733.

Conclusions: The results demonstrated that machine learning-based classifier models are capable of predicting the outcomes of patients with NSCLC. The Cox regression model-based nomogram interpreted the results well and supports potential medical applications.
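The area under the receiver operating characteristic curve used to rank the classifiers above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition; the function name `roc_auc`, the labels, and the scores are illustrative assumptions and are not taken from the study or from the SEER data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical vital-status labels (1 = event) and model scores (assumption).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, so it only suits small illustrations; for a cohort of tens of thousands of patients a rank-based implementation would be the practical choice.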


2020 ◽  
Vol 2 (6) ◽  
Author(s):  
Siddhant Jain ◽  
Jalal Ziauddin ◽  
Paul Leonchyk ◽  
Shashibushan Yenkanchi ◽  
Joseph Geraci

PLoS ONE ◽  
2014 ◽  
Vol 9 (2) ◽  
pp. e88300 ◽  
Author(s):  
Bi-Qing Li ◽  
Jin You ◽  
Tao Huang ◽  
Yu-Dong Cai
