Lung Cancer Types Prediction Using Machine Learning Approach

Author(s):  
Kshitija Ingle ◽  
Uttam Chaskar ◽  
Shashikant Rathod
2021 ◽  
Author(s):  
Moses P Cook ◽  
Bessi Qorri ◽  
Amruth Baskar ◽  
Jalal Ziauddin ◽  
Luca Pani ◽  
...  

There are many small datasets of significant value in the medical space that are being underutilized. Due to the heterogeneity of complex disorders found in oncology, systems capable of discovering patient subpopulations while elucidating etiologies are of great value because they can indicate leads for innovative drug discovery and development. Here, we report on a machine intelligence-based study that utilized a combination of two small non-small cell lung cancer (NSCLC) datasets consisting of 58 samples of adenocarcinoma (ADC) and squamous cell carcinoma (SCC) and 45 samples from the gene expression analysis of human lung cancer and control samples series (GSE18842). Utilizing a novel machine learning approach, we were able to uncover subpopulations of ADC and SCC while simultaneously extracting which genes, in combination, were significantly involved in defining the subpopulations. An interactive hypothesis-generating interface designed to work with machine learning methods allowed us to explore the hypotheses generated by the unsupervised components of the system. Using these methods, we were able to uncover genes implicated by other methods and accurately discover known subpopulations without being asked, such as different levels of aggressiveness within the SCC and ADC subtypes. Furthermore, PIGX was a novel gene implicated in this study, and it warrants further investigation due to its role in breast cancer proliferation. Here we demonstrate the ability to learn from small datasets and reveal well-established properties of NSCLC. These machine learning techniques can reveal the driving factors behind subpopulations of patients, altering the approach to drug discovery and development by making precision medicine a reality.


2021 ◽  
Vol 17 (4) ◽  
pp. e1008898
Author(s):  
Rasool Saghaleyni ◽  
Azam Sheikh Muhammad ◽  
Pramod Bangalore ◽  
Jens Nielsen ◽  
Jonathan L. Robinson

Deregulation of the protein secretory pathway (PSP) is linked to many hallmarks of cancer, such as promoting tissue invasion and modulating cell-cell signaling. The collection of secreted proteins processed by the PSP, known as the secretome, is often studied due to its potential as a reservoir of tumor biomarkers. However, there has been less focus on the protein components of the secretory machinery itself. We therefore investigated the expression changes in secretory pathway components across many different cancer types. Specifically, we implemented a dual approach involving differential expression analysis and machine learning to identify PSP genes whose expression was associated with key tumor characteristics: mutation of p53, cancer status, and tumor stage. Eight different machine learning algorithms were included in the analysis to enable comparison between methods and to focus on signals that were robust to algorithm type. The machine learning approach was validated by identifying PSP genes known to be regulated by p53, and even outperformed the differential expression analysis approach. Among the different analysis methods and cancer types, the kinesin family members KIF20A and KIF23 were consistently among the top genes associated with malignant transformation or tumor stage. However, unlike most cancer types, which exhibited elevated KIF20A expression that remained relatively constant across tumor stages, renal carcinomas displayed a more gradual increase that continued with increasing disease severity. Collectively, our study demonstrates the complementary nature of a combined differential expression and machine learning approach for analyzing gene expression data, and highlights key PSP components relevant to features of tumor pathophysiology that may constitute potential therapeutic targets.
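
One possible reading of "signals that were robust to algorithm type" is a consensus ranking: each algorithm scores every gene, and genes are compared by their average rank across algorithms. The sketch below illustrates that idea only. The abstract does not say how the authors combined the eight algorithms, so mean-rank aggregation is an assumption here, and the algorithm names, gene names, and scores are hypothetical placeholders rather than results from the study.

```python
def ranks(scores):
    """Map each gene to its rank (1 = highest importance score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {gene: position + 1 for position, gene in enumerate(ordered)}


def mean_rank(per_algorithm_scores):
    """Average each gene's rank over all algorithms (lower = more consistent)."""
    rank_tables = [ranks(s) for s in per_algorithm_scores.values()]
    genes = rank_tables[0].keys()
    return {
        gene: sum(table[gene] for table in rank_tables) / len(rank_tables)
        for gene in genes
    }


# Hypothetical importance scores from three placeholder algorithms.
scores = {
    "algo_a": {"GENE_A": 0.9, "GENE_B": 0.7, "GENE_C": 0.2},
    "algo_b": {"GENE_A": 0.6, "GENE_B": 0.8, "GENE_C": 0.1},
    "algo_c": {"GENE_A": 0.8, "GENE_B": 0.5, "GENE_C": 0.6},
}

consensus = mean_rank(scores)
for gene in sorted(consensus, key=consensus.get):
    print(gene, round(consensus[gene], 2))
```

A gene that ranks near the top under every algorithm ends up with a low mean rank, whereas a gene favored by only one algorithm is pulled down the list.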


Cancers ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 1562 ◽  
Author(s):  
Maurizio Polano ◽  
Marco Chierici ◽  
Michele Dal Bo ◽  
Davide Gentilini ◽  
Federica Di Cintio ◽  
...  

Immunotherapy using immune checkpoint inhibitors (ICI) has dramatically improved the treatment options in various cancers, increasing survival rates for treated patients. Nevertheless, response rates to ICI are heterogeneous among different cancer types, and even among patients affected by a specific cancer. Thus, it becomes crucial to identify factors that predict the response to immunotherapeutic approaches. A comprehensive investigation of the mutational and immunological aspects of the tumor can be useful to obtain a robust prediction. By performing a pan-cancer analysis on gene expression data from The Cancer Genome Atlas (TCGA, 8055 cases and 29 cancer types), we set up and validated a machine learning approach to predict the potential for positive response to ICI. Support vector machines (SVM) and extreme gradient boosting (XGBoost) models were developed with a 10×5-fold cross-validation schema on 80% of TCGA cases to predict ICI responsiveness defined by a score combining tumor mutational burden and TGF-β signaling. On the remaining 20% validation subset, our SVM model scored 0.88 accuracy and 0.27 Matthews Correlation Coefficient. The proposed machine learning approach could be useful to predict the putative response to ICI treatment from expression data of primary tumors.
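
The gap between the reported accuracy and the Matthews Correlation Coefficient (MCC) is easier to interpret with a small worked example. The sketch below computes both metrics from a binary confusion matrix using their standard definitions. The counts are hypothetical, chosen only to show how an imbalanced class distribution can give high accuracy alongside a modest MCC. They are not the confusion matrix from the study, which the abstract does not report.

```python
import math


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical, imbalanced validation set: 20 positives and 180 negatives.
tp, fp, fn, tn = 5, 5, 15, 175

print("accuracy:", round(accuracy(tp, fp, fn, tn), 3))
print("MCC:", round(mcc(tp, fp, fn, tn), 3))
```

In this made-up case most correct predictions come from the majority class, so accuracy stays high even though only a quarter of the positives are recovered. MCC accounts for all four cells of the confusion matrix and is correspondingly lower.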


2018 ◽  
Vol 36 (15_suppl) ◽  
pp. 6589-6589
Author(s):  
Gabriel A. Brooks ◽  
Nancy Lynn Keating ◽  
Savannah L Bergquist ◽  
Mary Beth Landrum ◽  
Sherri Rose
