A machine learning texture model for classifying lung cancer subtypes using preliminary bronchoscopic findings

2018 ◽  
Vol 45 (12) ◽  
pp. 5509-5514 ◽  
Author(s):  
Po‐Hao Feng ◽  
Yin‐Tzu Lin ◽  
Chung‐Ming Lo

Lung cancer is a disease with high mortality, and it causes more deaths worldwide than any other cancer. The earlier the condition is detected, the easier it is to reduce the mortality rate. The main objective of this work is to predict lung cancer using a machine learning algorithm. Several computer-aided systems have been designed to reduce the mortality rate due to lung cancer. Machine learning is a promising tool for predicting lung cancer at an early stage, with a classification model trained on features extracted from images. Machine learning generally predicts well, but in some models inefficient feature extraction makes training less effective, and the predictions are therefore poor. To overcome this limitation, the proposed covariant texture model uses a steerable Riesz wavelet feature extraction technique to increase the effectiveness of training with the Random Forest (RF) algorithm. In the proposed model, the RF algorithm is employed i) to predict whether the nodule in the image is benign or malignant and ii) to find the level of severity (1 to 5) if the nodule is malignant. Our experimental results can be used as a tool to support diagnosis and to analyze the cancer at an earlier stage so that it can be treated.
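The two-step prediction described in this abstract (benign versus malignant first, then a severity level from 1 to 5 only for malignant nodules) can be sketched as a simple cascade. This is a minimal illustration, not the authors' implementation: `cascade_predict`, `toy_stage1`, and `toy_stage2` are hypothetical names, and the threshold rules stand in for the trained random forests, which the abstract does not specify in detail.

```python
from typing import Callable, Optional, Sequence, Tuple

Features = Sequence[float]


def cascade_predict(
    features: Features,
    is_malignant: Callable[[Features], bool],
    severity_level: Callable[[Features], int],
) -> Tuple[str, Optional[int]]:
    """Stage 1 decides benign vs. malignant; stage 2 runs only for malignant nodules."""
    if not is_malignant(features):
        return ("benign", None)
    level = severity_level(features)
    if not 1 <= level <= 5:
        raise ValueError(f"severity must be in 1..5, got {level}")
    return ("malignant", level)


# Hypothetical stand-in models (assumption: NOT the paper's trained forests).
# Each one applies a threshold rule to the mean of the feature vector.
def toy_stage1(features: Features) -> bool:
    return sum(features) / len(features) > 0.5


def toy_stage2(features: Features) -> int:
    mean = sum(features) / len(features)
    # Map a mean in (0.5, 1.0] onto severity levels 1..5.
    return min(5, max(1, 1 + int((mean - 0.5) * 10)))


benign_case = cascade_predict([0.1, 0.2], toy_stage1, toy_stage2)
malignant_case = cascade_predict([0.75, 0.75], toy_stage1, toy_stage2)
```

In a real pipeline the two callables would wrap classifiers trained on the texture features; keeping the stages separate means the severity model only ever sees nodules already judged malignant.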


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e24042-e24042
Author(s):  
Ayse Ece Cali Daylan ◽  
Danai Khemasuwan ◽  
Hyun S. Kim ◽  
Parvathy Geetha ◽  
Sylvia Vania Alarcon Velasco ◽  
...  

e24042 Background: The increased risk of venous thromboembolism (VTE) in cancer patients is clearly documented. However, given the heterogeneity and increased risk of bleeding in the cancer population, patient selection for thromboprophylaxis is still challenging. Methods: In order to predict risk factors of VTE in cancer patients, we performed a retrospective study of 706 patients who were diagnosed with either solid or hematological malignancies between 2015 and 2019. Demographics, body mass index, complete blood count with differential, kidney function tests, electrolytes, liver function tests, lipid profile and cancer staging were recorded. Random forest analysis with bagging was used to rank these variables, and Kaplan-Meier survival analysis was implemented to stratify cancer subtypes based on the risk of VTE occurrence. Results: The mean follow-up time was 19 months. 8.2% of the patients developed VTE. Based on the random forest analysis, the five most important factors in the prediction of VTE in cancer patients were cancer subtype, white blood cell count, platelets, neutrophils and hemoglobin. At the one-year mark, the risk of VTE in lung cancer and hematological malignancies was found to be significantly higher than in breast, colorectal and endometrial cancer (p<0.05). Conclusions: Machine learning approaches are infrequently used in risk factor prediction of VTE in cancer patients. The risk factors identified by the machine learning algorithm in our study are consistent with prior studies and show a clear difference in the risk of VTE across cancer subtypes. Moreover, based on the Kaplan-Meier analysis, patients with hematological malignancies or lung cancer may develop VTE earlier than patients with other cancer subtypes. Further prospective studies with longer follow-up are needed to better risk-stratify cancer patients and explore the temporal associations of VTE risk factors. [Table: see text]
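For readers unfamiliar with the Kaplan-Meier analysis mentioned in this abstract, the product-limit estimate can be sketched in a few lines. This is an illustrative sketch only: the function name `kaplan_meier` and the follow-up data are made up for the example and are not taken from the study.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, event-free probability) at each distinct event time.

    times:  follow-up time per patient (e.g. months).
    events: True if the event (e.g. VTE) was observed, False if censored.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # events plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who had an event or was censored at t leaves the risk set.
        at_risk -= exit_counts[t]
    return curve


# Made-up follow-up data (months) for six hypothetical patients.
curve = kaplan_meier(
    [3, 5, 5, 8, 12, 12],
    [True, False, True, True, False, False],
)
```

Computing one such curve per cancer subtype and comparing the curves at a fixed time point (for example 12 months) is one way to carry out the kind of stratification the abstract describes.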


2021 ◽  
Author(s):  
Joseph M. Dhahbi ◽  
Joe W. Chen

Abstract Lung cancer is one of the deadliest cancers in the world. Two of the most common subtypes, lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), have drastically different biological signatures, yet they are often treated similarly and classified together as non-small cell lung cancer (NSCLC). LUAD and LUSC biomarkers are scarce and their distinct biological mechanisms have yet to be elucidated. Many studies have attempted to improve traditional machine learning algorithms or develop novel algorithms to identify biomarkers, but few have used overlapping machine learning or feature selection methods for cancer classification, biomarker identification, or pathway analysis. This study proposes selecting overlapping features as a way to differentiate between cancer subtypes, especially between LUAD and LUSC. Overall, this method achieved classification results comparable to, if not better than, the traditional algorithms. It also identified multiple known biomarkers, and five potentially novel biomarkers with high discriminating values between the two subtypes. Many of the biomarkers also exhibit significant prognostic potential, particularly in LUAD. Our study also unraveled distinct biological pathways between LUAD and LUSC.
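One plausible reading of "selecting overlapping features" is to keep only the features that several independent ranking methods all place near the top. The sketch below illustrates that idea under stated assumptions: the helper names, the scores, and the placeholder gene identifiers (`GENE_A` and so on) are hypothetical, and the abstract does not say which methods or cutoffs the authors used.

```python
from typing import Dict, List, Set


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """Return the k highest-scoring features (ties broken by name)."""
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return set(ranked[:k])


def overlapping_features(rankings: List[Dict[str, float]], k: int) -> List[str]:
    """Features that appear in the top k of every ranking."""
    if not rankings:
        return []
    shared = set.intersection(*(top_k(r, k) for r in rankings))
    return sorted(shared)


# Hypothetical importance scores from three feature-selection methods.
method_1 = {"GENE_A": 0.9, "GENE_B": 0.8, "GENE_C": 0.1, "GENE_D": 0.5}
method_2 = {"GENE_A": 0.7, "GENE_B": 0.2, "GENE_C": 0.3, "GENE_D": 0.9}
method_3 = {"GENE_A": 0.6, "GENE_B": 0.5, "GENE_C": 0.1, "GENE_D": 0.4}

shared_top2 = overlapping_features([method_1, method_2, method_3], k=2)
shared_top3 = overlapping_features([method_1, method_2, method_3], k=3)
```

Requiring agreement across methods trades recall for robustness: a feature that only one method favors is dropped, which tends to leave a smaller candidate set for downstream classification.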


2021 ◽  
Vol 16 ◽  
Author(s):  
Javier Bajo-Morales ◽  
Juan Manuel Galvez ◽  
Juan Carlos Prieto-Prieto ◽  
Luis Javier Herrera ◽  
Ignacio Rojas ◽  
...  

Background: Nowadays, gene expression analysis is one of the most promising pillars for understanding and uncovering the mechanisms underlying the development and spread of cancer. In this sense, Next Generation Sequencing technologies, such as RNA-Seq, are currently leading the market due to their precision and cost. Nevertheless, there is still an enormous amount of unanalyzed data obtained from older technologies, such as Microarray, which could still be useful for extracting relevant knowledge. Methods: Throughout this research, a complete machine learning methodology to cross-evaluate the compatibility between the RNA-Seq and Microarray technologies is described and implemented. In order to show a real application of the designed pipeline, a lung cancer case study is addressed by considering two detected subtypes: adenocarcinoma and squamous cell carcinoma. The transcriptomic datasets considered for our study were obtained from the public repositories NCBI/GEO, ArrayExpress and GDC-Portal. From them, several gene experiments were carried out with the aim of finding gene signatures for these lung cancer subtypes, linked to both transcriptomic technologies. With these differentially expressed genes (DEGs) selected, intelligent predictive models capable of classifying new samples belonging to these cancer subtypes were developed. Results: The predictive models built using one technology are capable of discerning samples from a different technology. The classification results are evaluated in terms of accuracy, F1-score and ROC curves along with AUC. Finally, the biological information of the gene sets obtained and their relationship with lung cancer is reviewed, finding strong biological evidence linking them to the disease. Conclusion: Our method is capable of finding strong gene signatures that are also independent of the transcriptomic technology used to develop the analysis. In addition, our article highlights the potential of using heterogeneous transcriptomic data to increase the number of samples available for studies, increasing the statistical significance of the results.
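The accuracy and F1-score named in the Results can be computed directly from predicted and true subtype labels. The following is a minimal sketch with made-up labels; the function names and the label strings `"ADC"` and `"SCC"` are assumptions chosen for the example, not values from the article.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Harmonic mean of precision and recall for the chosen positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: a model trained on one technology, tested on the other.
y_true = ["ADC", "ADC", "ADC", "SCC", "SCC", "SCC", "SCC", "ADC"]
y_pred = ["ADC", "ADC", "SCC", "SCC", "SCC", "ADC", "SCC", "ADC"]

acc = accuracy(y_true, y_pred)
f1_adc = f1_score(y_true, y_pred, positive="ADC")
```

In a cross-technology evaluation, `y_pred` would come from a model fitted on samples from one platform and applied to samples from the other, so these scores measure how well the gene signature transfers.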

