Development and Validation of a 23-Gene Signature for Prognosis Prediction in Lung Adenocarcinoma
Abstract

Background: Lung cancer remains the deadliest malignancy worldwide, and lung adenocarcinoma (LUAD) is its most common histological type. A precise and concise prognostic model for LUAD is urgently needed. We developed a 23-gene signature for prognosis prediction based on epithelial-mesenchymal transition (EMT), immune, and stromal gene sets.

Methods: Univariate Cox regression analysis was performed to select genes significantly associated with overall survival (OS) in the TCGA LUAD cohort. LASSO regression and multivariate Cox regression analyses were used to build the multi-gene signature. Enrichment analyses and a protein-protein interaction (PPI) network were used to characterize the functions and interactions of the signature genes. A nomogram was developed based on the risk score and other clinical features. The predictive performance of the signature was externally validated in two independent datasets from the Gene Expression Omnibus (GSE37745 and GSE13213).

Results: A total of 1334 EMT-, immune-, and stromal-associated genes were obtained. After LASSO regression and multivariate Cox regression analysis, a 23-gene signature for risk stratification was built. Kaplan-Meier (K-M) curves showed that high-risk patients had poorer outcomes. Finally, a nomogram was built to predict prognosis. The predictive performance of the 23-gene signature was confirmed in internal and external validation.

Conclusion: We developed and validated a 23-gene signature based on EMT, immune, and stromal gene sets. It provides a convenient and concise tool for risk stratification and individualized medicine.
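
To illustrate how a multi-gene signature of this kind is typically turned into risk groups, the sketch below computes a risk score as a coefficient-weighted sum of gene expression values and splits patients at the median score. This is a minimal illustration under stated assumptions, not the authors' implementation: the gene names, coefficients, expression values, and the median cutoff are all hypothetical and do not come from the study.

```python
from statistics import median

def risk_scores(expression, coefficients):
    """Return a per-patient risk score: sum of coefficient * expression.

    Assumes the coefficients come from a fitted multivariate Cox model
    (hypothetical values are used below).
    """
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }

def stratify(scores):
    """Label patients 'high' or 'low' risk relative to the median score.

    The median cutoff is an assumption made for illustration only.
    """
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}

# Hypothetical two-gene example (the actual signature has 23 genes).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
expression = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "P2": {"GENE_A": 1.0, "GENE_B": 4.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 0.0, "GENE_B": 2.0},
}

scores = risk_scores(expression, coefficients)
groups = stratify(scores)
print(scores)
print(groups)
```

A positive coefficient raises the score as expression increases, while a negative coefficient lowers it, so the resulting high-risk and low-risk groups can then be compared with K-M curves.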