Abstract
Background:
Early-stage non-small cell lung cancer (NSCLC) is increasingly diagnosed, and 30% of diagnosed patients develop recurrence within 5 years. Recurrence-related markers are therefore urgently needed to optimize patient-tailored therapy. This study aimed to develop a practical tool to improve recurrence prediction in stage I NSCLC.
Methods:
Eligible datasets were downloaded from TCGA and GEO. In the discovery phase, two algorithms, Least Absolute Shrinkage and Selection Operator (LASSO) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE), were used to identify candidate genes. A recurrence-associated signature was developed by penalized Cox regression. A nomogram was then constructed and tested in two independent cohorts.
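As an illustration of how an L1 penalty selects features, the following is a minimal sketch of LASSO fitted by coordinate descent. It is not the study's pipeline: it assumes a continuous outcome and squared-error loss rather than a survival model, and the gene names, expression values, and penalty `lam` are hypothetical. Genes whose coefficients are shrunk exactly to zero are dropped; the rest are kept as candidates.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; return 0 if |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            if norm > 0:
                beta[j] = soft_threshold(rho / n, lam) / (norm / n)
            else:
                beta[j] = 0.0
    return beta


# Hypothetical, centered toy data: GENE_A drives the outcome, GENE_B does not.
genes = ["GENE_A", "GENE_B"]
X = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y = [-6.0, -3.0, 0.0, 3.0, 6.0]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [g for g, b in zip(genes, beta) if b != 0.0]
print(beta, selected)
```

In a workflow like the one described, the genes retained by this kind of selector would be compared with those retained by a second selector such as SVM-RFE before a penalized Cox model is fitted.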
Results:
In this retrospective study, 14 eligible datasets and 7 published signatures were included. In the discovery phase, the two algorithms highlighted 42 significant genes as candidate predictors. A 13-gene signature generated by penalized Cox regression categorized the training cohort into high-risk and low-risk subgroups (HR = 8.873, 95% CI: 4.228–18.480, P < 0.001). Furthermore, a nomogram integrating the recurrence-related signature, age, and histology was developed to predict recurrence-free survival in the training cohort, and it performed well in the two external validation cohorts (concordance index for the training cohort and the two validation cohorts: 0.737, 95% CI: 0.732–0.742, P < 0.001; 0.666, 95% CI: 0.650–0.682, P < 0.001; 0.651, 95% CI: 0.637–0.665, P < 0.001, respectively).
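The concordance index reported above can be read as the fraction of comparable patient pairs in which the patient with the higher predicted risk recurs first. A minimal sketch of Harrell's concordance index follows; the follow-up times, event flags, and risk scores are hypothetical, and the software the study actually used is not stated here.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had the event (events[i] == 1)
    and did so before patient j's last follow-up time. Ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: months to recurrence or censoring, event flag, risk score.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
risk_scores = [0.9, 0.2, 0.1, 0.4]

c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values between 0.651 and 0.737.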
Conclusions:
The proposed nomogram is a promising tool for estimating recurrence-free survival in stage I NSCLC and may be of considerable value in guiding adjuvant therapy. Prospective studies are needed to test its clinical utility in the individualized management of stage I NSCLC.