Machine learning based predictive model of early mortality in stage III and IV prostate cancer
Abstract

Prostate cancer remains the third highest cause of cancer-related deaths. Metastatic prostate cancer can carry a poor prognosis; however, there is limited work on predictive models for clinical decision support in stage III and IV prostate cancer.

We developed a machine learning model for predicting early mortality in prostate cancer (survival of less than 21 months after initial diagnosis). A cohort of 10,303 patients was extracted from the Surveillance, Epidemiology, and End Results (SEER) program. Features were constructed in several domains, including demographics, histology of the primary tumor, and metastatic sites. Feature selection was performed, followed by regularized logistic regression. The model was evaluated using 5-fold cross-validation and achieved 75.2% accuracy with an AUC of 0.649. All of the 19 most predictive features were validated to be clinically meaningful for prediction of early mortality.

Our study serves as a framework for prediction of early mortality in patients with stage III and stage IV prostate cancer and can be generalized to predictive modeling problems for other relevant clinical endpoints. Future work should involve integration of other data sources, such as electronic health records and genomic or metabolomic data.
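As an illustration of the evaluation protocol described in the abstract (regularized logistic regression scored by 5-fold cross-validation on accuracy and AUC), the following is a minimal sketch in plain Python. It is not the study's implementation. The synthetic data, the choice of an L2 penalty, the gradient-descent settings, and all function names (`fit_logreg`, `cross_validate`, `make_synthetic`, and so on) are assumptions made for illustration. The feature selection step is omitted, and no SEER data is used.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, l2=0.1, lr=0.5, epochs=300):
    """Fit L2-regularized logistic regression by batch gradient descent.

    The L2 penalty and hyperparameters are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        # Gradient of mean log-loss plus (l2 / 2) * ||w||^2; bias is not penalized.
        w = [wj - lr * (gw[j] / n + l2 * wj) for j, wj in enumerate(w)]
        b -= lr * gb / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(X, y, k=5, seed=0):
    """Return mean accuracy and mean AUC over k shuffled folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs, aucs = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w, b = fit_logreg([X[i] for i in train], [y[i] for i in train])
        p = predict_proba([X[i] for i in fold], w, b)
        y_test = [y[i] for i in fold]
        # Classify as early mortality when the predicted probability is >= 0.5.
        accs.append(sum((pi >= 0.5) == (yi == 1) for pi, yi in zip(p, y_test)) / len(fold))
        aucs.append(auc(y_test, p))
    return sum(accs) / k, sum(aucs) / k


def make_synthetic(n=200, seed=1):
    """Synthetic stand-in for a cohort: three binary features and a binary label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [float(rng.random() < 0.3) for _ in range(3)]
        logit = -1.0 + 2.0 * x[0] + 1.0 * x[1]  # third feature is pure noise
        X.append(x)
        y.append(1 if rng.random() < sigmoid(logit) else 0)
    return X, y


X, y = make_synthetic()
mean_acc, mean_auc = cross_validate(X, y)
print(f"accuracy={mean_acc:.3f}  AUC={mean_auc:.3f}")
```

Accuracy and AUC measure different things: accuracy depends on the 0.5 decision threshold and on class balance, whereas AUC measures only how well the scores rank positives above negatives. That is why reporting both, as the abstract does, is informative.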