Colon Cancer Classification and Prognosis Prediction Based on Genomics Multi-features
Abstract
Background: To classify colon cancer and predict patient prognosis using multiple genomic features.
Methods: We used mRNA expression profiles and mutation MAF files of colon cancer patients from the TCGA database to calculate each patient's tumor mutation burden (TMB). Combining TMB with copy number variation (CNV), microsatellite instability (MSI), and the corresponding clinical information, we clustered the patients with the K-means method to identify molecular subtypes of colon cancer. We compared prognosis, immune cell infiltration, and other indicators among the subgroups, then used Cox and lasso regression analysis to screen for genes associated with prognostic differences among subgroups and to construct a prognosis prediction model. We validated the model on an external data set, performed a stratified analysis of the model, and compared immune infiltration between patients in the high- and low-risk groups. We also measured the expression differences of the core genes in tumor tissues from patients at different clinical stages by qPCR and immunohistochemistry.
Results: We calculated the TMB values and divided the patients into three subgroups. The prognosis of the second subgroup differed significantly from that of the other two. Immune infiltration analysis showed that the level of NK.cells.resting was increased in cluster 1 and cluster 3, and the level of T.cells.CD4.memory.resting was increased in cluster 3. By analyzing the differences among subgroups, we screened out eight core genes related to prognosis, including HYAL1, SPINK4, and EREG, and constructed a patient prognosis evaluation model. Tests on the external data set showed that the model can accurately predict patient prognosis; the model's risk score performed better as a prognostic indicator than risk factors such as TNM stage and age.
The experimental results confirmed that the differential expression of the eight core genes was broadly consistent with the model's evaluation results.
Conclusion: Using multiple genomic features, we divided colon cancer patients into three subtypes, screened out eight core genes related to prognosis, and constructed a prognosis evaluation model. Validation with external data and experiments indicated that the model has good evaluation performance.
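The two quantities the abstract relies on, a per-patient TMB value and a model risk score, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The exome size of 38 Mb, the coefficient values, the median cutoff for the high- and low-risk groups, and all function names are assumptions made for illustration; only the gene names HYAL1, SPINK4, and EREG come from the text.

```python
from statistics import median

# Assumption: approximate exome size in megabases; the source does not state one.
EXOME_SIZE_MB = 38.0


def tumor_mutation_burden(n_mutations, exome_size_mb=EXOME_SIZE_MB):
    """Return TMB as the number of somatic mutations per megabase."""
    return n_mutations / exome_size_mb


def risk_score(expression, coefficients):
    """Return a linear risk score: the sum of coefficient * expression per gene."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(scores):
    """Label each patient 'high' or 'low' risk using the median score as cutoff.

    The median cutoff is an assumption; the source does not state how the
    high- and low-risk groups were separated.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
example_coefficients = {"HYAL1": 0.5, "SPINK4": -0.25, "EREG": 0.25}
example_expression = {"HYAL1": 2.0, "SPINK4": 4.0, "EREG": 4.0}
example_score = risk_score(example_expression, example_coefficients)
```

In a real analysis the coefficients would come from the fitted Cox and lasso model, and the mutation counts from the patient's MAF file.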