Aim:
To identify genes related to the mechanisms underlying the occurrence of glioma and to build a prediction
model for glioblastomas.
Background:
Glioblastomas have very high morbidity and mortality and seriously endanger human health. At
present, many investigations of gliomas aim to understand the cause and mechanism of these tumors at
the molecular level and to explore clinical diagnosis and treatment methods. However, there is still no effective method
for the early diagnosis of this disease, and effective prevention and treatment measures are also lacking.
Methods:
First, gene expression profiles were downloaded from the Gene Expression Omnibus (GEO). Then, differentially expressed genes
(DEGs) between the disease samples and the control samples were identified. After that, GO and KEGG enrichment analyses of
the DEGs were performed with DAVID. Furthermore, the correlation-based feature subset (CFS) method was applied to
select key DEGs. Finally, a classification model distinguishing the glioblastoma samples from the controls was built
with a support vector machine (SVM) based on the selected key genes.
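
The CFS step can be illustrated with a minimal sketch. It assumes the commonly used CFS merit, k * r_cf / sqrt(k + k * (k - 1) * r_ff), where r_cf is the mean feature-class correlation and r_ff is the mean feature-feature correlation of a subset of k features. The correlation measure and search strategy are not stated here, so absolute Pearson correlation and a greedy forward search are assumptions, and the gene names and expression values are hypothetical toy data rather than values from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(features, labels, subset):
    """CFS merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute correlation between each selected feature and the class label.
    r_cf = sum(abs(pearson(features[g], labels)) for g in subset) / k
    # Mean absolute correlation between all pairs of selected features.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    if pairs:
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(features, labels):
    """Greedy forward search: add the best feature until the merit stops improving."""
    selected, best = [], 0.0
    remaining = sorted(features)
    while remaining:
        gene = max(remaining, key=lambda g: cfs_merit(features, labels, selected + [g]))
        merit = cfs_merit(features, labels, selected + [gene])
        if merit <= best:
            break
        selected.append(gene)
        remaining.remove(gene)
        best = merit
    return selected, best


# Hypothetical toy data: 3 controls (label 0) followed by 3 tumor samples (label 1).
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1, 1, 1, 5, 5, 5],  # separates the two classes perfectly
    "gene_b": [1, 2, 1, 5, 4, 5],  # informative but redundant with gene_a
    "gene_c": [3, 1, 2, 2, 3, 1],  # uncorrelated with the class label
}
selected, merit = cfs_forward_select(expression, labels)
print(selected, round(merit, 3))
```

In this toy case the redundant and uninformative genes are left out because adding either one lowers the merit, which is the behavior that makes CFS suited to reducing a DEG list to a compact feature set before training a classifier.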
Results and Discussion:
Using the CFS method, thirty-six DEGs, including 17 upregulated and 19 downregulated genes, were selected as the
feature genes for building the classification model between the glioma samples and the control samples.
The classification model achieved accuracies of 76.25% and 70.3% in a 10-fold cross-validation test and an independent
set test, respectively. In addition, PPP2R2B and CYBB were also among the top 5 hub genes identified in the
protein–protein interaction (PPI) network.
Conclusions:
This study indicates that the CFS method is a useful tool for identifying key genes in glioblastomas. In addition,
our results suggest that genes such as PPP2R2B and CYBB might be potential biomarkers for the diagnosis of
glioblastomas.