WEIGHTED NEIGHBORHOOD CLASSIFIER FOR THE CLASSIFICATION OF IMBALANCED TUMOR DATASET

Machine learning is widely applied to molecular tumor classification based on gene expression profiles, but the sample imbalance problem is often overlooked. This paper proposes a subclass-weighted neighborhood classifier to address sample imbalance, together with a novel neighborhood rough set model that selects informative genes to improve classification performance. Experiments on three publicly available tumor datasets show that the proposed method is effective both on imbalanced datasets with an obscure boundary between two subtypes and for informative gene selection, and that it achieves higher cross-validation accuracy with far fewer tumor-related genes.
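The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of the general idea of a class-weighted neighborhood classifier. It assumes Euclidean distance, a fixed neighborhood radius, inverse class-frequency vote weights, and a nearest-sample fallback when the neighborhood is empty; the function name `weighted_neighborhood_predict` and all of these choices are illustrative and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def weighted_neighborhood_predict(train_X, train_y, query, radius):
    """Predict a label from training samples within `radius` of `query`.

    Assumption: each neighbor's vote is scaled by 1 / (size of its class),
    so a minority class is not outvoted purely because it has fewer samples.
    """
    class_counts = Counter(train_y)
    votes = defaultdict(float)
    nearest = None  # (distance, label) of the closest training sample
    for x, label in zip(train_X, train_y):
        d = math.dist(x, query)
        if nearest is None or d < nearest[0]:
            nearest = (d, label)
        if d <= radius:
            votes[label] += 1.0 / class_counts[label]
    if not votes:
        # Empty neighborhood: fall back to the single nearest sample.
        return nearest[1]
    return max(votes, key=votes.get)

# Toy imbalanced data: five samples of class 0, two of class 1.
train_X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (6.0, 6.0), (7.0, 7.0),
           (0.2, 0.0), (9.0, 9.0)]
train_y = [0, 0, 0, 0, 0, 1, 1]
prediction = weighted_neighborhood_predict(train_X, train_y, (0.1, 0.05), 0.5)
```

In this toy case the neighborhood holds two class-0 samples and one class-1 sample; an unweighted majority vote would return class 0, while the weighted vote (0.4 versus 0.5) returns class 1.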

Heuristic Breadth-First Search Algorithm for Informative Gene Selection Based on Gene Expression Profiles

Chinese Journal of Computers ◽

10.3724/sp.j.1016.2008.00636 ◽

2009 ◽

Vol 31 (4) ◽

pp. 636-649 ◽

Cited By ~ 3

Author(s):

Shu-Lin WANG ◽

Ji WANG ◽

Huo-Wang CHEN ◽

Shu-Tao LI ◽

Bo-Yun ZHANG

Keyword(s):

Gene Expression ◽

Gene Selection ◽

Search Algorithm ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Informative Gene ◽

Breadth First Search

Gene Selection Using Neighborhood Rough Set from Gene Expression Profiles

2007 International Conference on Computational Intelligence and Security (CIS 2007) ◽

10.1109/cis.2007.169 ◽

2007 ◽

Cited By ~ 3

Author(s):

Shulin Wang ◽

Huowang Chen ◽

Shutao Li

Keyword(s):

Gene Expression ◽

Rough Set ◽

Gene Selection ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Neighborhood Rough Set

Study of Informative Gene Selection for Gene Expression Profiles

2009 WRI Global Congress on Intelligent Systems ◽

10.1109/gcis.2009.94 ◽

2009 ◽

Author(s):

Quanzhong Liu ◽

Yang Zhang ◽

Yong Wang ◽

Zhengguo Hu

Keyword(s):

Gene Expression ◽

Gene Selection ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Informative Gene

Informative Gene Selection Based on Cost-Sensitive Fast Correlation-Based Feature Selection

Current Bioinformatics ◽

10.2174/1574893616666210601111850 ◽

2021 ◽

Vol 16 ◽

Author(s):

Yueling Xiong ◽

Qingqing Li ◽

Peipei Wang ◽

Mingquan Ye

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Candidate Gene ◽

Gene Selection ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Informative Gene ◽

Gene Subset ◽

Correlation Based Feature Selection ◽

Selection Algorithms

Background: Informative gene selection is an essential step in tumor classification. However, it is difficult to select tumor-related informative genes from large-scale gene expression profiles, because such profiles are high-dimensional, contain relatively few samples, are class-imbalanced, and include many superfluous and irrelevant genes. Objective: Many researchers analyze and process gene expression data with machine learning methods to obtain gene subsets for classification. However, the gene expression profiles of tumors often pose massive computational challenges. In addition, traditional feature selection algorithms often ignore cost estimation while improving feature importance and classification accuracy, which makes tumor classification more difficult. Method: In this study, a novel informative gene selection method based on cost-sensitive fast correlation-based feature selection (CS-FCBF) is proposed. Results: First, the symmetric uncertainty index is used to evaluate the correlation between genes and class labels, and a large number of irrelevant and redundant genes are then quickly filtered out according to importance, generating a candidate gene subset. Second, cost-sensitive learning, which introduces the misclassification cost matrix and support vector machine attribute evaluation, is used to obtain the top-ranked gene subset with minimum misclassification loss. Finally, the candidate gene subset is optimized. Conclusion: The method was validated on eight independent tumor datasets. By comparing CS-FCBF with three other hybrids of typical gene selection algorithms combined with cost-sensitive learning, we found that the proposed method achieved better classification performance with fewer selected genes, which might provide guidance in tumor diagnosis and research.
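The symmetric uncertainty index mentioned in the abstract is commonly defined as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below computes it for already-discretized values; the discretization into "low"/"high", the toy data, and the helper names are assumptions made for illustration, and the cost-sensitive stage of CS-FCBF is not shown.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    joint = entropy(list(zip(x, y)))
    mutual_info = hx + hy - joint
    return 2.0 * mutual_info / (hx + hy)

# Toy discretized expression levels for four samples (hypothetical data).
labels = [0, 0, 1, 1]
gene_a = ["low", "low", "high", "high"]   # tracks the class label exactly
gene_b = ["low", "high", "low", "high"]   # independent of the class label

su_a = symmetric_uncertainty(gene_a, labels)
su_b = symmetric_uncertainty(gene_b, labels)
```

A filter step in the spirit of the abstract would rank genes by their SU with the class label and discard those below a threshold; the threshold itself would be a tuning choice not given here.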

Parallelized Classification of Cancer Sub-types from Gene Expression Profiles Using Recursive Gene Selection

Studies in Informatics and Control ◽

10.24846/v27i2y201809 ◽

2019 ◽

Vol 27 (2) ◽

pp. 213-222 ◽

Cited By ~ 1

Author(s):

Lokeswari VENKATARAMANA ◽

Shomona Gracia JACOB ◽

Rajavel RAMADOSS

Keyword(s):

Gene Expression ◽

Gene Selection ◽

Expression Profiles ◽

Gene Expression Profiles

Dorsal Root Ganglion Neuron Types and Their Functional Specialization

The Oxford Handbook of the Neurobiology of Pain ◽

10.1093/oxfordhb/9780190860509.013.4 ◽

2018 ◽

pp. 127-155 ◽

Cited By ~ 3

Author(s):

Edward C. Emery ◽

Patrik Ernfors

Keyword(s):

Chronic Pain ◽

Dorsal Root Ganglion ◽

Sensory Neurons ◽

Dorsal Root ◽

Molecular Mechanisms ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Drg Neurons ◽

Low Threshold

Primary sensory neurons of the dorsal root ganglion (DRG) respond to and relay sensations that are felt, such as those for touch, pain, temperature, itch, and more. The ability to discriminate between the various types of stimuli is reflected by the existence of specialized DRG neurons tuned to respond to specific stimuli. Because of this, a comprehensive classification of DRG neurons is critical for determining exactly how somatosensation works and for providing insights into the cell types involved in chronic pain. This article reviews recent advances in the unbiased classification of molecular types of DRG neurons from the perspective of known functions as well as functions predicted from gene expression profiles. The data show that sensory neurons are organized in a basal structure of three cold-sensitive neuron types, five mechano-heat-sensitive nociceptor types, four A-low-threshold mechanoreceptor types, five itch-mechano-heat-sensitive nociceptor types, and a single C-low-threshold mechanoreceptor type, with a strong relation between molecular neuron types and functional types. As a general feature, each neuron type displays a unique and predictable response profile; at the same time, most neuron types convey multiple modalities and intensities. Therefore, sensation is likely determined by the summation of ensembles of active primary afferent types. The new classification scheme will be instructive in determining the exact cellular and molecular mechanisms underlying somatosensation, facilitating the development of rational strategies to identify causes of chronic pain.

A Robust Gene Selection Method for Microarray-based Cancer Classification

Cancer Informatics ◽

10.4137/cin.s3794 ◽

2010 ◽

Vol 9 ◽

pp. CIN.S3794 ◽

Cited By ~ 21

Author(s):

Xiaosheng Wang ◽

Osamu Gotoh

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Gene Selection ◽

Information Gain ◽

Expression Profiles ◽

Feature Selection Method ◽

Gene Expression Profiles ◽

Molecular Classification ◽

Selection Method ◽

Chi Square

Gene selection is of vital importance in the molecular classification of cancer using high-dimensional gene expression data. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, developing flexible and robust feature selection methods is crucial. We investigated the properties of a feature selection approach proposed in our previous work, which generalizes the feature selection method based on the depended degree of attribute in rough sets. We compared this method with established methods (the depended degree, chi-square, information gain, Relief-F, and symmetric uncertainty) and analyzed its properties through a series of classification experiments. The results revealed that our method was superior to the canonical depended-degree-of-attribute-based method in robustness and applicability. Moreover, the method was comparable to the other four commonly used methods. More importantly, the method can reveal the inherent classification difficulty of different gene expression datasets, indicating the inherent biology of specific cancers.

Gene Selection in Cancer Classification Using Sparse Logistic Regression with L1/2 Regularization

Applied Sciences ◽

10.3390/app8091569 ◽

2018 ◽

Vol 8 (9) ◽

pp. 1569 ◽

Cited By ~ 3

Author(s):

Shengbing Wu ◽

Hongkun Jiang ◽

Haiwei Shen ◽

Ziyi Yang

Keyword(s):

Logistic Regression ◽

Gene Selection ◽

Classification Performance ◽

Cancer Classification ◽

Sparse Logistic Regression ◽

Microarray Datasets ◽

Sparse Methods

In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. The successful identification of gene biomarkers will help in the classification of different types of cancer and improve prediction accuracy. Recently, regularized logistic regression using L1 regularization has been successfully applied in high-dimensional cancer classification to estimate gene coefficients and perform gene selection simultaneously. However, L1 regularization yields biased gene selection and does not have the oracle property. To address these problems, we investigate L1/2 regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods (L1 and L_EN) in terms of classification performance.
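For orientation, an L1/2 penalty is usually written as lambda times the sum of the square roots of the absolute coefficient values, added to the logistic loss. The sketch below only evaluates such an objective; it assumes 0/1 labels, a mean (rather than summed) data loss, and an unpenalized bias, and it does not reproduce the solver used in the paper. The function name `l_half_penalized_loss` is illustrative.

```python
import math

def l_half_penalized_loss(weights, bias, X, y, lam):
    """Mean logistic loss plus lam * sum(|w_j| ** 0.5) over the weights."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = bias + sum(w * v for w, v in zip(weights, xi))
        # log(1 + exp(z)) - y * z, written in a numerically stable form
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    data_loss = total / len(X)
    penalty = lam * sum(abs(w) ** 0.5 for w in weights)
    return data_loss + penalty

# Hypothetical toy inputs: two samples, two gene features.
X = [[1.0, 2.0], [3.0, 4.0]]
y = [0, 1]
loss_at_zero = l_half_penalized_loss([0.0, 0.0], 0.0, X, y, lam=0.1)
```

Because the square root grows steeply near zero, small nonzero coefficients are penalized relatively more than under L1 (for example, a coefficient of 0.25 contributes 0.5 instead of 0.25), which is one intuition for why such penalties tend to give sparser solutions.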
