Machine Learning Techniques and Chi-Square Feature Selection for Cancer Classification Using SAGE Gene Expression Profiles

A Robust Gene selection Method for Microarray-based Cancer Classification

Cancer Informatics ◽

10.4137/cin.s3794 ◽

2010 ◽

Vol 9 ◽

pp. CIN.S3794 ◽

Cited By ~ 21

Author(s):

Xiaosheng Wang ◽

Osamu Gotoh

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Gene Selection ◽

Information Gain ◽

Expression Profiles ◽

Feature Selection Method ◽

Gene Expression Profiles ◽

Molecular Classification ◽

Selection Method ◽

Chi Square

Gene selection is of vital importance in molecular classification of cancer using high-dimensional gene expression data. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, developing flexible and robust feature selection methods is crucial. We investigated the properties of a feature selection approach proposed in our previous work, a generalization of the feature selection method based on the depended degree of attribute in rough sets. We compared this method with established methods (the depended degree, chi-square, information gain, Relief-F and symmetric uncertainty) and analyzed its properties through a series of classification experiments. The results revealed that our method was superior to the canonical depended-degree-of-attribute-based method in robustness and applicability. Moreover, the method was comparable to the other four commonly used methods. More importantly, the method can expose the inherent classification difficulty of different gene expression datasets, reflecting the inherent biology of specific cancers.
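As an illustration of one of the established baselines named in this abstract, the following is a minimal sketch of chi-square feature scoring on discretized expression values. It is not the authors' method; the function names (`chi_square_score`, `rank_features`), the gene names, and the toy data are hypothetical and chosen only for the example.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between one discrete feature and the class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    observed = Counter(zip(feature, labels))
    score = 0.0
    for f_value, f_count in feature_counts.items():
        for c_value, c_count in label_counts.items():
            # Expected cell count if the feature and the class were independent.
            expected = f_count * c_count / n
            score += (observed[(f_value, c_value)] - expected) ** 2 / expected
    return score


def rank_features(columns, labels):
    """Return feature names sorted by descending chi-square score."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: four samples, two discretized genes.
labels = ["tumor", "tumor", "normal", "normal"]
columns = {
    "gene_a": ["high", "high", "low", "low"],  # separates the classes perfectly
    "gene_b": ["high", "low", "high", "low"],  # independent of the class
}
ranking = rank_features(columns, labels)
```

A higher score means a stronger departure from independence between the gene and the class, so `gene_a` ranks ahead of `gene_b` here. Real expression data would first need a discretization step, which this sketch assumes has already been done.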

Feature selection for an automated ancient Tamil script classification system using machine learning techniques

2017 International Conference on Algorithms, Methodology, Models and Applications in Emerging Technologies (ICAMMAET) ◽

10.1109/icammaet.2017.8186731 ◽

2017 ◽

Cited By ~ 2

Author(s):

T S Suganya ◽

S Murugavalli

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Classification System ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Selection For ◽

Tamil Script

Chaotic Harmony Search based Multi-objective Feature Selection for Classification of Gene Expression Profiles

2021 IEEE 9th International Conference on Bioinformatics and Computational Biology (ICBCB) ◽

10.1109/icbcb52223.2021.9459222 ◽

2021 ◽

Author(s):

Aiguo Wang ◽

Huancheng Liu ◽

Guilin Chen

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Expression Profiles ◽

Harmony Search ◽

Gene Expression Profiles ◽

Multi Objective ◽

Selection For

MACHINE LEARNING TECHNIQUES BASED ON FEATURE SELECTION FOR IMPROVING AUTISM DISEASE CLASSIFICATION

International Journal of Intelligent Computing and Information Sciences ◽

10.21608/ijicis.2021.61582.1058 ◽

2021 ◽

Vol 21 (2) ◽

pp. 65-81

Author(s):

Basma Elshoky ◽

Osman Ibrahim ◽

Abdelmgeid Ali

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Disease Classification ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Selection For

FSDroid:- A feature selection technique to detect malware from Android using Machine Learning Techniques

Multimedia Tools and Applications ◽

10.1007/s11042-020-10367-w ◽

2021 ◽

Author(s):

Arvind Mahindru ◽

A.L. Sangal

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Machine Learning Techniques ◽

Feature Selection Technique ◽

Selection Technique ◽

Learning Techniques

Integrative machine learning analysis of multiple gene expression profiles in cervical cancer

PeerJ ◽

10.7717/peerj.5285 ◽

2018 ◽

Vol 6 ◽

pp. e5285 ◽

Cited By ~ 9

Author(s):

Mei Sze Tan ◽

Siow-Wee Chang ◽

Phaik Leng Cheah ◽

Hwa Jen Yap

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Cervical Cancer ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Hpv Infection ◽

Gene Set Enrichment Analysis ◽

Multiple Gene ◽

Cervical Cancers ◽

Learning Analysis

Although most cervical cancer cases are reported to be closely related to Human Papillomavirus (HPV) infection, there is a need to study genes that stand up differentially in the final actualization of cervical cancers following HPV infection. In this study, we proposed an integrative machine learning approach to analyse multiple gene expression profiles in cervical cancer in order to identify a set of genetic markers that are associated with, and may eventually aid in, the diagnosis or prognosis of cervical cancers. The proposed integrative analysis is composed of three steps: (i) gene expression analysis of individual datasets; (ii) meta-analysis of multiple datasets; and (iii) feature selection and machine learning analysis. As a result, 21 gene expressions were identified through the integrative machine learning analysis, which included seven supervised methods and one unsupervised method. A functional analysis with GSEA (Gene Set Enrichment Analysis) was performed on the selected 21-gene expression set and showed significant enrichment in a potential nine-gene expression signature, namely PEG3, SPON1, BTD and RPLP2 (upregulated genes) and PRDX3, COPB2, LSM3, SLC5A3 and AS1B (downregulated genes).

Gene Expression Analysis for Early Lung Cancer Prediction Using Machine Learning Techniques: An Eco-Genomics Approach

IEEE Access ◽

10.1109/access.2018.2886604 ◽

2019 ◽

Vol 7 ◽

pp. 4232-4238 ◽

Cited By ~ 5

Author(s):

Jayadeep Pati

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Lung Cancer ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Machine Learning Techniques ◽

Cancer Prediction ◽

Early Lung Cancer ◽

Learning Techniques

Peer Review #1 of "Integrative machine learning analysis of multiple gene expression profiles in cervical cancer (v0.1)"

10.7287/peerj.5285v0.1/reviews/1 ◽

2018 ◽

Author(s):

K Wong

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Cervical Cancer ◽

Peer Review ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Multiple Gene ◽

Learning Analysis

Evolutionary Machine Learning for Classification with Incomplete Data

10.26686/wgtn.17072123 ◽

2021 ◽

Author(s):

Cao Truong Tran

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Genetic Programming ◽

Incomplete Data ◽

Missing Values ◽

Machine Learning Techniques ◽

Feature Construction ◽

Classification Algorithms ◽

Learning Techniques ◽

Effectiveness And Efficiency

Classification is a major task in machine learning and data mining. Many real-world datasets suffer from the unavoidable issue of missing values. Classification with incomplete data has to be carefully handled because inadequate treatment of missing values will cause large classification errors. Most existing research on classification with incomplete data has focused on improving effectiveness, but has not adequately addressed the efficiency of applying classifiers to unseen instances, which is much more important than the act of creating classifiers. A common approach to classification with incomplete data is to use imputation methods to replace missing values with plausible values before building classifiers and classifying unseen instances. This approach provides complete data that can then be used by any classification algorithm, but sophisticated imputation methods are usually computationally intensive, especially for the application process of classification. Another approach to classification with incomplete data is to build a classifier that can directly work with missing values. This approach does not require time for estimating missing values, but it often generates inaccurate and complex classifiers when faced with numerous missing values. A recent approach to classification with incomplete data, which also avoids estimating missing values, is to build a set of classifiers from which applicable classifiers are then selected for classifying unseen instances. However, this approach is also often inaccurate and takes a long time to find applicable classifiers when faced with numerous missing values. The overall goal of the thesis is to simultaneously improve the effectiveness and efficiency of classification with incomplete data by using evolutionary machine learning techniques for feature selection, clustering, ensemble learning, feature construction and constructing classifiers.
The thesis develops approaches for improving imputation for classification with incomplete data by integrating clustering and feature selection with imputation. The approaches improve both the effectiveness and the efficiency of using imputation for classification with incomplete data. The thesis develops wrapper-based feature selection methods to improve the input space for classification algorithms that are able to work directly with incomplete data. The methods not only improve the classification accuracy, but also reduce the complexity of classifiers able to work directly with incomplete data. The thesis develops a feature construction method to improve the input space for classification algorithms with incomplete data by proposing interval genetic programming (genetic programming with a set of interval functions). The method improves the classification accuracy and reduces the complexity of classifiers. The thesis develops an ensemble approach to classification with incomplete data by integrating imputation, feature selection, and ensemble learning. The results show that the approach is more accurate and faster than previous common methods for classification with incomplete data. The thesis develops interval genetic programming to directly evolve classifiers for incomplete data. The results show that classifiers generated by interval genetic programming can be more effective and efficient than classifiers generated by the combination of imputation and traditional genetic programming. Interval genetic programming is also more effective than common classification algorithms able to work directly with incomplete data. In summary, the thesis develops a range of approaches for simultaneously improving the effectiveness and efficiency of classification with incomplete data by using a range of evolutionary machine learning techniques.
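The imputation-based approach described in this abstract can be sketched in its simplest form, column-mean imputation. This is only an assumed baseline for illustration, not one of the thesis's methods; the function names (`fit_column_means`, `impute`) and the toy data are hypothetical, and missing values are assumed to be encoded as `None`.

```python
def fit_column_means(rows):
    """Compute per-column means over observed (non-None) training values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 for a column with no observed values (an arbitrary choice).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def impute(rows, means):
    """Replace each None entry with the stored mean of its column."""
    return [
        [means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical training data with missing entries.
train = [[1.0, None], [3.0, 4.0], [None, 6.0]]
means = fit_column_means(train)
train_complete = impute(train, means)

# An unseen instance is completed with the training means before classification.
unseen_complete = impute([[None, 2.0]], means)
```

The completed rows can then be passed to any classifier. Storing the training means keeps the application step cheap, which is the efficiency concern the abstract raises about more sophisticated imputation methods.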

Learning and Feature Selection Using the Set Covering Machine with Data-Dependent Rays on Gene Expression Profiles

Artificial Neural Networks in Pattern Recognition - Lecture Notes in Computer Science ◽

10.1007/11829898_26 ◽

2006 ◽

pp. 286-297

Author(s):

Hans A. Kestler ◽

Wolfgang Lindner ◽

André Müller

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Set Covering
