RANDOM FOREST BASED CLASSIFICATION OF MEDICAL X-RAY IMAGES USING A GENETIC ALGORITHM FOR FEATURE SELECTION

2015 ◽  
Vol 15 (02) ◽  
pp. 1540025 ◽  
Author(s):  
IMANE NEDJAR ◽  
MOSTAFA EL HABIB DAHO ◽  
NESMA SETTOUTI ◽  
SAÏD MAHMOUDI ◽  
MOHAMED AMINE CHIKH

Automated classification of medical images is an increasingly important tool for physicians in their daily activities. However, due to its computational complexity, this task is one of the major current challenges in the field of content-based image retrieval (CBIR). In this paper, a medical image classification approach is proposed. The method is composed of two main phases. The first phase is a pre-processing step in which a feature vector based on texture and shape is extracted, followed by feature selection using a Genetic Algorithm (GA). The proposed GA uses a kNN-based classification error as its fitness function, which enables the GA to find a combination of features that gives the best accuracy. In the second phase, X-ray images are classified using a random forest classifier and a supervised multi-class classifier based on the support vector machine (SVM).
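The GA-plus-kNN selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the population size, mutation rate, one-point crossover, truncation selection, the hold-out validation split, and the synthetic data (`make_toy_data`) are all assumptions chosen to keep the example short and dependency-free.

```python
import random

def knn_error(train, valid, mask, k=3):
    """Fraction of validation samples misclassified by k-NN on the masked features."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 1.0  # an empty feature subset is treated as the worst case
    errors = 0
    for xv, yv in valid:
        # k nearest training samples by squared Euclidean distance
        neighbours = sorted(
            train, key=lambda s: sum((s[0][i] - xv[i]) ** 2 for i in idx)
        )[:k]
        votes = {}
        for _, label in neighbours:
            votes[label] = votes.get(label, 0) + 1
        predicted = max(votes, key=votes.get)
        errors += predicted != yv
    return errors / len(valid)

def ga_select(train, valid, n_features, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that minimises the k-NN error."""
    rng = random.Random(seed)

    def fitness(mask):
        return knn_error(train, valid, mask)

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        survivors = population[:pop_size // 2]  # keep the better half unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit with probability mutation_rate
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        population = survivors + children
    best = min(population, key=fitness)
    return best, fitness(best)

def make_toy_data(n, n_features, rng):
    """Hypothetical data: only feature 0 carries class information."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [rng.gauss(0, 1) for _ in range(n_features)]
        x[0] = label * 4.0 + rng.gauss(0, 0.3)
        data.append((x, label))
    return data

rng = random.Random(1)
train = make_toy_data(40, 6, rng)
valid = make_toy_data(20, 6, rng)
best_mask, best_error = ga_select(train, valid, n_features=6)
print(best_mask, best_error)
```

The selected mask would then be used to reduce the feature vectors before they are passed to the final classifier (random forest or SVM in the abstract).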

Author(s):  
Ahmed Abdullah Farid ◽  
Gamal Selim ◽  
Hatem Khater

Breast cancer is a significant health issue across the world. It is the most widely diagnosed cancer in women, and early-stage diagnosis and therapy improve patient safety. This paper proposes a synthetic feature-set model based on an optimized genetic algorithm (CHFS-BOGA) to predict breast cancer. This hybrid feature selection approach combines the advantages of three filter feature selection approaches with an optimized Genetic Algorithm (OGA) to select the best features and so improve the performance and scalability of the classification process. The OGA improves the initial population generation and the genetic operators by using the results of the filter approaches as prior information, and it uses the C4.5 decision tree classifier as the fitness function instead of probability and random selection. The data were taken from the Wisconsin breast cancer dataset in the UCI Machine Learning Repository, with a total of 569 rows and 32 columns. The dataset was evaluated using the Explorer interface of the Weka open-source data mining software. The results show that the proposed hybrid feature selection approach significantly outperforms the single filter approaches and principal component analysis (PCA) for optimum feature selection. These characteristics are good indicators for the return prediction. The highest accuracy achieved before applying CHFS-BOGA, using the support vector machine (SVM) classifier, was 97.3%. The highest accuracy after applying it (CHFS-BOGA-SVM) was 98.25% on a split of 70.0% training data with the remainder for testing, and 100% on the full training set. Moreover, the area under the receiver operating characteristic (ROC) curve was equal to 1.0. The results showed that the proposed CHFS-BOGA-SVM system was able to accurately classify the type of breast tumor, whether malignant or benign.


2017 ◽  
Vol 2017 ◽  
pp. 1-14 ◽  
Author(s):  
Wenbo Pang ◽  
Huiyan Jiang ◽  
Siqi Li

Accurate classification of hepatocellular carcinoma (HCC) images is of great importance in pathology diagnosis and treatment. This paper proposes a concave-convex variation (CCV) method to optimize three classifiers (random forest, support vector machine, and extreme learning machine) for more accurate HCC image classification. First, in the preprocessing stage, hematoxylin-eosin (H&E) pathological images are enhanced using a bilateral filter, and each HCC image patch is obtained under the guidance of pathologists. Then, after extracting the complete features of each patch, a new sparse contribution (SC) feature selection model is established to select the beneficial features for each classifier. Finally, a concave-convex variation method is developed to improve the performance of the classifiers. Experiments using 1260 HCC image patches demonstrate that the proposed CCV classifiers improve greatly on each original classifier and that CCV-random forest (CCV-RF) performs best for HCC image recognition.


2008 ◽  
Vol 20 (06) ◽  
pp. 345-352
Author(s):  
Li-Yeh Chuang ◽  
Cheng-San Yang ◽  
Jung-Chike Li ◽  
Cheng-Hong Yang

Microarray data can provide valuable results for a variety of gene expression profile problems and contribute to advances in clinical medicine. The application of microarray data to cancer-type classification has recently gained in popularity. Microarray data are high-dimensional, with a large number of features (genes), and often fall into the multi-class category. These facts make testing and training of general classification methods difficult. Reducing the number of genes and achieving lower classification error rates are the main issues to be solved. The classification of microarray data samples can be regarded as a feature selection and classifier design problem. The goal of feature selection is to select those subsets of differentially expressed genes that are potentially relevant for distinguishing the sample classes. Classical genetic algorithms (GAs) may suffer from premature convergence and thus lead to poor experimental results. In this paper, a combat genetic algorithm (CGA) is used to implement the feature selection, and a K-nearest neighbor classifier evaluated with leave-one-out cross-validation serves as the CGA fitness function for the classification problem. The proposed method was applied to 10 microarray data sets obtained from the literature. The experimental results show that the proposed method not only effectively reduced the number of gene expression levels but also achieved lower classification error rates.
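The fitness evaluation mentioned in this abstract, a K-nearest neighbor classifier scored by leave-one-out cross-validation, can be sketched as follows. This is an illustrative stand-in only: the function name `loocv_knn_error`, the squared Euclidean distance, the default `k=1`, and the six-sample toy data are assumptions, and the combat genetic algorithm itself is not shown.

```python
from collections import Counter

def loocv_knn_error(X, y, mask, k=1):
    """Leave-one-out error rate of k-NN restricted to the features selected by mask."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 1.0  # no features selected: treat as the worst case
    n = len(X)
    errors = 0
    for i in range(n):
        # distances from the held-out sample i to every other sample
        dists = sorted(
            (sum((X[i][j] - X[t][j]) ** 2 for j in idx), t)
            for t in range(n) if t != i
        )
        votes = Counter(y[t] for _, t in dists[:k])
        if votes.most_common(1)[0][0] != y[i]:
            errors += 1
    return errors / n

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 4.0], [1.0, 0.0], [1.1, 5.1], [1.2, 2.0]]
y = [0, 0, 0, 1, 1, 1]

informative_error = loocv_knn_error(X, y, mask=[1, 0])
noisy_error = loocv_knn_error(X, y, mask=[0, 1])
print(informative_error, noisy_error)
```

A search procedure such as a GA would call this function once per candidate mask and prefer masks with a lower error rate.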


2014 ◽  
Vol 666 ◽  
pp. 267-271 ◽  
Author(s):  
W.K Wong ◽  
Muralindran Mariappan ◽  
Ali Chekima ◽  
Manimehala Nadarajan ◽  
Brendan Khoo

This research is part of a larger research effort to recognise individual weed species for weed scouting and spot weeding. Support Vector Machines are used to classify the presence of a specified weed (Amaranthus palmeri) by analysing the shape of its leaves. Weed leaves are extracted using image dilation and erosion methods. Several shape feature types were proposed, and a total of 59 features were used as the feature pool. Feature selection and fine tuning of the Support Vector Machine are performed using a Genetic Algorithm. The outcome is a generalised classifier that classifies weed leaves with an average classification rate of 90.5%.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hamideh Soltani ◽  
Zahra Einalou ◽  
Mehrdad Dadgostar ◽  
Keivan Maghooli

Brain computer interface (BCI) systems have been regarded as a new way of communication for humans. In this research, common methods such as the wavelet transform are applied to extract features, and a genetic algorithm (GA), as an evolutionary method, is used to select features. Finally, classification was done using two approaches: the support vector machine (SVM) and a Bayesian method. Five features were selected, and the accuracy of Bayesian classification was measured to be 80% with dimension reduction. Ultimately, the classification accuracy reached 90.4% using the SVM classifier. The results of the study indicate better feature selection and effective dimension reduction of these features, as well as higher classification accuracy in comparison with other studies.
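As a rough illustration of wavelet-based feature extraction, the sketch below computes sub-band energies from a Haar wavelet decomposition. The abstract does not say which wavelet or which derived features were used, so the Haar wavelet, the energy features, the number of levels, and the function names are all assumptions.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_energy_features(signal, levels=2):
    """Energy of each detail band, followed by the energy of the final approximation."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features

# Hypothetical 8-sample signal standing in for one signal segment.
signal = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
features = wavelet_energy_features(signal, levels=2)
print(features)
```

Because the transform is orthonormal, the band energies sum to the energy of the input signal, which gives a quick sanity check on the decomposition.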


Author(s):  
VLADIMIR NIKULIN ◽  
TIAN-HSIANG HUANG ◽  
GEOFFREY J. MCLACHLAN

The method presented in this paper is novel as a natural combination of two mutually dependent steps. Feature selection is a key element (first step) in our classification system, which was employed during the 2010 International RSCTC data mining (bioinformatics) Challenge. The second step may be implemented using any suitable classifier such as linear regression, support vector machine or neural networks. We conducted leave-one-out (LOO) experiments with several feature selection techniques and classifiers. Based on the LOO evaluations, we decided to use feature selection with the separation type Wilcoxon-based criterion for all final submissions. The method presented in this paper was tested successfully during the RSCTC data mining Challenge, where we achieved the top score in the Basic track.
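A Wilcoxon-based feature ranking of the kind mentioned above can be sketched with the standard rank-sum (Mann-Whitney U) statistic. The exact "separation type" criterion used by the authors is not given in the abstract, so the score below, |2U/(n1·n0) − 1|, is an assumed stand-in, and the function names and toy data are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def wilcoxon_separation(feature, labels):
    """Score in [0, 1]: 0 means fully overlapping classes, 1 means perfect separation."""
    ranks = average_ranks(feature)
    n1 = sum(1 for label in labels if label == 1)
    n0 = len(labels) - n1
    rank_sum = sum(r for r, label in zip(ranks, labels) if label == 1)
    u1 = rank_sum - n1 * (n1 + 1) / 2  # Mann-Whitney U for class 1
    return abs(2 * u1 / (n1 * n0) - 1)

def rank_features(X, y):
    """Return feature indices sorted from most to least separating, plus the scores."""
    n_features = len(X[0])
    scores = [wilcoxon_separation([row[j] for row in X], y)
              for j in range(n_features)]
    order = sorted(range(n_features), key=lambda j: -scores[j])
    return order, scores

# Hypothetical toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 4.0], [1.0, 0.0], [1.1, 5.1], [1.2, 2.0]]
y = [0, 0, 0, 1, 1, 1]
order, scores = rank_features(X, y)
print(order, scores)
```

The top-ranked features would then be passed to whichever classifier is used in the second step.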

