FEATURE SELECTION USING RANDOM FOREST ALGORITHM TO DIAGNOSE TUBERCULOSIS FROM LUNG CT IMAGES

Tuberculosis is a hazardous infectious disease characterized by the growth of tubercles in the tissues. It mainly affects the lungs but can also affect other parts of the body. The disease can be easily diagnosed by radiologists. The main objective of this chapter is to obtain the best solution, selected by means of modified particle swarm optimization, which is regarded as the optimal feature descriptor. Five stages are used to detect tuberculosis: image pre-processing, lung segmentation, feature extraction, feature selection, and classification. These stages are used in medical image processing to identify tuberculosis. In feature extraction, the gray-level co-occurrence matrix (GLCM) approach is used to extract the features, and the optimal features are selected from the extracted feature sets by random forest. Finally, a support vector machine classifier is used for image classification. Experiments are carried out and intermediate results are obtained. The proposed system achieves better classification accuracy than the existing method.
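
The GLCM feature-extraction step named in the abstract can be illustrated with a minimal sketch. The toy image, the number of gray levels, the horizontal pixel offset, and the function names below are assumptions chosen for illustration; the abstract does not specify which offsets or descriptors the chapter uses.

```python
def glcm(image, levels, dr=0, dc=1):
    """Count co-occurrences of gray levels (i, j) at offset (dr, dc), then normalize."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Three common texture descriptors computed from a normalized GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


# Toy 4x4 image with 4 gray levels (hypothetical data, not from the chapter).
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(toy, levels=4)
features = texture_features(matrix)
```

In a full pipeline, descriptors like these would be computed per segmented lung region and passed on to the feature-selection stage.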

Prediction of Heart Disease Using Random Forest and Rough Set Based Feature Selection

International Journal of Big Data and Analytics in Healthcare ◽

10.4018/ijbdah.2018010101 ◽

2018 ◽

Vol 3 (1) ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Indu Yekkala ◽

Sunanda Dixit

Keyword(s):

Feature Selection ◽

Heart Disease ◽

Random Forest ◽

Causes Of Death ◽

Electronic Records ◽

Complex Nature ◽

Random Forest Algorithm ◽

Multiple Sources ◽

Artery Disease ◽

Single Data

The medical industry generates data that is often of a very complex nature (electronic records, handwritten scripts, etc.), since it comes from multiple sources. The complexity and sheer volume of this data necessitate techniques that can extract insight from it quickly and efficiently. These insights can not only diagnose diseases but also predict and prevent them. One application of these techniques is cardiovascular disease. Heart disease, or coronary artery disease (CAD), is one of the major causes of death all over the world. Comprehensive research using single data mining techniques has not resulted in acceptable accuracy. Further research is being carried out on the effectiveness of hybridizing more than one technique to increase accuracy in the diagnosis of heart disease. In this article, the authors worked on the Heart Statlog dataset collected from the UCI repository and used the Random Forest algorithm with rough-set-based feature selection to accurately predict the occurrence of heart disease.

Software Defect Prediction using Feature Selection and Random Forest Algorithm

2017 International Conference on New Trends in Computing Sciences (ICTCS) ◽

10.1109/ictcs.2017.39 ◽

2017 ◽

Cited By ~ 8

Author(s):

Dyana Rashid Ibrahim ◽

Rawan Ghnemat ◽

Amjad Hudaib

Keyword(s):

Feature Selection ◽

Random Forest ◽

Defect Prediction ◽

Software Defect Prediction ◽

Random Forest Algorithm ◽

Software Defect

Virtual Screening for COX-2 Inhibitors with Random Forest Algorithm and Feature Selection

Proceedings of the International Conference on Bioinformatics Research and Applications 2017 - ICBRA 2017 ◽

10.1145/3175587.3175594 ◽

2017 ◽

Author(s):

Shangjie Ai ◽

Yong Bai ◽

Xiande Liu

Keyword(s):

Feature Selection ◽

Random Forest ◽

Virtual Screening ◽

Random Forest Algorithm ◽

Cox 2 ◽

Cox 2 Inhibitors

Phishing Detection Based on Machine Learning and Feature Selection Methods

International Journal of Interactive Mobile Technologies (iJIM) ◽

10.3991/ijim.v13i12.11411 ◽

2019 ◽

Vol 13 (12) ◽

pp. 171 ◽

Cited By ~ 1

Author(s):

Mohammad Almseidin ◽

AlMaha Abu Zuraiq ◽

Mouhammd Al-kasassbeh ◽

Nidal Alnidami

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Web Pages ◽

Selection Methods ◽

Random Forest Algorithm ◽

Phishing Detection ◽

Enormous Number

With ongoing technological development, the Internet has become ubiquitous and accessible to everyone. There is a considerable number of web pages offering different benefits. Despite this enormous number, not all of these sites are legitimate: so-called phishing sites deceive users in order to serve their own interests. This paper addresses the problem using machine learning algorithms together with a novel phishing-detection dataset that contains 5000 legitimate web pages and 5000 phishing ones. In order to obtain the best results, various machine learning algorithms were tested, and J48, Random Forest, and Multilayer Perceptron were chosen. Different feature selection tools were applied to the dataset in order to improve the efficiency of the models. The best result of the experiment was achieved by using 20 of the 48 features with the Random Forest algorithm. The accuracy was 98.11%.

Computational Method for Identifying Malonylation Sites by Using Random Forest Algorithm

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666181227144318 ◽

2020 ◽

Vol 23 (4) ◽

pp. 304-312

Author(s):

ShaoPeng Wang ◽

JiaRui Li ◽

Xijun Sun ◽

Yu-Hang Zhang ◽

Tao Huang ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Selection Procedure ◽

Machine Learning Algorithms ◽

Computational Method ◽

Feature Subset ◽

Random Forest Algorithm ◽

Post Translational Modification

Background: As a newly uncovered post-translational modification on the ε-amino group of lysine residues, protein malonylation was found to be involved in metabolic pathways and certain diseases. Apart from experimental approaches, several computational methods based on machine learning algorithms were recently proposed to predict malonylation sites. However, previous methods failed to address imbalanced data sizes between positive and negative samples. Objective: In this study, we identified the significant features of malonylation sites with a novel computational method that applied machine learning algorithms and balanced the data sizes by applying the synthetic minority over-sampling technique. Method: Four types of features, namely, amino acid (AA) composition, position-specific scoring matrix (PSSM), AA factor, and disorder, were used to encode residues in protein segments. Then, a two-step feature selection procedure including maximum relevance minimum redundancy and incremental feature selection, together with the random forest algorithm, was performed on the constructed hybrid feature vector. Results: An optimal classifier was built from the optimal feature subset, which featured an F1-measure of 0.356. Feature analysis was performed on several selected important features. Conclusion: Results showed that certain types of PSSM and disorder features may be closely associated with malonylation of lysine residues. Our study contributes to the development of computational approaches for predicting malonyllysine and provides insights into the molecular mechanism of malonylation.
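
The class-balancing step can be sketched as follows. This is a simplified illustration of the synthetic minority over-sampling idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation; the 2-D points, the neighbour count `k`, the seed, and the function name `smote` are all assumptions.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority sample
    and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical 2-D minority class with 4 samples; the majority class has 10.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_size = 10
new_points = smote(minority, n_new=majority_size - len(minority))
balanced_minority = minority + new_points
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies.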

A novel hybrid method for determining the depth of anesthesia level: Combining ReliefF feature selection and random forest algorithm (ReliefF+RF)

2015 International Symposium on Innovations in Intelligent SysTems and Applications (INISTA) ◽

10.1109/inista.2015.7276737 ◽

2015 ◽

Cited By ~ 8

Author(s):

Musa Peker ◽

Ayse Arslan ◽

Baha Sen ◽

Fatih V. Celebi ◽

Abdulkadir But

Keyword(s):

Feature Selection ◽

Random Forest ◽

Hybrid Method ◽

Depth Of Anesthesia ◽

Random Forest Algorithm ◽

Anesthesia Level

A Study of the Classification of Motor Imagery Signals using Machine Learning Tools

10.5121/csit.2021.112104 ◽

2021 ◽

Author(s):

Anam Hashmi ◽

Bilal Alam Khan ◽

Omar Farooq

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Wavelet Transform ◽

Random Forest ◽

Random Forest Algorithm ◽

Eeg Signals ◽

Relaxation State ◽

Wavelet Transform Analysis ◽

Imagined Movement

In this paper, we propose a system for classifying electroencephalography (EEG) signals associated with imagined movement of the right hand and with the relaxation state, using a machine learning algorithm, namely the Random Forest algorithm. The EEG dataset used in this research was created by the University of Tubingen, Germany. EEG signals associated with the imagined movement of the right hand and with the relaxation state were processed using wavelet transform analysis with the Daubechies orthogonal wavelet as the mother wavelet. After the wavelet transform analysis, eight features were extracted. Subsequently, a feature selection method based on the Random Forest algorithm was employed to identify the best of the eight proposed features. The feature selection stage was followed by a classification stage in which eight different models, combining the features according to their importance, were constructed. The optimum classification performance of 85.41% was achieved with the Random Forest classifier. This research shows that this classification system can be used in a brain-computer interface (BCI) system to mentally control a robotic device or an exoskeleton.
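
One plausible reading of "eight different models combining the features according to their importance" is a set of nested feature subsets (top-1, top-2, ..., top-8). The sketch below illustrates that reading only; the abstract does not name the eight features or state how they were combined, so the feature names, the importance scores, and the function name are hypothetical.

```python
def nested_subsets(importances):
    """Rank features by importance (highest first) and return the nested
    subsets top-1, top-2, ..., top-n, one candidate model per subset."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return [ranked[: k + 1] for k in range(len(ranked))]


# Hypothetical importance scores for eight wavelet-derived features.
importances = {
    "mean": 0.05, "std": 0.21, "energy": 0.18, "entropy": 0.12,
    "skewness": 0.07, "kurtosis": 0.09, "max": 0.15, "min": 0.13,
}
subsets = nested_subsets(importances)
```

Each subset would then be used to train and evaluate one classifier, and the subset with the best validation accuracy would be kept.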

Modify Random Forest Algorithm Using Hybrid Feature Selection Method

International Journal on Perceptive and Cognitive Computing ◽

10.31436/ijpcc.v4i2.59 ◽

2018 ◽

Vol 4 (2) ◽

pp. 1-6

Author(s):

Ahmed T. Sadiq ◽

Karrar Shareef Musawi

Keyword(s):

Feature Selection ◽

Random Forest ◽

Gini Index ◽

Feature Selection Method ◽

Selection Method ◽

Random Selection ◽

Experimental Results ◽

Random Forest Algorithm ◽

Selection For

Random Forest (RF) is one of the most powerful decision-tree-based methods in machine learning. The proposed hybrid feature selection for Random Forest depends on two measures, Information Gain and Gini Index, combined in varying percentages based on a weight. In this paper, we propose a modified Random Forest algorithm, named Random Forest algorithm using hybrid feature selection, that uses hybrid feature selection instead of a single feature selection measure. The main idea is to compute the Information Gain for all randomly selected features and then search for the split point in the node that gives the best value for a hybrid equation with the Gini Index. The experimental results on the dataset showed that the proposed modification is better than the classic Random Forest: compared to the standard static Random Forest, the hybrid feature selection Random Forest shows a significant improvement in accuracy.
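
The abstract does not give the hybrid equation, so the sketch below assumes one simple form: a weighted sum of information gain and Gini decrease, `w * IG + (1 - w) * GiniGain`. The weight `w`, the toy data, and all function names are hypothetical and serve only to show how a split point could be scored with two measures at once.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_drop(impurity, parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    children = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - children


def hybrid_score(parent, left, right, w=0.5):
    """Assumed hybrid: w * information gain + (1 - w) * Gini decrease."""
    info_gain = impurity_drop(entropy, parent, left, right)
    gini_gain = impurity_drop(gini, parent, left, right)
    return w * info_gain + (1 - w) * gini_gain


def best_split(values, labels, w=0.5):
    """Try each midpoint between sorted distinct values; keep the best hybrid score."""
    pairs = sorted(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, -1.0)
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = hybrid_score(labels, left, right, w)
        if score > best[1]:
            best = (t, score)
    return best


# Hypothetical single feature with binary labels.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 0, 1, 1, 1]
threshold, score = best_split(values, labels)
```

Setting `w=1.0` reduces the score to pure information gain and `w=0.0` to pure Gini decrease, which makes it easy to compare the hybrid against either single measure.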
