A Study of the Classification of Motor Imagery Signals using Machine Learning Tools

Mapping Intimacies ◽

10.5121/csit.2021.112104 ◽

2021 ◽

Author(s):

Anam Hashmi ◽

Bilal Alam Khan ◽

Omar Farooq

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Wavelet Transform ◽

Random Forest ◽

Random Forest Algorithm ◽

Eeg Signals ◽

Relaxation State ◽

Wavelet Transform Analysis ◽

Imagined Movement

In this paper, we propose a system for classifying electroencephalography (EEG) signals associated with imagined movement of the right hand and with the relaxation state, using a machine learning algorithm, namely the Random Forest algorithm. The EEG dataset used in this research was created by the University of Tübingen, Germany. The EEG signals were processed using wavelet transform analysis with a Daubechies orthogonal wavelet as the mother wavelet. After the wavelet transform analysis, eight features were extracted. Subsequently, a feature selection method based on the Random Forest algorithm was employed to identify the best of the eight proposed features. The feature selection stage was followed by a classification stage in which eight different models, combining the features according to their importance, were constructed. The best classification performance, 85.41%, was achieved with the Random Forest classifier. This research shows that this system for classifying motor movements can be used in a brain-computer interface (BCI) system to mentally control a robotic device or an exoskeleton.
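The wavelet step described in this abstract can be illustrated with a minimal sketch. It assumes the Haar wavelet (the simplest member of the Daubechies family, db1), a single decomposition level, and three hypothetical per-subband features; the abstract does not state the wavelet order, the number of levels, or which eight features were used, and the toy signal is not real EEG data.

```python
import math


def haar_dwt(signal):
    """One level of the Haar (Daubechies-1) discrete wavelet transform."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise scaled sums give the approximation (low-pass) coefficients,
    # pairwise scaled differences give the detail (high-pass) coefficients.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def subband_features(coeffs):
    """Hypothetical features: mean absolute value, energy, standard deviation."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    return {
        "mav": sum(abs(c) for c in coeffs) / n,
        "energy": sum(c * c for c in coeffs),
        "std": math.sqrt(sum((c - mean) ** 2 for c in coeffs) / n),
    }


# Toy signal standing in for one EEG epoch (assumption, not real data).
epoch = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
approx, detail = haar_dwt(epoch)
features = {
    "approx": subband_features(approx),
    "detail": subband_features(detail),
}
```

Because the Haar transform is orthogonal, the energies of the two subbands add up to the energy of the input, which is a quick sanity check on any implementation.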


Classification of Diabetes using Random Forest with Feature Selection Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l3595.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 1295-1300 ◽

Cited By ~ 1

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Electronic Health Records ◽

Learning Algorithm ◽

Machine Learning Algorithm ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Health Records

Diabetes has become a serious problem nowadays, so serious precautions are needed to eradicate it. To do so, we must know its level of occurrence. In this project, we predict the level of occurrence of diabetes using Random Forest, a machine learning algorithm. Using patients' Electronic Health Records (EHR), we can build accurate models that predict the presence of diabetes.


Phishing Detection Based on Machine Learning and Feature Selection Methods

International Journal of Interactive Mobile Technologies (iJIM) ◽

10.3991/ijim.v13i12.11411 ◽

2019 ◽

Vol 13 (12) ◽

pp. 171 ◽

Cited By ~ 1

Author(s):

Mohammad Almseidin ◽

AlMaha Abu Zuraiq ◽

Mouhammd Al-kasassbeh ◽

Nidal Alnidami

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Web Pages ◽

Selection Methods ◽

Random Forest Algorithm ◽

Phishing Detection ◽

Enormous Number

With increasing technological development, the Internet has become ubiquitous and accessible to everyone. There are a considerable number of web pages with different benefits. Despite this enormous number, not all of these sites are legitimate. There are so-called phishing sites that deceive users in order to serve their own interests. This paper addresses this problem using machine learning algorithms and a novel phishing-detection dataset that contains 5000 legitimate web pages and 5000 phishing ones. In order to obtain the best results, various machine learning algorithms were tested, and J48, Random Forest, and Multilayer Perceptron were then chosen. Different feature selection tools were applied to the dataset in order to improve the efficiency of the models. The best result of the experiment was achieved by using 20 of the 48 features with the Random Forest algorithm; the accuracy was 98.11%.


Computational Method for Identifying Malonylation Sites by Using Random Forest Algorithm

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666181227144318 ◽

2020 ◽

Vol 23 (4) ◽

pp. 304-312

Author(s):

ShaoPeng Wang ◽

JiaRui Li ◽

Xijun Sun ◽

Yu-Hang Zhang ◽

Tao Huang ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Learning Algorithms ◽

Selection Procedure ◽

Machine Learning Algorithms ◽

Computational Method ◽

Feature Subset ◽

Random Forest Algorithm ◽

Post Translational Modification

Background: As a newly uncovered post-translational modification on the ε-amino group of lysine residues, protein malonylation was found to be involved in metabolic pathways and certain diseases. Apart from experimental approaches, several computational methods based on machine learning algorithms were recently proposed to predict malonylation sites. However, previous methods failed to address imbalanced data sizes between positive and negative samples. Objective: In this study, we identified the significant features of malonylation sites in a novel computational method which applied machine learning algorithms and balanced data sizes by applying the synthetic minority over-sampling technique. Method: Four types of features, namely, amino acid (AA) composition, position-specific scoring matrix (PSSM), AA factor, and disorder were used to encode residues in protein segments. Then, a two-step feature selection procedure including maximum relevance minimum redundancy and incremental feature selection, together with the random forest algorithm, was performed on the constructed hybrid feature vector. Results: An optimal classifier was built from the optimal feature subset, which featured an F1-measure of 0.356. Feature analysis was performed on several selected important features. Conclusion: Results showed that certain types of PSSM and disorder features may be closely associated with malonylation of lysine residues. Our study contributes to the development of computational approaches for predicting malonyllysine and provides insights into the molecular mechanism of malonylation.
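The synthetic minority over-sampling technique mentioned in this abstract creates new minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below shows only that core idea; the function name, the parameters `k` and `seed`, and the toy points are assumptions for illustration and are not taken from the paper.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy 2-D minority class (assumption, not data from the paper).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6)
balanced_minority = minority + new_points
```

Each synthetic point lies on a line segment between two real minority samples, so the over-sampled class stays inside the region the original samples span.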


Classification of iron oxide aerosols by a single particle soot photometer using supervised machine learning

Atmospheric Measurement Techniques ◽

10.5194/amt-12-3885-2019 ◽

2019 ◽

Vol 12 (7) ◽

pp. 3885-3906 ◽

Cited By ~ 2

Author(s):

Kara D. Lamb

Keyword(s):

Machine Learning ◽

Random Forest ◽

Test Data ◽

Single Particle ◽

Broad Band ◽

Supervised Machine Learning ◽

Data Sets ◽

Specific Class ◽

Random Forest Algorithm

Abstract. Single particle soot photometers (SP2) use laser-induced incandescence to detect aerosols on a single particle basis. SP2s that have been modified to provide greater spectral contrast between their narrow and broad-band incandescent detectors have previously been used to characterize both refractory black carbon (rBC) and light-absorbing metallic aerosols, including iron oxides (FeOx). However, single particles cannot be unambiguously identified from their incandescent peak height (a function of particle mass) and color ratio (a measure of blackbody temperature) alone. Machine learning offers a promising approach for improving the classification of these aerosols. Here we explore the advantages and limitations of classifying single particle signals obtained with a modified SP2 using a supervised machine learning algorithm. Laboratory samples of different aerosols that incandesce in the SP2 (fullerene soot, mineral dust, volcanic ash, coal fly ash, Fe2O3, and Fe3O4) were used to train a random forest algorithm. The trained algorithm was then applied to test data sets of laboratory samples and atmospheric aerosols. This method provides a systematic approach for classifying incandescent aerosols by providing a score, or conditional probability, that a particle is likely to belong to a particular aerosol class (rBC, FeOx, etc.) given its observed single particle features. We consider two alternative approaches for identifying aerosols in mixed populations based on their single particle SP2 response: one with specific class labels for each species sampled, and one with three broader classes (rBC, anthropogenic FeOx, and dust-like) for particles with similar SP2 responses. 
Predictions of the most likely particle class (the one with the highest mean probability) based on applying the trained random forest algorithm to the single particle features for test data sets comprising examples of each class are compared with the true class for those particles to estimate generalization performance. While the specific class approach performed well for rBC and Fe3O4 (≥99 % of these aerosols are correctly identified), its classification of other aerosol types is significantly worse (only 47 %–66 % of other particles are correctly identified). Using the broader class approach, we find a classification accuracy of 99 % for FeOx samples measured in the laboratory. The method allows for classification of FeOx as anthropogenic or dust-like for aerosols with effective spherical diameters from 170 to >1200 nm. The misidentification of both dust-like aerosols and rBC as anthropogenic FeOx is small, with <3 % of the dust-like aerosols and <0.1 % of rBC misidentified as FeOx for the broader class case. When applying this method to atmospheric observations taken in Boulder, CO, a clear mode consistent with FeOx was observed, distinct from dust-like aerosols.
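The prediction rule described above (choose the class with the highest mean probability) can be sketched as follows. The per-tree probabilities are invented for illustration, and the helper name is an assumption; the abstract does not give the forest size or any per-tree outputs.

```python
def mean_class_probabilities(tree_votes):
    """Average per-tree class probability estimates into one score per class."""
    classes = tree_votes[0].keys()
    n_trees = len(tree_votes)
    return {c: sum(t[c] for t in tree_votes) / n_trees for c in classes}


# Hypothetical outputs of three trees for one particle (illustrative numbers only).
tree_votes = [
    {"rBC": 0.7, "FeOx": 0.2, "dust-like": 0.1},
    {"rBC": 0.6, "FeOx": 0.3, "dust-like": 0.1},
    {"rBC": 0.2, "FeOx": 0.5, "dust-like": 0.3},
]
scores = mean_class_probabilities(tree_votes)
# The predicted class is the one with the highest mean probability.
predicted = max(scores, key=scores.get)
```

Keeping the full `scores` dictionary, rather than only the winning label, is what allows the score to be read as a conditional probability for each class.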


Wavelet Based Machine Learning Technique to Classify the Different Shoulder Movement of Upper Limb Amputee

Journal of Biomimetics Biomaterials and Biomedical Engineering ◽

10.4028/www.scientific.net/jbbbe.31.32 ◽

2017 ◽

Vol 31 ◽

pp. 32-43 ◽

Cited By ~ 2

Author(s):

Amanpreet Kaur ◽

Amod Kumar ◽

Ravinder Agarwal

Keyword(s):

Machine Learning ◽

Wavelet Transform ◽

Random Forest ◽

Upper Limb ◽

Learning Algorithm ◽

Discrete Wavelet ◽

Semg Signal ◽

Shoulder Movement ◽

Fundamental Information

The wavelet transform is an accurate and efficient method for improving the quality of the myoelectric signal. Classification of upper-limb signals using the surface electromyogram (SEMG) has been the subject of extensive research, and a number of methods and algorithms have been described by researchers to classify biomedical signals. The main aim of this paper is to extract the coefficient values from the given SEMG data using the Discrete Wavelet Transform (DWT). Afterward, the random forest machine learning algorithm was used to identify the different shoulder movements of an upper limb amputee. The combination of wavelet coefficients and random forest exhibited the best performance, with 99.2% accuracy for the classification of different shoulder motions. It was found that the different motions can be identified accurately and provide the fundamental information needed to develop an efficient prosthetic device.


ANALYSIS OF SINGLE AND ENSEMBLE MACHINE LEARNING CLASSIFIERS FOR PHISHING ATTACKS DETECTION

International Journal of Computer Systems & Software Engineering ◽

10.15282/ijsecs.7.2.2021.5.0088 ◽

2021 ◽

Vol 7 (2) ◽

pp. 44-49

Author(s):

Oyelakin A. M ◽

Alimi O. M ◽

Mustapha I. O ◽

Ajiboye I. K

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Decision Trees ◽

Random Forest Algorithm ◽

Ensemble Techniques ◽

Learning Classifiers ◽

Phishing Attacks ◽

Ensemble Machine Learning

Phishing attacks have been used in different ways to harvest the confidential information of unsuspecting internet users. To stem the tide of phishing-based attacks, several machine learning techniques have been proposed in the past. However, few studies have investigated single and ensemble machine learning-based models for the classification of phishing attacks. This study carried out a performance analysis of selected single and ensemble machine learning (ML) classifiers in phishing classification. The focus is to investigate how these algorithms behave in the classification of phishing attacks in the chosen dataset. Logistic Regression and Decision Trees were chosen as single learning classifiers, while simple voting techniques and Random Forest were used as the ensemble machine learning algorithms. Accuracy, Precision, Recall and F1-score were used as performance metrics. The Logistic Regression algorithm recorded 0.86 as accuracy, 0.89 as precision, 0.87 as recall and 0.81 as F1-score. Similarly, the Decision Trees classifier achieved an accuracy of 0.87, 0.83 for precision, 0.88 for recall and 0.81 for F1-score. In the voting ensemble, an accuracy of 0.92 was achieved, with 0.90 for precision, 0.92 for recall and 0.92 for F1-score. The Random Forest algorithm recorded 0.98, 0.97, 0.98 and 0.97 as accuracy, precision, recall and F1-score respectively. In the experimental analyses, the Random Forest algorithm outperformed the simple averaging classifier and the two single algorithms used for phishing URL detection. The study established that the ensemble techniques used in the experiments are more efficient for phishing URL identification than the single classifiers.
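The four metrics named in this abstract have standard definitions in terms of confusion-matrix counts for a binary problem. The sketch below uses made-up counts (they do not reproduce any result from the study) and assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts.

    Assumes at least one predicted positive and one actual positive,
    so that no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted phishing that is phishing
    recall = tp / (tp + fn)      # share of actual phishing that was detected
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only (assumption, not data from the paper).
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
```

Since F1 is the harmonic mean of precision and recall, it always lies between the two for a single binary problem.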


Comparations of Supervised Machine Learning Techniques in Predicting the Classification of the Household’s Welfare Status

Journal Pekommas ◽

10.30818/jpkm.2019.2040105 ◽

2019 ◽

Vol 4 (1) ◽

pp. 43

Author(s):

Nfn Nofriani

Keyword(s):

Machine Learning ◽

Random Forest ◽

Social Assistance ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Random Forest Algorithm ◽

K Nearest Neighbor ◽

Learning Techniques

Poverty has been a major problem for most countries around the world, including Indonesia. One approach to eradicating poverty is the equitable distribution of social assistance to target households based on the Integrated Database of social assistance. This study compared several well-known supervised machine learning techniques, namely the Naïve Bayes Classifier, Support Vector Machines, K-Nearest Neighbor Classification, the C4.5 Algorithm, and the Random Forest Algorithm, for predicting household welfare status classification, using the Integrated Database as a case study. The main objective of this study was to choose the best supervised machine learning approach for predicting the classification of a household’s welfare status based on attributes in the Integrated Database. The results showed that the Random Forest Algorithm was the best.


Random Forest Feature Selection, Fusion and Ensemble Strategy: Combining Multiple Morphological MRI Measures to Discriminate among healthy elderly, MCI, cMCI and Alzheimer's disease patients: from the Alzheimer’s disease neuroimaging initiative (ADNI) database

10.1101/236141 ◽

2017 ◽

Author(s):

S.I. Dimitriadis ◽

D. Liparas ◽

Magda N. Tsolaki

Keyword(s):

Machine Learning ◽

Alzheimer's Disease ◽

Feature Selection ◽

Random Forest ◽

Validation Dataset ◽

Learning Scheme ◽

Mri Features ◽

Ensemble Strategy

Background: In the era of computer-assisted diagnostic tools for various brain diseases, Alzheimer’s disease (AD) covers a large percentage of neuroimaging research, with the main scope being its use in daily practice. However, there has been no study attempting to simultaneously discriminate among Healthy Controls (HC), early mild cognitive impairment (MCI), late MCI (cMCI) and stable AD, using features derived from a single modality, namely MRI. New Method: Based on preprocessed MRI images from the organizers of a neuroimaging challenge, we attempted to quantify the prediction accuracy of multiple morphological MRI features to simultaneously discriminate among HC, MCI, cMCI and AD. We explored the efficacy of a novel scheme that includes multiple feature selections via Random Forest from subsets of the whole set of features (e.g. whole set, left/right hemisphere etc.), Random Forest classification using a fusion approach and ensemble classification via majority voting. From the ADNI database, 60 HC, 60 MCI, 60 cMCI and 60 AD were used as a training set with known labels. An extra dataset of 160 subjects (HC: 40, MCI: 40, cMCI: 40 and AD: 40) was used as an external blind validation dataset to evaluate the proposed machine learning scheme. Results: In the second blind dataset, we succeeded in a four-class classification of 61.9% by combining MRI-based features with a Random Forest-based Ensemble Strategy. We achieved the best classification accuracy of all teams that participated in this neuroimaging competition. Comparison with Existing Method(s): The results demonstrate the effectiveness of the proposed scheme to simultaneously discriminate among four groups using morphological MRI features for the very first time in the literature. Conclusions: Hence, the proposed machine learning scheme can be used to define single and multi-modal biomarkers for AD. Highlights: 1st place in International Challenge for Automated Prediction of MCI from MRI Data. Multi-class classification of normal control, MCI, converting MCI, and Alzheimer’s disease. Morphometric measures from 3D T1 brain MRI images have been analysed (ADNI1 cohort). A Random Forest Feature Selection, Fusion and Ensemble Strategy was applied to classification and prediction of AD. Accuracy and robustness have been assessed in a blind dataset.
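Ensemble classification via majority voting, as named in this abstract, can be sketched in a few lines. The three feature subsets and the per-subject labels below are hypothetical, and the tie-breaking rule (first label seen wins) is an assumption; the abstract does not describe how ties were resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical predictions for four subjects from three classifiers,
# each trained on a different feature subset (illustrative labels only).
per_model = {
    "whole": ["HC", "MCI", "AD", "cMCI"],
    "left": ["HC", "cMCI", "AD", "cMCI"],
    "right": ["MCI", "MCI", "AD", "AD"],
}
# zip(*...) regroups the predictions so that each tuple holds one subject's votes.
ensemble = [majority_vote(votes) for votes in zip(*per_model.values())]
```

With an odd number of voters and more than two classes, three-way ties are still possible, so a documented tie-breaking rule matters in a four-class setting like this one.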


MetalExplorer, a Bioinformatics Tool for the Improved Prediction of Eight Types of Metal-Binding Sites Using a Random Forest Algorithm with Two- Step Feature Selection

Current Bioinformatics ◽

10.2174/2468422806666160618091522 ◽

2017 ◽

Vol 12 (6) ◽

Cited By ~ 6

Author(s):

Jiangning Song ◽

Chen Li ◽

Cheng Zheng ◽

Jerico Revote ◽

Ziding Zhang ◽

...

Keyword(s):

Feature Selection ◽

Random Forest ◽

Metal Binding ◽

Binding Sites ◽

Random Forest Algorithm ◽

Bioinformatics Tool ◽

Metal Binding Sites


Classification of Brainwaves for Sleep Stages by High-Dimensional FFT Features from EEG Signals

Applied Sciences ◽

10.3390/app10051797 ◽

2020 ◽

Vol 10 (5) ◽

pp. 1797 ◽

Cited By ~ 2

Author(s):

Mera Kartika Delimayanti ◽

Bedy Purnama ◽

Ngoc Giang Nguyen ◽

Mohammad Reza Faisal ◽

Kunti Robiatul Mahmudah ◽

...

Keyword(s):

Machine Learning ◽

Sleep Stage ◽

Machine Learning Algorithms ◽

High Dimensional ◽

Sleep Stages ◽

Eeg Signals ◽

Stage Classification ◽

Sleep Stage Classification ◽

Low Dimensional

Manual classification of sleep stages is a time-consuming but necessary step in the diagnosis and treatment of sleep disorders, and its automation has been an area of active study. Previous works have applied low-dimensional fast Fourier transform (FFT) features with many machine learning algorithms. In this paper, we demonstrate the use of features extracted from EEG signals via FFT to improve the performance of automated sleep stage classification through machine learning methods. Unlike previous works using FFT, we incorporated thousands of FFT features in order to classify the sleep stages into 2–6 classes. Using the expanded version of the Sleep-EDF dataset with 61 recordings, our method outperformed other state-of-the-art methods. This result indicates that high-dimensional FFT features, in combination with a simple feature selection, are effective for improving automated sleep stage classification.
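Frequency-domain features of the kind described here are the magnitudes of the discrete Fourier transform bins of each signal segment. The sketch below uses a naive O(n²) DFT from the standard library in place of an FFT (an FFT yields the same values, only faster); the sampling rate, segment length and 10 Hz test tone are assumptions, not parameters from the paper.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


fs = 64  # assumed sampling rate in Hz
n = 64   # assumed segment length in samples (one second)
# Synthetic 10 Hz tone standing in for an EEG segment (not real data).
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(n)]

features = dft_magnitudes(signal)  # one feature per frequency bin
peak_bin = max(range(len(features)), key=features.__getitem__)
peak_hz = peak_bin * fs / n        # bin index converted to frequency
```

Longer segments give more bins per segment, which is one way a feature vector of this kind can grow to thousands of dimensions.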
