ensemble classifiers
Recently Published Documents


TOTAL DOCUMENTS: 460 (FIVE YEARS: 181)
H-INDEX: 25 (FIVE YEARS: 6)

Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 644
Author(s):  
Hanqing Wang ◽  
Xiaoyuan Wang ◽  
Junyan Han ◽  
Hui Xiang ◽  
Hao Li ◽  
...  

Aggressive driving behavior (ADB) is one of the main causes of traffic accidents. Accurate recognition of ADB is a prerequisite for timely and effective warning of, or intervention with, the driver. Previous data-driven ADB recognition methods suffer from disadvantages such as a high miss rate and low accuracy, which are caused by improper processing of datasets with imbalanced class distributions and by reliance on a single classifier. To address these disadvantages, this paper proposes an ensemble learning-based method for recognizing ADB. First, the majority class in the dataset is grouped using a self-organizing map (SOM), and the groups are then combined with the minority class to construct multiple class-balanced datasets. Second, three deep learning methods, namely convolutional neural networks (CNN), long short-term memory (LSTM), and gated recurrent units (GRU), are employed to build the base classifiers for the class-balanced datasets. Finally, ensemble classifiers are formed by combining the base classifiers according to 10 different rules, and are then trained and verified using a multi-source naturalistic driving dataset acquired by an integrated experiment vehicle. The results suggest that, for ADB recognition, the proposed ensemble learning method achieves better accuracy, recall, and F1-score than the aforementioned typical deep learning methods. Among the ensemble classifiers, the one based on the LSTM and the Product Rule performs best, and the one based on the LSTM and the Sum Rule performs second best.
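The Sum Rule and Product Rule named in this abstract are standard ways of fusing the class probabilities produced by several base classifiers. The sketch below illustrates only those two rules on made-up numbers; the function name `combine`, the two-class setup (0 = normal, 1 = aggressive), and the probability values are assumptions for illustration and are not taken from the paper.

```python
from math import prod


def combine(probabilities, rule="sum"):
    """Fuse per-classifier class probabilities into one decision.

    probabilities: one row per base classifier, one column per class.
    Returns (index of the winning class, list of fused scores).
    """
    n_classes = len(probabilities[0])
    # Regroup so that each entry holds all classifiers' scores for one class.
    per_class = [[row[c] for row in probabilities] for c in range(n_classes)]
    if rule == "sum":
        scores = [sum(values) for values in per_class]
    elif rule == "product":
        scores = [prod(values) for values in per_class]
    else:
        raise ValueError(f"unknown rule: {rule}")
    winner = max(range(n_classes), key=scores.__getitem__)
    return winner, scores


# Hypothetical outputs of three base classifiers for one driving sample.
# Two are fairly sure of class 0; one is almost certain of class 1.
sample = [
    [0.90, 0.10],
    [0.90, 0.10],
    [0.01, 0.99],
]

print(combine(sample, "sum")[0])      # sum rule picks class 0
print(combine(sample, "product")[0])  # product rule picks class 1
```

The two rules can disagree, as they do here: a single near-zero probability acts almost like a veto under the Product Rule, while the Sum Rule averages it out.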


2022 ◽  
Vol 2022 ◽  
pp. 1-10
Author(s):  
Abdullah Al-Hashedi ◽  
Belal Al-Fuhaidi ◽  
Abdulqader M. Mohsen ◽  
Yousef Ali ◽  
Hasan Ali Gamal Al-Kaf ◽  
...  

Sentiment analysis has recently become increasingly important with the massive increase in online content. It is associated with the analysis of textual data generated by social media, which can be easily accessed, obtained, and analyzed. With the emergence of COVID-19, most published studies related to COVID-19 conspiracy theories were surveys of people's sentiments and opinions that studied the impact of the pandemic on their lives. Only a few studies applied a machine learning approach to sentiment analysis of social media. These studies focused mainly on English-language tweets and paid little attention to other languages such as Arabic. This study proposes a machine learning model to analyze Arabic tweets from Twitter. In this model, we apply Word2Vec word embeddings, which form the main source of features. Two pretrained continuous bag-of-words (CBOW) models are investigated, and Naïve Bayes is used as a baseline classifier. Several single and ensemble machine learning classifiers are evaluated with and without SMOTE (synthetic minority oversampling technique). The experimental results show that applying word embedding with an ensemble and SMOTE achieved a good improvement in average F1 score compared to the baseline classifier and to the other classifiers (single and ensemble) without SMOTE.
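The core idea of SMOTE is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a minimal standard-library sketch of that idea, not the authors' pipeline and not a library API; the function name `smote_like_oversample`, its parameters, and the toy two-dimensional points are assumptions for illustration.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class vectors.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the other minority samples by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class with three 2-D feature vectors (made-up values).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
for point in smote_like_oversample(minority, n_new=3):
    print(point)
```

In a text-classification setting such as the one described, the vectors would be embedding-based features rather than toy coordinates, and oversampling would normally be applied to the training split only.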


2022 ◽  
Vol 2161 (1) ◽  
pp. 012003
Author(s):  
Rajat Jain ◽  
Pranam R Betrabet ◽  
B Ashwath Rao ◽  
N V Subba Reddy

Arrhythmia is a life-threatening heart disease that is diagnosed and analyzed using electrocardiogram (ECG) recordings and other symptoms such as rapid heartbeat or chest pounding, shortness of breath, near-fainting spells, and insufficient pumping of blood from the heart, along with sudden cardiac arrest. Arrhythmia produces a hasty and aberrant ECG. In this implementation, the arrhythmia dataset is collected from the UCI Machine Learning Repository, and the records are classified into sixteen stated classes using multiclass classification. The large feature set of the dataset is reduced using improved dimensionality-reduction techniques such as t-distributed Stochastic Neighbor Embedding (t-SNE), Principal Component Analysis (PCA), and Uniform Manifold Approximation and Projection (UMAP), and an ensemble classifier is then built to analyze classification accuracy on the arrhythmia dataset and to conclude when and which approach gives optimal results.


2022 ◽  
Vol 2161 (1) ◽  
pp. 012007
Author(s):  
G Ashwin Shanbhag ◽  
K Anurag Prabhu ◽  
N V Subba Reddy ◽  
B Ashwath Rao

Carcinoma detection from CT scan images is essential for numerous diagnostic and therapeutic applications. Because of the large amount of information in CT scan images and their blurred boundaries, tumor segmentation and classification are extremely laborious. The aim is to categorize carcinoma as benign or malignant. In MR images, the amount of data is too large to interpret and evaluate manually. Over the past few years, carcinoma detection in CT has become a growing research area in the field of medical imaging systems. Correct detection of the size and location of lung cancer plays a vital role in the diagnosis of carcinoma. In this paper, we introduce a novel carcinoma detection methodology that helps predict carcinoma from CT scan images. The methodology has four stages: pre-processing the image data, segmentation, feature extraction, and classification into benign and malignant. This work uses several different models to detect carcinoma in a CT scan by constructing an ensemble classifier. The techniques proposed in the paper achieved an accuracy of 85% using the ensemble classifier, which shows that the model is capable of predicting malignant cases correctly. The ensemble classifier consists of five machine learning models: SVM, LR, MLP, decision tree, and KNN. Key metrics such as accuracy, recall, and precision are calculated to evaluate the results of the classifier.
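The abstract does not say how the five models' outputs are combined, so the sketch below assumes simple majority (hard) voting, one common choice, and then computes the accuracy, precision, and recall it mentions. All labels and votes are made-up illustration data, and the function names are hypothetical.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label predicted by the most models for one sample."""
    return Counter(votes).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive):
    """Return (accuracy, precision, recall) with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# One row per CT sample; columns are hypothetical predictions from
# SVM, LR, MLP, decision tree, and KNN, in that order.
votes_per_sample = [
    ["malignant", "malignant", "benign", "malignant", "benign"],
    ["benign", "benign", "benign", "malignant", "benign"],
    ["benign", "benign", "malignant", "benign", "malignant"],
    ["malignant", "malignant", "malignant", "benign", "benign"],
]
y_true = ["malignant", "benign", "malignant", "benign"]

y_pred = [majority_vote(votes) for votes in votes_per_sample]
print(y_pred)
print(binary_metrics(y_true, y_pred, positive="malignant"))
```

With an odd number of voters and two classes, a tie cannot occur, which is one practical reason to use five base models for a benign-versus-malignant decision.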


2021 ◽  
Vol 5 (4) ◽  
pp. 5-9
Author(s):  
Svitlana Gavrylenko ◽  
Oleksii Hornostal

The subject of the research is methods and means of identifying the state of a computer system. The purpose of the article is to improve the quality of computer system state identification by developing a method based on ensemble classifiers. Task: to investigate methods for constructing bagging classifiers based on decision trees, to configure them, and to develop a method for identifying the state of the computer system. Methods used: artificial intelligence methods, machine learning, ensemble methods. The following results were obtained: bagging classifiers based on the Pasting Ensemble, Bootstrap Ensemble, Random Subspace Ensemble, Random Patches Ensemble, and Random Forest meta-algorithms were investigated, and their accuracy in identifying the state of the computer system was assessed. The tuning parameters of individual decision trees were studied and their optimal values were found, including the maximum number of features used in the construction of the tree, the minimum number of branches when building a tree, the minimum number of leaves, and the maximum tree depth. The optimal number of trees in the ensemble has been determined. A method for identifying the state of the computer system is proposed, which differs from known methods in the choice of the classification meta-algorithm and the selection of the optimal parameters for its adjustment. The accuracy of the developed method for identifying the state of a computer system is assessed. The developed method is implemented in software and investigated on the problem of identifying abnormal states of computer system operation. Conclusions. The scientific novelty of the results obtained lies in the development of a method for identifying the state of the computer system by choosing a meta-algorithm for classification and determining the optimal parameters for its configuration.
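The four bagging variants listed above differ mainly in how each base tree's training subset is drawn. The sketch below follows the usual textbook definitions (pasting samples rows without replacement, bootstrap samples rows with replacement, random subspace samples features only, random patches samples both); it is not the authors' implementation, and the function name `draw_subset` and the 50% fractions are assumptions for illustration.

```python
import random


def draw_subset(n_samples, n_features, method,
                sample_frac=0.5, feature_frac=0.5, seed=0):
    """Return (row indices, feature indices) for one base learner."""
    rng = random.Random(seed)
    n_rows = max(1, int(sample_frac * n_samples))
    n_cols = max(1, int(feature_frac * n_features))
    all_rows = list(range(n_samples))
    all_cols = list(range(n_features))
    if method == "pasting":
        # Subset of rows drawn without replacement, all features.
        return sorted(rng.sample(all_rows, n_rows)), all_cols
    if method == "bootstrap":
        # Rows drawn with replacement, so duplicates are expected.
        return sorted(rng.choices(all_rows, k=n_samples)), all_cols
    if method == "random_subspace":
        # All rows, random subset of features.
        return all_rows, sorted(rng.sample(all_cols, n_cols))
    if method == "random_patches":
        # Random subset of rows and of features (drawn here without replacement).
        return (sorted(rng.sample(all_rows, n_rows)),
                sorted(rng.sample(all_cols, n_cols)))
    raise ValueError(f"unknown method: {method}")


for name in ("pasting", "bootstrap", "random_subspace", "random_patches"):
    rows, cols = draw_subset(n_samples=10, n_features=4, method=name)
    print(name, rows, cols)
```

Each base decision tree would then be trained on the selected rows and columns only, and the ensemble prediction is obtained by aggregating the trees' outputs.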


2021 ◽  
Vol 11 (23) ◽  
pp. 11252
Author(s):  
Ayana Mussabayeva ◽  
Prashant Kumar Jamwal ◽  
Muhammad Tahir Akhtar

Classification of brain signal features is a crucial process for any brain–computer interface (BCI) device, including speller systems. The positive P300 component of visual event-related potentials (ERPs) used in BCI spellers shows individual variations in amplitude and latency that change further with brain abnormalities such as amyotrophic lateral sclerosis (ALS). As a result, users must train the speller themselves, which is a very time-consuming procedure. To achieve subject-independence in a P300 speller, ensemble classifiers are proposed based on classical machine learning models, such as the support vector machine (SVM), linear discriminant analysis (LDA), and k-nearest neighbors (kNN), and on the convolutional neural network (CNN). The proposed voters were trained on healthy subjects’ data using a generic training approach. Different combinations of electroencephalography (EEG) channels were used for the experiments presented, resulting in single-channel, four-channel, and eight-channel classification. Results on ALS patients’ data were robust, with more than 90% accuracy achieved when using an ensemble of LDA, kNN, and SVM on data from four active EEG channels in the occipital area of the brain. The results provided by the proposed ensemble voting models were on average about 5% more accurate than the results provided by the standalone classifiers. The proposed ensemble models could also outperform boosting algorithms in terms of computational complexity or accuracy. The proposed methodology is shown to be subject-independent, which means that a system trained on healthy subjects can be used efficiently for ALS patients. Applying this methodology to online speller systems removes the need to retrain the P300 speller.


2021 ◽  
Author(s):  
Wei Huang ◽  
Xingyu Zhao ◽  
Xiaowei Huang

The embedding and extraction of knowledge is a recent trend in machine learning applications, e.g., to supplement training datasets that are small. Meanwhile, with the increasing use of machine learning models in security-critical applications, the embedding and extraction of malicious knowledge are equivalent to the notorious backdoor attack and defence, respectively. This paper studies the embedding and extraction of knowledge in tree ensemble classifiers, and focuses on knowledge expressible with a generic form of Boolean formulas, e.g., point-wise robustness and backdoor attacks. The embedding is required to be preservative (the original performance of the classifier is preserved), verifiable (the knowledge can be attested), and stealthy (the embedding cannot be easily detected). To facilitate this, we propose two novel and effective embedding algorithms, one for black-box settings and the other for white-box settings. The embedding can be done in PTIME. Beyond the embedding, we develop an algorithm to extract the embedded knowledge by reducing the problem to one solvable with an SMT (satisfiability modulo theories) solver. While this novel algorithm can successfully extract knowledge, the reduction leads to an NP computation. Therefore, if embedding is applied as a backdoor attack and extraction as a defence, our results suggest a complexity gap (P vs. NP) between the attack and the defence when working with tree ensemble classifiers. We apply our algorithms to a diverse set of datasets to validate our conclusion extensively.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Jyoti Godara ◽  
Isha Batra ◽  
Rajni Aron ◽  
Mohammad Shabaz

Cognitive science is a technology which focuses on analyzing the human brain through the application of DM. Databases are used to gather and store large volumes of data. The authenticated information is extracted using measures. This research work focuses on detecting sarcasm in text data. It introduces a scheme to detect sarcasm based on the PCA algorithm, the K-means algorithm, and ensemble classification. Four ensemble classifiers are designed for sarcasm detection. The first ensemble classifier (SKD) is a combination of SVM, KNN, and decision tree. The second ensemble classifier (SLD) combines SVM, logistic regression, and decision tree classifiers. The third ensemble model (MLD) combines MLP, logistic regression, and decision tree, and the last one (SLM) combines MLP, logistic regression, and SVM. The proposed model is implemented in Python and tested on five datasets of different sizes. The performance of the models is evaluated with regard to various metrics.


2021 ◽  
Vol 87 (11) ◽  
pp. 841-852
Author(s):  
S. Boukir ◽  
L. Guo ◽  
N. Chehata

In this article, margin theory is exploited to design better ensemble classifiers for remote sensing data. A semi-supervised version of the ensemble margin is at the core of this work. Some major challenges in ensemble learning are investigated using this paradigm in the difficult context of land cover classification: selecting the most informative instances to form an appropriate training set, and selecting the best ensemble members. The main contribution of this work lies in the explicit use of the ensemble margin as a decision method to select training data and base classifiers in an ensemble learning framework. The selection of training data is achieved through an innovative iterative guided bagging algorithm exploiting low-margin instances. The overall classification accuracy is improved by up to 3%, with more dramatic improvement in per-class accuracy (up to 12%). The selection of ensemble base classifiers is achieved by an ordering-based ensemble-selection algorithm relying on an original margin-based criterion that also targets low-margin instances. This method reduces the complexity (ensemble size under 30) but maintains performance.
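The ensemble margin at the heart of this abstract measures how decisively an ensemble votes on an instance. The sketch below shows two commonly used definitions, a supervised margin (votes for the true class minus the largest vote count among the other classes, divided by the ensemble size) and an unsupervised margin (most-voted class minus second most-voted class). It does not reproduce the article's semi-supervised variant or its selection algorithms; function names, land-cover labels, and the 0.5 threshold are assumptions for illustration.

```python
from collections import Counter


def unsupervised_margin(votes):
    """(votes for top class - votes for runner-up) / ensemble size, in [0, 1]."""
    ranked = Counter(votes).most_common(2)
    top = ranked[0][1]
    second = ranked[1][1] if len(ranked) > 1 else 0
    return (top - second) / len(votes)


def supervised_margin(votes, true_label):
    """(votes for true class - best other class) / ensemble size, in [-1, 1]."""
    counts = Counter(votes)
    true_votes = counts.get(true_label, 0)
    best_other = max(
        (n for label, n in counts.items() if label != true_label), default=0
    )
    return (true_votes - best_other) / len(votes)


# Votes of a hypothetical five-member ensemble for three pixels.
pixels = {
    "p1": ["water", "water", "water", "water", "water"],
    "p2": ["water", "water", "water", "forest", "urban"],
    "p3": ["forest", "urban", "forest", "urban", "water"],
}

# Low-margin instances are the ones the ensemble is least sure about.
low_margin = [name for name, votes in pixels.items()
              if unsupervised_margin(votes) < 0.5]
print(low_margin)  # ['p2', 'p3']
```

A negative supervised margin means the ensemble misclassifies the instance, while a margin near zero marks instances close to a decision boundary, which is why low-margin instances are natural candidates for guided training-set selection.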

