ensemble of classifiers
Recently Published Documents


2022 ◽  
Author(s):  
Christopher Graney-Ward ◽  
Biju Issac ◽  
LIDA KETSBAIA ◽  
Seibu Mary Jacob

Due to the recent popularity and growth of social media platforms such as Facebook and Twitter, cyberbullying is becoming increasingly prevalent. The paper first reviews current research on cyberbullying and the NLP techniques used to classify this kind of online behaviour. It then describes experiments on combined Twitter datasets from Maryland and Cornell universities using several classification approaches: classical machine learning, RNN, CNN, and pretrained transformer-based classifiers. A state-of-the-art (SOTA) result was achieved by optimising BERTweet with a OneCycle policy and a decoupled weight decay optimiser (AdamW), improving the previous F1-score by up to 8.4% and reaching 64.8% macro F1. Particle Swarm Optimisation was then used to optimise the ensemble model. The ensemble, built from the optimised BERTweet model and a collection of models with varying data representations, outperformed the standalone BERTweet model by 0.53%, reaching 65.33% macro F1 on the TweetEval dataset, and by 0.55% on the combined datasets, reaching 68.1% macro F1.
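The abstract does not say which ensemble parameters Particle Swarm Optimisation tuned. As a rough illustration only, the sketch below assumes the swarm searches over per-model voting weights and scores each candidate by macro F1 on a validation set. The function names, the inertia value (0.7), the acceleration coefficients (1.5) and the toy probabilities are all assumptions for illustration, not details taken from the paper.

```python
import random

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes

def normalise(w):
    """Clip weights to be non-negative and rescale them to sum to 1."""
    w = [max(x, 0.0) for x in w]
    total = sum(w)
    return [x / total for x in w] if total > 0 else [1.0 / len(w)] * len(w)

def ensemble_predict(probs, weights):
    """probs[m][i][c] is the probability model m gives class c for sample i."""
    n_samples, n_classes = len(probs[0]), len(probs[0][0])
    preds = []
    for i in range(n_samples):
        mixed = [sum(w * probs[m][i][c] for m, w in enumerate(weights))
                 for c in range(n_classes)]
        preds.append(max(range(n_classes), key=mixed.__getitem__))
    return preds

def pso_weights(probs, y_true, n_particles=20, n_iters=50, seed=0):
    """Search for ensemble weights that maximise validation macro F1."""
    rng = random.Random(seed)
    n_models, n_classes = len(probs), len(probs[0][0])
    score = lambda w: macro_f1(
        y_true, ensemble_predict(probs, normalise(w)), n_classes)
    # The first particle is the uniform weighting, so the result can never
    # score below plain averaging on the validation data.
    pos = [[1.0 / n_models] * n_models] + [
        [rng.random() for _ in range(n_models)]
        for _ in range(n_particles - 1)]
    vel = [[0.0] * n_models for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [score(p) for p in pos]
    g = max(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for k in range(n_particles):
            for d in range(n_models):
                # Standard velocity update: inertia + personal pull + global pull.
                vel[k][d] = (0.7 * vel[k][d]
                             + 1.5 * rng.random() * (best_pos[k][d] - pos[k][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
            val = score(pos[k])
            if val > best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val > g_val:
                    g_pos, g_val = pos[k][:], val
    return normalise(g_pos), g_val

# Toy validation data (made up): two models, four samples, two classes.
toy_probs = [
    [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]],
    [[0.4, 0.6], [0.6, 0.4], [0.7, 0.3], [0.45, 0.55]],
]
toy_labels = [0, 1, 0, 1]
weights, f1 = pso_weights(toy_probs, toy_labels, n_particles=5, n_iters=10)
print(weights, f1)
```

Weights tuned this way are fitted to the validation set, so any reported gain would need to be confirmed on held-out test data.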


2021 ◽  
Author(s):  
Amgad M. Mohammed ◽  
Enrique Onieva ◽  
Michał Woźniak

Author(s):  
H. Benjamin Fredrick David ◽  
A. Suruliandi ◽  
S. P. Raja

Ensemble methods build a set of classifiers and classify new instances by taking a weighted vote of their individual predictions. Reducing error and increasing accuracy remains an active problem in ensemble classification. This paper presents a novel, generic, object-oriented stacking framework, adapted for voting and weighting, that uses an ensemble of classifiers for prediction. The framework operates on the weighted average of the probabilities of any suite of base learners, and the final prediction is the aggregate of their respective votes. For illustrative purposes, three familiar heterogeneous classifiers (Support Vector Machine, k-Nearest Neighbor and Naïve Bayes) are used as candidates for ensemble classification with the proposed stacked framework. The ensemble classifier built on the framework is then compared with others and evaluated using various cross-validation levels and percentage splits on a range of benchmark datasets. The results distinguish the framework from the competition. The proposed framework predicts the crime propensity of prisoners with 99.9901% accuracy, the highest among the methods compared.
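The weighted-average step described above can be sketched in an object-oriented style as follows. This is a minimal illustration, not the authors' framework: it covers only the weighted soft vote, not any stacking meta-learner, and the names `BaseLearner`, `WeightedVotingEnsemble` and `FixedLearner` are hypothetical. `FixedLearner` is a stand-in for real base learners such as an SVM, k-NN or Naïve Bayes model that expose class probabilities.

```python
from typing import List, Protocol, Sequence

class BaseLearner(Protocol):
    """Anything that can return one probability per class for an input."""
    def predict_proba(self, x: Sequence[float]) -> List[float]: ...

class WeightedVotingEnsemble:
    """Combines base learners by a weighted average of their probabilities."""

    def __init__(self, learners: Sequence[BaseLearner],
                 weights: Sequence[float]) -> None:
        if len(learners) != len(weights):
            raise ValueError("one weight is required per learner")
        total = sum(weights)
        if total <= 0:
            raise ValueError("weights must sum to a positive value")
        self.learners = list(learners)
        # Normalise so the combined output is still a probability distribution.
        self.weights = [w / total for w in weights]

    def predict_proba(self, x: Sequence[float]) -> List[float]:
        per_model = [m.predict_proba(x) for m in self.learners]
        n_classes = len(per_model[0])
        return [sum(w * p[c] for w, p in zip(self.weights, per_model))
                for c in range(n_classes)]

    def predict(self, x: Sequence[float]) -> int:
        proba = self.predict_proba(x)
        return max(range(len(proba)), key=proba.__getitem__)

class FixedLearner:
    """Hypothetical stand-in that always returns the same distribution."""

    def __init__(self, proba: Sequence[float]) -> None:
        self.proba = list(proba)

    def predict_proba(self, x: Sequence[float]) -> List[float]:
        return self.proba

# Made-up probabilities for one input from three heterogeneous learners.
learners = [FixedLearner([0.6, 0.4]),
            FixedLearner([0.3, 0.7]),
            FixedLearner([0.4, 0.6])]
equal = WeightedVotingEnsemble(learners, [1, 1, 1])
skewed = WeightedVotingEnsemble(learners, [4, 1, 1])
print(equal.predict([0.0]), skewed.predict([0.0]))
```

Because the ensemble itself satisfies the `BaseLearner` protocol, one ensemble can be nested inside another, which is one way a generic framework of this kind could be composed.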


2021 ◽  
Vol 69 ◽  
pp. 81-102
Author(s):  
Chen Wang ◽  
Chengyuan Deng ◽  
Zhoulu Yu ◽  
Dafeng Hui ◽  
Xiaofeng Gong ◽  
...  

2021 ◽  
Author(s):  
Liu Ning ◽  
Qingfeng Tang ◽  
Kexue Luo

Abstract Background: Alzheimer’s Disease (AD) is a common form of dementia that affects patients’ linguistic function, memory, cognition, and visuospatial ability. A growing number of studies have sought non-invasive, accessible, cost-effective methods for detecting AD. Speech has been shown to be related to AD, so diagnosis in a doctor’s office may soon be feasible. Methods: We used the 2020 ADReSS dataset, which is balanced in gender and age, to detect AD. We first extracted three categories of features: acoustic features extracted with the openSMILE software, automatically generated BERT embeddings, and manually engineered linguistic features. The linguistic features are based on POS tags, lexical richness, fluency, and semantic features. Seven classifiers were then used to distinguish AD patients from normal controls: SVM, Logistic Regression, Random Forest, Extra Trees, AdaBoost, LightGBM, and a novel ensemble approach with a majority voting strategy, which is applied to overcome errors made by any single base classifier. Ten-fold cross-validation was adopted to evaluate the approach. In addition, individual feature sets and their combinations were fed to the six base classifiers and to the ensemble of classifiers. Results: The ensemble of classifiers gave the top-performing classification result on the test set, with a best accuracy of 85.4%. The best-performing feature set was the linguistic features, which reached 85.6% accuracy with the LightGBM classifier; the SFS approach was used to identify seven discriminative linguistic features. Conclusions: The statistical and experimental results illustrate the feasibility of using speech to predict AD effectively based on acoustic and linguistic feature parameters. Stronger classifiers and discriminative features are vital for the final results.
We emphasise that the best linguistic features for predicting AD are based on POS tags, lexical richness, fluency, and semantic features. An ensemble of classifiers usually performs better than a single classifier.
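The majority voting strategy mentioned in the Methods can be sketched as below. This is a minimal illustration under assumptions: the label coding (0 for control, 1 for AD), the tie-breaking rule and the example predictions are invented for the sketch and are not taken from the study.

```python
from collections import Counter

def majority_vote(predictions):
    """Hard-vote over base classifiers.

    predictions[m][i] is the label predicted by model m for sample i.
    Ties go to the smallest label, which is an arbitrary assumption.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        counts = Counter(model_preds[i] for model_preds in predictions)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted

# Made-up predictions from three base classifiers on three samples
# (0 = control, 1 = AD in this hypothetical coding).
base_predictions = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
print(majority_vote(base_predictions))
```

Each base classifier is wrong on one sample here, yet the vote recovers the label two of the three agree on, which is the error-correcting effect the abstract attributes to the ensemble.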

