Cost-sensitive ensemble learning: a unifying framework

Data Mining and Knowledge Discovery ◽

10.1007/s10618-021-00790-4 ◽

2021 ◽

Author(s):

George Petrides ◽

Wouter Verbeke

Keyword(s):

Random Forest ◽

Ensemble Learning ◽

Ensemble Methods ◽

Fine Grained ◽

Misclassification Errors ◽

Natural Extensions ◽

Different Types

Over the years, a plethora of cost-sensitive methods have been proposed for learning on data when different types of misclassification errors incur different costs. Our contribution is a unifying framework that provides a comprehensive and insightful overview of cost-sensitive ensemble methods, pinpointing their differences and similarities via a fine-grained categorization. Our framework contains natural extensions and generalisations of ideas across methods, be it AdaBoost, Bagging or Random Forest, and as a result not only yields all methods known to date but also some not previously considered.
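As a rough illustration of what cost-sensitive prediction with an ensemble can look like, the sketch below averages class probabilities from several ensemble members and predicts the class with the lowest expected misclassification cost. This is one generic decision rule, not a method taken from the paper's framework; the function names, probabilities, and cost matrix are all illustrative assumptions.

```python
def average_probs(member_probs):
    """Average per-class probabilities over ensemble members."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [sum(p[j] for p in member_probs) / n_members for j in range(n_classes)]


def min_expected_cost_class(probs, cost):
    """Return (best_class, expected_costs).

    probs[j]   : estimated probability that the true class is j
    cost[i][j] : cost of predicting class i when the true class is j
    """
    expected = [
        sum(cost[i][j] * probs[j] for j in range(len(probs)))
        for i in range(len(cost))
    ]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected


# Hypothetical example: three ensemble members, two classes (0 = negative, 1 = positive).
member_probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]]
# Assumed costs: a missed positive (predict 0, truth 1) costs 10; a false alarm costs 1.
cost = [[0.0, 10.0],
        [1.0, 0.0]]

probs = average_probs(member_probs)
best, expected = min_expected_cost_class(probs, cost)
print(best, expected)
```

With these assumed costs, the rule prefers class 1 even though class 0 is more probable, because a missed positive is ten times as expensive as a false alarm.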

A New Random Forest Algorithm Based on Learning Automata

Computational Intelligence and Neuroscience ◽

10.1155/2021/5572781 ◽

2021 ◽

Vol 2021 ◽

pp. 1-19

Author(s):

Mohammad Savargiv ◽

Behrooz Masoumi ◽

Mohammad Reza Keyvanpour

Keyword(s):

Random Forest ◽

Ensemble Learning ◽

Simple Structure ◽

Dynamic Behaviour ◽

Learning Automata ◽

Problem Space ◽

Learning Methods ◽

Adaptive Capabilities ◽

Different Types ◽

Space Conditions

The goal of aggregating base classifiers is to obtain an aggregated classifier with a higher resolution than the individual classifiers. Random forest is an ensemble learning method that has received more attention than other ensemble methods because of its simple structure, ease of understanding, and higher efficiency than similar methods. The ability and efficiency of classical methods are always influenced by the data. Independence from the data domain and the ability to adapt to problem space conditions are the most challenging issues for the different types of classifiers. In this paper, a method based on learning automata is presented, through which the ability to adapt to the problem space, as well as independence from the data domain, is added to the random forest to increase its efficiency. Using the idea of reinforcement learning in the random forest makes it possible to address issues with data that have dynamic behaviour. Dynamic behaviour refers to the variability in the behaviour of a data sample in different domains. Therefore, to evaluate the proposed method and to create an environment with dynamic behaviour, different domains of data have been considered. In the proposed method, this idea is added to the random forest using learning automata, chosen for their simple structure and their compatibility with the problem space. The evaluation results confirm the improvement in random forest efficiency.

Ensemble based on Accuracy and Diversity Weighting for Evolving Data Streams

The International Arab Journal of Information Technology ◽

10.34028/iajit/19/1/11 ◽

2022 ◽

Author(s):

Yange Sun ◽

Han Shao ◽

Bencai Zhang

Keyword(s):

Ensemble Learning ◽

Data Streams ◽

Concept Drift ◽

Ensemble Methods ◽

Current Data ◽

Ensemble Classification ◽

Crucial Issue ◽

Base Classifier ◽

Real World Applications ◽

Different Types

Ensemble classification is an actively researched paradigm that has received much attention due to its growing number of real-world applications. The crucial issue in ensemble learning is to construct a pool of base classifiers that are both accurate and diverse. In this paper, unlike conventional data-stream-oriented ensemble methods, we propose a novel Measure via both Accuracy and Diversity (MAD), rather than only one of them, to supervise ensemble learning. Based on MAD, a novel online ensemble method called Accuracy and Diversity weighted Ensemble (ADE) effectively handles concept drift in data streams. ADE uses three main steps to construct a concept-drift-oriented ensemble. For the current data window: 1) a new base classifier is constructed based on the current concept when drift is detected, 2) MAD is used to measure the performance of ensemble members, and 3) the newly built classifier replaces the worst base classifier. If the newly constructed classifier is itself the worst one, no replacement occurs. Compared with state-of-the-art algorithms, ADE exceeds the best related algorithm by 2.38% in average classification accuracy. Experimental results show that the proposed method can effectively adapt to different types of drift.
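The replacement step described in this abstract can be sketched as follows. The abstract does not give the MAD formula, so the sketch treats each member's score as an opaque number supplied by the caller; `replace_worst` and the example values are hypothetical names and data, not the authors' implementation.

```python
def replace_worst(members, scores, candidate, candidate_score):
    """Swap the lowest-scoring member for the candidate.

    If the candidate scores no better than the current worst member,
    the ensemble is left unchanged (the candidate would itself be the worst).
    Returns (new_members, new_scores, replaced).
    """
    worst = min(range(len(members)), key=scores.__getitem__)
    if candidate_score <= scores[worst]:
        return list(members), list(scores), False
    new_members = list(members)
    new_scores = list(scores)
    new_members[worst] = candidate
    new_scores[worst] = candidate_score
    return new_members, new_scores, True


# Hypothetical ensemble of three classifiers with assumed MAD-style scores.
members = ["clf_a", "clf_b", "clf_c"]
scores = [0.7, 0.4, 0.9]

# A candidate built after drift detection, with an assumed score of 0.6.
updated_members, updated_scores, replaced = replace_worst(members, scores, "clf_new", 0.6)
print(updated_members, updated_scores, replaced)
```

Returning copies rather than mutating in place keeps the previous window's ensemble available for comparison, which is a design choice of this sketch rather than something the abstract specifies.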

Identifying different types of urban land use dynamics using Point-of-interest (POI) and Random Forest algorithm: The case of Huizhou, China

Cities ◽

10.1016/j.cities.2021.103202 ◽

2021 ◽

Vol 114 ◽

pp. 103202

Author(s):

Rong Wu ◽

Jieyu Wang ◽

Dachuan Zhang ◽

Shaojian Wang

Keyword(s):

Land Use ◽

Random Forest ◽

Urban Land ◽

Urban Land Use ◽

Random Forest Algorithm ◽

Point Of Interest ◽

Land Use Dynamics ◽

Different Types

Fine-grained static detection of obfuscation transforms using ensemble-learning and semantic reasoning

Proceedings of the 9th Workshop on Software Security, Protection, and Reverse Engineering - SSPREW9 '19 ◽

10.1145/3371307.3371313 ◽

2019 ◽

Author(s):

Ramtine Tofighi-Shirazi ◽

Irina Măriuca Asăvoae ◽

Philippe Elbaz-Vincent

Keyword(s):

Ensemble Learning ◽

Semantic Reasoning ◽

Fine Grained ◽

Static Detection

A cooperative DDoS attack detection scheme based on entropy and ensemble learning in SDN

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-021-01957-9 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Shanshan Yu ◽

Jicheng Zhang ◽

Ju Liu ◽

Xiaoqing Zhang ◽

Yafeng Li ◽

...

Keyword(s):

Ensemble Learning ◽

Denial Of Service ◽

Attack Detection ◽

Coarse Grained ◽

Communication Overhead ◽

Detection Scheme ◽

Fine Grained ◽

Ddos Attack ◽

Network Status ◽

Ddos Attack Detection

In order to solve the problem of distributed denial of service (DDoS) attack detection in software-defined networks, we propose a cooperative DDoS attack detection scheme based on entropy and ensemble learning. This method sets up a coarse-grained preliminary detection module based on entropy in the edge switch to monitor the network status in real time and report to the controller if any abnormality is found. Simultaneously, a fine-grained precise attack detection module is designed in the controller, and an ensemble learning-based algorithm is utilized to further identify abnormal traffic accurately. In this framework, the idle computing capability of edge switches is fully utilized, following the design idea of edge computing, to offload part of the detection task from the control plane to the data plane. Simulation results for two common DDoS attack methods, ICMP and SYN, show that the system can effectively detect DDoS attacks and greatly reduce the southbound communication overhead, the burden on the controller, and the detection delay of the attacks.
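A coarse-grained entropy check of the kind mentioned here might look like the sketch below. It computes the Shannon entropy of destination addresses in a traffic window and flags the window when entropy falls below a threshold. The choice of destination addresses as the feature, the direction of the test, the threshold value, and the function names are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_suspicious(dst_addresses, threshold):
    """Flag a window whose destination-address entropy is below the threshold.

    Assumption: during an attack, traffic concentrates on few destinations,
    so the entropy of destination addresses drops.
    """
    return shannon_entropy(dst_addresses) < threshold


# Hypothetical traffic windows (documentation-range addresses).
normal_window = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
attack_window = ["192.0.2.1"] * 15 + ["192.0.2.2"]

ASSUMED_THRESHOLD = 1.0  # illustrative value, would need tuning on real traffic
print(is_suspicious(normal_window, ASSUMED_THRESHOLD))
print(is_suspicious(attack_window, ASSUMED_THRESHOLD))
```

In a two-stage design like the one described, only windows flagged by such a cheap check would be forwarded for the more expensive ensemble-based classification.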

A Novel Deep Learning Approach of Convolutional Neural Network and Random Forest Classifier for Fine-grained Sentiment Classification

International Journal on Electrical Engineering and Informatics ◽

10.15676/ijeei.2021.13.2.13 ◽

2021 ◽

Vol 13 (2) ◽

pp. 465-476

Author(s):

Siji George C. G. ◽

B Sumathi

Keyword(s):

Neural Network ◽

Deep Learning ◽

Random Forest ◽

Convolutional Neural Network ◽

Random Forest Classifier ◽

Sentiment Classification ◽

Learning Approach ◽

Fine Grained

Joint Symbol Rate-Modulation Format Identification and OSNR Estimation using Random Forest Based Ensemble Learning for Intermediate Nodes

IEEE Photonics Journal ◽

10.1109/jphot.2021.3117984 ◽

2021 ◽

pp. 1-1

Author(s):

Jia Chai ◽

Xue Chen ◽

Yan Zhao ◽

Tao Yang ◽

Danshi Wang ◽

...

Keyword(s):

Random Forest ◽

Ensemble Learning ◽

Rate Modulation ◽

Modulation Format ◽

Symbol Rate

A NEW ENSEMBLE LEARNING ALGORITHM USING REGIONAL CLASSIFIERS

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213013500255 ◽

2013 ◽

Vol 22 (04) ◽

pp. 1350025 ◽

Cited By ~ 3

Author(s):

BYUNGWOO LEE ◽

SUNGHA CHOI ◽

BYONGHWA OH ◽

JIHOON YANG ◽

SUNGYONG PARK

Keyword(s):

Ensemble Learning ◽

Learning Algorithm ◽

Ensemble Methods ◽

Feature Space ◽

Training Data ◽

Learning Method ◽

Weighted Voting ◽

Ensemble Learning Algorithm ◽

Voting Scheme ◽

Base Learner

We present a new ensemble learning method that employs a set of regional classifiers, each of which learns to handle a subset of the training data. We split the training data and generate classifiers for different regions in the feature space. When classifying an instance, we apply a weighted voting scheme among the classifiers that include the instance in their region. We used 11 datasets to compare the performance of our new ensemble method with that of single classifiers as well as other ensemble methods such as RBE, bagging and AdaBoost. As a result, we found that the performance of our method is comparable to that of AdaBoost and bagging when the base learner is C4.5. In the remaining cases, our method outperformed other approaches.
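The voting step described in this abstract can be sketched as follows: only classifiers whose region contains the instance vote, and each vote counts with that classifier's weight. The `RegionalClassifier` structure, the interval-based regions, and the weights are hypothetical; the abstract does not say how regions or weights are derived.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class RegionalClassifier:
    covers: Callable[[Sequence[float]], bool]   # is x inside this classifier's region?
    predict: Callable[[Sequence[float]], str]   # class label predicted for x
    weight: float                               # voting weight (assumed given)


def regional_vote(classifiers, x) -> Optional[str]:
    """Weighted vote among the classifiers whose region includes x."""
    votes = defaultdict(float)
    for clf in classifiers:
        if clf.covers(x):
            votes[clf.predict(x)] += clf.weight
    if not votes:
        return None  # no region covers x; a fallback policy would be needed
    return max(votes, key=votes.get)


# Hypothetical one-dimensional example with overlapping regions.
classifiers = [
    RegionalClassifier(lambda x: x[0] < 0.6, lambda x: "A", 0.5),
    RegionalClassifier(lambda x: x[0] > 0.4, lambda x: "B", 0.3),
    RegionalClassifier(lambda x: 0.3 < x[0] < 0.7, lambda x: "B", 0.4),
]

print(regional_vote(classifiers, [0.5]))
print(regional_vote(classifiers, [0.1]))
```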

Bank Deposit Prediction Using Ensemble Learning

Artificial Intelligence Evolution ◽

10.37256/aie.222021880 ◽

2021 ◽

pp. 42-51

Author(s):

Muhammed J. A. Patwary ◽

S. Akter ◽

M. S. Bin Alam ◽

A. N. M. Rezaul Karim

Keyword(s):

Neural Network ◽

Ensemble Learning ◽

Banking Sector ◽

Performance Metrics ◽

Financial Institution ◽

Descriptive Analysis ◽

Ensemble Methods ◽

Support Vector ◽

Classification Algorithms ◽

Economic Depression

Bank deposits are a vital concern for any financial institution. Predicting whether a customer will become a depositor by analyzing related information is very challenging. Some recent reports demonstrate that economic depression and the continuous decline of the economy negatively impact business organizations and banking sectors. Due to such economic depression, banks cannot attract customers' attention. Thus, marketing is regarded as a handy tool for the banking sector to draw customers' attention to term deposits. The purpose of this paper is to study the performance of ensemble learning algorithms, a novel approach to predicting whether a new customer will take out a term deposit. Data from a Portuguese retail bank are used for our study, containing 45,211 phone contacts with 16 input attributes and one decision attribute. The data are preprocessed using a discretization technique. 40,690 samples are used for training the classifiers, and 4,521 samples are used for testing. In this work, the performance of three of the most widely used classification algorithms, Support Vector Machine (SVM), Neural Network (NN), and Naive Bayes (NB), is analyzed. Then the ability of ensemble methods to improve the efficiency of basic classification algorithms is investigated and experimentally demonstrated. Experimental results show that the performance metrics of Neural Network (Bagging) are higher than those of the other ensemble methods. Its accuracy, sensitivity, and specificity are 96.62%, 97.14%, and 99.08%, respectively. Although all input attributes are considered in the classification method, a descriptive analysis shows that some input attributes are more important for this classification than others. Overall, it is shown that ensemble methods outperformed the traditional algorithms in this domain. We believe our contribution can be used as a depositor prediction system to provide additional support for bank deposit prediction.
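For readers unfamiliar with the reported quantities, the sketch below checks the train/test split arithmetic from the abstract and shows how accuracy, sensitivity, and specificity are conventionally computed from a binary confusion matrix. The sample counts come from the abstract; the confusion-matrix counts are made-up illustrative values, not results from the paper, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # share of all samples classified correctly
        "sensitivity": tp / (tp + fn),      # share of actual positives detected
        "specificity": tn / (tn + fp),      # share of actual negatives detected
    }


# Split sizes as stated in the abstract.
n_train, n_test = 40_690, 4_521
n_total = n_train + n_test
train_fraction = n_train / n_total

# Illustrative confusion matrix (assumed counts, not from the paper).
metrics = binary_metrics(tp=90, fn=10, tn=880, fp=20)
print(n_total, round(train_fraction, 3), metrics)
```

The split sizes sum to the stated 45,211 contacts and correspond to roughly a 90/10 train/test division.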

Ensemble Deep Learning Models for Heart Disease Classification: A Case Study from Mexico

Information ◽

10.3390/info11040207 ◽

2020 ◽

Vol 11 (4) ◽

pp. 207

Author(s):

Asma Baccouche ◽

Begonya Garcia-Zapirain ◽

Cristian Castillo Olea ◽

Adel Elmaghraby

Keyword(s):

Neural Network ◽

Heart Disease ◽

Ensemble Learning ◽

Heart Diseases ◽

Hypertensive Heart Disease ◽

Network Models ◽

Features Selection ◽

Neural Network Models ◽

Learning Framework ◽

Different Types

Heart diseases are highly ranked among the leading causes of mortality in the world. They have various types including vascular, ischemic, and hypertensive heart disease. A large number of medical features are reported for patients in the Electronic Health Records (EHR) that allow physicians to diagnose and monitor heart disease. We collected a dataset from Medica Norte Hospital in Mexico that includes 800 records and 141 indicators such as age, weight, glucose, blood pressure rate, and clinical symptoms. The distribution of the collected records is very unbalanced across the different types of heart disease: 17% of records have hypertensive heart disease, 16% have ischemic heart disease, 7% have mixed heart disease, and 8% have valvular heart disease. Herein, we propose an ensemble-learning framework of different neural network models, and a method of aggregating random under-sampling. To improve the performance of the classification algorithms, we implement a data preprocessing step with feature selection. Experiments were conducted with unidirectional and bidirectional neural network models, and results showed that an ensemble classifier combining a BiLSTM or BiGRU model with a CNN model had the best classification performance, with accuracy and F1-score between 91% and 96% for the different types of heart disease. These results are competitive and promising for the heart disease dataset. We showed that an ensemble-learning framework based on deep models could overcome the problem of classifying an unbalanced heart disease dataset. Our proposed framework can lead to highly accurate models that are adapted for real clinical data and diagnostic use.
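One plausible reading of "aggregating random under-sampling" is to draw several class-balanced subsets by randomly under-sampling the larger classes, train one model per subset, and aggregate their predictions. The sketch below shows only the subset-drawing part, under that assumption; the function name, labels, and seed are hypothetical, and the abstract does not describe the authors' exact procedure.

```python
import random
from collections import defaultdict


def balanced_subsets(labels, n_subsets, seed=0):
    """Draw n_subsets index lists in which every class has equal size.

    Each class is randomly under-sampled (without replacement) to the size
    of the smallest class. One model would then be trained per subset.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    size = min(len(indices) for indices in by_class.values())

    subsets = []
    for _ in range(n_subsets):
        subset = []
        for indices in by_class.values():
            subset.extend(rng.sample(indices, size))
        subsets.append(sorted(subset))
    return subsets


# Hypothetical unbalanced label vector: 10 of one class, 3 of another.
labels = ["healthy"] * 10 + ["ischemic"] * 3
subsets = balanced_subsets(labels, n_subsets=4)
for subset in subsets:
    print(subset)
```

Because the minority class is kept whole in every subset while the majority class is resampled each time, the ensemble as a whole sees more of the majority data than any single under-sampled model would.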
