Support Vector Machine Decision Trees with Rare Event Detection

2002 ◽  
Vol 4 (4) ◽  
pp. 225-242 ◽  
Author(s):  
Hoi-Ming Chi ◽  
Okan K. Ersoy
2021 ◽  
Vol 880 (1) ◽  
pp. 012048
Author(s):  
Ajiwasesa Harumeka ◽  
Santi Wulan Purnami ◽  
Santi Puteri Rahayu

Abstract: Logistic regression is a popular and powerful classification method. Adding ridge regularization and optimizing with a combination of linear conjugate gradients and IRLS, an approach called Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS), can outperform the Support Vector Machine (SVM) in processing speed, especially on large data, while achieving competitive accuracy. However, neither SVM nor TR-IRLS performs well enough on unbalanced data. The Fuzzy Support Vector Machine (FSVM) is an SVM extension for unbalanced data that assigns a fuzzy membership to each observation. The fuzzy membership gives observations in the minority class greater importance than those in the majority class. Meanwhile, TR-IRLS was developed into Rare Event Weighted Logistic Regression (RE-WLR) by adding weighting and bias correction to logistic regression. The weighting in RE-WLR depends on the undersampling scheme, which allows some "information loss". FSVM and RE-WLR share a similarity: their weights are based only on class differences (minority or majority). The Entropy-Based Fuzzy Support Vector Machine (EFSVM) addresses this weakness of FSVM by considering the class certainty of each observation. As a result, EFSVM improves SVM performance on unbalanced data and even outperforms FSVM. For this reason, we propose applying entropy-based fuzzy (EF) weighting to the TR-IRLS algorithm to classify large, unbalanced data. This method is called Entropy-Based Fuzzy Weighted Logistic Regression (EF-WLR). This research presents a review of EF-WLR for unbalanced data classification.
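
The idea of weighting observations by class certainty can be illustrated with a small sketch. The abstract does not say how the entropy is estimated, so the code below makes several assumptions: class probabilities come from the labels of the k nearest neighbours, minority observations (label 1) always receive weight 1, and majority observations (label 0) are down-weighted linearly by their normalised entropy. The function names and the `k` and `beta` parameters are hypothetical and are not taken from the paper.

```python
import math


def knn_entropy(points, labels, i, k):
    """Shannon entropy of the class labels among the k nearest neighbours of point i."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    neighbours = [labels[j] for _, j in dists[:k]]
    p_pos = sum(1 for y in neighbours if y == 1) / len(neighbours)
    entropy = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0.0:  # treat 0 * log(0) as 0
            entropy -= p * math.log(p)
    return entropy


def entropy_fuzzy_weights(points, labels, k=3, beta=0.5):
    """Assumed scheme: minority weight is 1; majority weight shrinks as entropy grows."""
    max_entropy = math.log(2)  # largest possible entropy for two classes
    weights = []
    for i, y in enumerate(labels):
        if y == 1:
            weights.append(1.0)
        else:
            h = knn_entropy(points, labels, i, k)
            weights.append(1.0 - beta * h / max_entropy)
    return weights


# Four majority points in a tight cluster, two minority points,
# and one majority point sitting next to the minority class.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (5.2, 5.5)]
labels = [0, 0, 0, 0, 1, 1, 0]
weights = entropy_fuzzy_weights(points, labels, k=3, beta=0.5)
```

In this toy data the clustered majority points keep full weight because their neighbourhoods are pure, while the majority point near the minority class receives a smaller weight. Weights of this kind could then be passed as per-observation weights to a weighted logistic regression fit.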


2021 ◽  
Vol 4 (1) ◽  
pp. 7-18
Author(s):  
Donata D Acula

This paper employed an intelligent approach based on machine learning, categorized into base and ensemble methods, to classify disaster risk in the Philippines. It focused on Decision Trees and the Support Vector Machine, and on the Adaptive Boosting algorithm with Decision Trees and the Support Vector Machine as base estimators. The research used exponential regression for missing-value imputation and converted the numbers of casualties, damaged houses, and damaged properties into five (5) risk levels using the quantile method. Ten-fold cross-validation was used to validate the proposed algorithms. The experiment shows that Decision Trees and Adaptive Decision Trees are the most suitable models for the disaster data, scoring more than 90%, more than 75%, and more than 75% on all classification metrics (accuracy, precision, recall, F1-score) when classifying the risk levels of casualties, damaged houses, and damaged properties, respectively.
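
The quantile conversion and the 10-fold split described in this abstract can be sketched with the Python standard library alone. This is only an illustration under assumptions: the abstract does not give the exact cut-point rule or say whether the folds were shuffled or stratified, so the code uses the default method of `statistics.quantiles` and contiguous, unshuffled folds. The function names `risk_levels` and `k_fold_indices` are hypothetical.

```python
import bisect
import statistics


def risk_levels(values, n_levels=5):
    """Map each value to a level 1..n_levels using quantile cut points."""
    cuts = statistics.quantiles(values, n=n_levels)  # n_levels - 1 cut points
    # A value equal to a cut point is placed in the higher level (assumption).
    return [bisect.bisect_right(cuts, v) + 1 for v in values]


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


# Toy casualty counts, not data from the paper.
levels = risk_levels(list(range(1, 11)))
folds = list(k_fold_indices(25, k=10))
```

Each fold serves once as the test set while the remaining folds form the training set; a classifier such as a decision tree would be fitted and scored once per fold, and the scores averaged.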


Robotica ◽  
2002 ◽  
Vol 20 (5) ◽  
pp. 499-508
Author(s):  
Jie Yang ◽  
Chenzhou Ye ◽  
Nianyi Chen

Summary: A software tool for data mining (DMiner-I) is introduced, which integrates pattern recognition (PCA, Fisher, clustering, HyperEnvelop, regression), artificial intelligence (knowledge representation, decision trees), statistical learning (rough set, support vector machine), and computational intelligence (neural network, genetic algorithm, fuzzy systems). It consists of nine function models: pattern recognition, decision trees, association rule, fuzzy rule, neural network, genetic algorithm, HyperEnvelop, support vector machine, and visualization. The principles, algorithms, and knowledge representation of some of the function models are described. Nonmonotony in data mining is dealt with by concept hierarchy and layered mining. The tool is implemented in Visual C++ under Windows 2000. It has been satisfactorily applied to predicting regularities in the formation of ternary intermetallic compounds in alloy systems and to the diagnosis of brain glioma.

