The use of entropy based fuzzy membership on weighted logistic regression for the unbalanced data

2021 ◽  
Vol 880 (1) ◽  
pp. 012048
Author(s):  
Ajiwasesa Harumeka ◽  
Santi Wulan Purnami ◽  
Santi Puteri Rahayu

Abstract: Logistic regression is a popular and powerful classification method. Adding ridge regularization and optimizing with a combination of linear conjugate gradients and IRLS gives Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS), which can outperform the Support Vector Machine (SVM) in processing speed, especially on large data, while achieving competitive accuracy. However, neither SVM nor TR-IRLS performs well on unbalanced data. Fuzzy Support Vector Machine (FSVM) is an extension of SVM for unbalanced data that assigns a fuzzy membership to each observation. The fuzzy membership gives observations in the minority class greater importance than those in the majority class. Meanwhile, TR-IRLS was developed into Rare Event Weighted Logistic Regression (RE-WLR) by adding weighting and bias correction to logistic regression. The weighting in RE-WLR depends on the undersampling scheme, which allows "information loss". FSVM and RE-WLR are similar in that their weights are based only on class differences (minority or majority). Entropy-Based Fuzzy Support Vector Machine (EFSVM) addresses this weakness of FSVM by considering the class certainty of each observation. As a result, EFSVM improves SVM performance on unbalanced data and even outperforms FSVM. For this reason, we propose applying entropy-based fuzzy membership (EF) to the TR-IRLS algorithm to classify large, unbalanced data. This method is called Entropy-Based Fuzzy Weighted Logistic Regression (EF-WLR). This research reviews EF-WLR for unbalanced data classification.
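The abstract does not give the membership formula or the solver details, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes class certainty is measured by the binary entropy of the class mix among each observation's k nearest neighbours, that minority observations keep membership 1 while majority observations receive `1 - beta * H`, and that the weighted ridge logistic regression is fitted with plain Newton/IRLS steps using a direct linear solve rather than the truncated conjugate-gradient solve of TR-IRLS. The function names and the parameters `k`, `beta`, and `lam` are hypothetical, and no bias correction is included.

```python
import numpy as np


def entropy_fuzzy_membership(X, y, k=5, beta=0.5, minority_label=1):
    """Assumed membership: 1 for minority rows, 1 - beta * H for majority rows."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared distances; a point is never its own neighbour.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nn = np.argsort(d2, axis=1)[:, :k]
    # Share of minority-class points among the k nearest neighbours.
    p_min = (y[nn] == minority_label).mean(axis=1)
    p = np.clip(np.stack([p_min, 1.0 - p_min]), 1e-12, 1.0)
    # Binary entropy in bits: near 0 for pure neighbourhoods, 1 for a 50/50 mix.
    H = -(p * np.log2(p)).sum(axis=0)
    return np.where(y == minority_label, 1.0, 1.0 - beta * H)


def weighted_ridge_logistic_irls(X, y, w, lam=1.0, n_iter=25, tol=1e-8):
    """Weighted, ridge-penalised logistic regression via Newton/IRLS steps.

    Returns [intercept, coefficients...]. Uses a direct solve, not truncated CG.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    Xb = np.hstack([np.ones((len(X), 1)), X])
    R = lam * np.eye(Xb.shape[1])
    R[0, 0] = 0.0  # leave the intercept unpenalised
    coef = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xb @ coef, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        v = w * p * (1.0 - p)  # IRLS weights scaled by membership
        grad = Xb.T @ (w * (y - p)) - R @ coef
        hess = Xb.T @ (Xb * v[:, None]) + R
        step = np.linalg.solve(hess, grad)
        coef = coef + step
        if np.max(np.abs(step)) < tol:
            break
    return coef
```

A typical call would compute `w = entropy_fuzzy_membership(X, y)` and pass it to `weighted_ridge_logistic_irls(X, y, w)`. Under these assumptions, majority observations in mixed neighbourhoods are down-weighted, while observations whose neighbours all share one class keep a weight close to 1.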

2013 ◽  
Vol 475-476 ◽  
pp. 312-317
Author(s):  
Ping Zhou ◽  
Jin Lei Wang ◽  
Xian Kai Chen ◽  
Guan Jun Zhang

Since datasets usually contain noise, it is very helpful to find and remove the noise in a preprocessing step. Fuzzy membership can measure a sample's weight. The weight should be smaller for a noisy sample and larger for an important sample. Therefore, appropriate sample memberships are vital. This article proposes a novel approach, Membership Calculate based on Hierarchical Division (MCHD), to calculate the membership of training samples. MCHD uses the concept of dimension similarity to develop a bottom-up clustering technique that calculates the sample membership iteratively. The experiments indicate that MCHD can effectively detect noise and remove it from the dataset. The fuzzy support vector machine based on MCHD outperforms most recently published approaches and has better generalization ability in handling noise.


2011 ◽  
Vol 109 ◽  
pp. 636-640
Author(s):  
Bo Tang ◽  
Min Xia

With China's rapid economic development, credit scoring has become very important. This paper presents a new fuzzy support vector machine algorithm for solving credit scoring problems. The empirical results show that the proposed fuzzy membership model is valid and that the algorithm has good prediction accuracy and anti-noise ability.


2021 ◽  
Author(s):  
Onuwa Honey Stephen Okwuashi

The urban expansion of Lagos continues unabated and calls for urgent concern. This thesis explored the use of both the conventional and unconventional techniques for modelling land use change. Two conventional methods (ordinary least squares and geographically weighted regression) were based on geographic information systems, while four unconventional methods (logistic regression, artificial neural networks, and two proposed types of support vector machine) were based on cellular automata. These techniques were evaluated using three land use epochs: 1963-1978, 1978-1984, and 1984-2000.

The conventional methods make quite strong statistical assumptions, some of which are shown not to be met by the land use data at hand. Despite this, these methods do exhibit substantial agreement between the observed and the predicted maps. The non cellular automata and cellular automata modelling were then implemented with the logistic regression, artificial neural network, support vector machine, and fuzzy support vector machine models, with model parameters set by k-fold cross-validation. The cellular automata predicted maps were more accurate than those of the non cellular automata.

The cellular automata modelling results from the proposed support vector machine and fuzzy support vector machine were compared with those from the geographic information systems based geographically weighted regression, logistic regression, and artificial neural network. The results from the geographic information systems based geographically weighted regression were the best, followed by those from the support vector machine and fuzzy support vector machine, followed by the artificial neural network, and logistic regression. This research demonstrated that the proposed support vector machine and fuzzy support vector machine based cellular automata models are promising tools for land use change modelling.


2018 ◽  
Vol 24 (23) ◽  
pp. 5681-5692
Author(s):  
Li Bing ◽  
Liang Yilong ◽  
Cheng Wei

To weaken the effects of outliers or noise in classification, a fuzzy support vector machine (FSVM) based on environmental fuzzy membership is proposed. The environmental fuzzy membership considers not only the number of similar samples nearby but also the distribution of the samples nearby. As more information about the samples is considered, the reliability and robustness of the FSVM are further enhanced, which can improve the classification performance, especially for overlapping samples. The classification performance of the proposed method is validated by numerical case studies, an experimental study on a breast cancer dataset, and an application to motor fault classification. Compared with the FSVM based on the k-nearest neighbor algorithm, the proposed method obtains more robust and accurate classification rates in all case studies.


2013 ◽  
Vol 756-759 ◽  
pp. 3399-3403
Author(s):  
Hua Duan ◽  
Yan Mei Hou

To overcome the sensitivity of the Support Vector Machine to outliers and noise points, the Fuzzy Support Vector Machine (FSVM) was proposed. The key issue in solving the FSVM is determining the fuzzy membership. This paper gives an overview of algorithms for constructing the fuzzy membership. We also give an algorithm for solving the FSVM that is derived from the improved SMO algorithm.


2020 ◽  
Vol 10 (3) ◽  
pp. 1065
Author(s):  
Wei Liu ◽  
LinLin Ci ◽  
LiPing Liu

Since SVM is sensitive to noise and outliers in system call sequence data, a new fuzzy support vector machine algorithm based on SVDD is presented in this paper. In our algorithm, the noise and outliers are identified by a hypersphere with minimum volume that contains the maximum number of samples. The definition of fuzzy membership considers not only the relation between a sample and the hyperplane, but also the relation between samples. For each sample inside the hypersphere, the fuzzy membership function is a linear function of the distance between the sample and the hyperplane: the greater the distance, the greater the weight coefficient. For each sample outside the hypersphere, the membership function is an exponential function of the distance between the sample and the hyperplane: the greater the distance, the smaller the weight coefficient. Compared with the traditional fuzzy membership definition based on the relation between a sample and its cluster center, our method effectively distinguishes the noise or outliers from support vectors and assigns them appropriate weight coefficients even though they are distributed on the boundary between the positive and the negative classes. The experiments show that the fuzzy support vector machine proposed in this paper is more robust than the support vector machine and fuzzy support vector machines based on the distance between a sample and its cluster center.
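The two-regime membership described above can be illustrated with a small sketch. The abstract does not state the actual coefficients, so the functional forms here are assumptions: a linear ramp from `m_min` up to 1 for samples inside the hypersphere, and `m_min * exp(-gamma * d)` for samples outside it. The function name and the parameters `m_min` and `gamma` are hypothetical, and the distances to the hyperplane and the inside/outside flags are taken as given rather than computed by solving SVDD.

```python
import numpy as np


def svdd_style_membership(dist_to_hyperplane, inside_sphere, m_min=0.1, gamma=1.0):
    """Illustrative two-regime membership (assumed forms, not the paper's formula)."""
    d = np.asarray(dist_to_hyperplane, dtype=float)
    inside = np.asarray(inside_sphere, dtype=bool)
    # Scale the linear ramp by the largest distance among inside samples.
    d_max = d[inside].max() if inside.any() else 1.0
    if d_max <= 0:
        d_max = 1.0
    # Inside the hypersphere: weight grows linearly with distance, from m_min to 1.
    linear = m_min + (1.0 - m_min) * d / d_max
    # Outside the hypersphere: weight decays exponentially with distance.
    decay = m_min * np.exp(-gamma * d)
    return np.where(inside, linear, decay)
```

With these assumed forms, every sample outside the hypersphere receives a weight no larger than `m_min`, so suspected noise never outweighs a sample inside the hypersphere.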

