OVERSAMPLING METHOD TO HANDLING IMBALANCED DATASETS PROBLEM IN BINARY LOGISTIC REGRESSION ALGORITHM

Author(s):  
Windyaning Ustyannie ◽  
S Suprapto

Class imbalance is a condition in which one class makes up a much larger share of the data than the other, which can degrade classification accuracy. Logistic regression is one data mining method that can be used for classification. This research uses the RWO-sampling method with a random replicate approach to generate synthetic data for discrete attributes. The results show that the method can handle the class imbalance problem: RWO-sampling with the random replicate approach achieves better accuracy than both RWO-sampling with the roulette approach and ROS. Relative to RWO-sampling with the roulette approach, RWO-sampling with the random replicate approach increased accuracy by an average of 15.55% per dataset; relative to the ROS method, it increased accuracy by an average of 3.7% per dataset. Furthermore, in tests of the underfitting problem in logistic regression, oversampling performed better than no oversampling, with accuracy increasing by an average of 2.3% per dataset.
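As a rough illustration of the ROS baseline mentioned in the abstract, the sketch below duplicates randomly chosen minority-class rows until the classes are balanced. It is not the RWO-sampling method, whose details the abstract does not give. The function name `random_oversample` and the toy data are assumptions made for this example only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class.

    Hypothetical helper for illustration; plain random oversampling (ROS),
    not RWO-sampling.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Original rows of this class are the pool we replicate from.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy imbalanced dataset (made up): six rows of class 0, two of class 1.
rows = [[0.1, 1], [0.3, 0], [0.2, 1], [0.9, 0],
        [0.8, 1], [0.7, 0], [2.5, 1], [2.7, 0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

bal_rows, bal_labels = random_oversample(rows, labels, seed=42)
print(Counter(bal_labels))
```

The balanced rows could then be passed to any binary logistic regression fit. Because ROS only repeats existing rows, it adds no new attribute values, which is the limitation that synthetic-data approaches such as RWO-sampling aim to address.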

2007 ◽  
Author(s):  
Marek K. Jakubowski ◽  
David Pogorzala ◽  
Timothy J. Hattenberger ◽  
Scott D. Brown ◽  
John R. Schott

2004 ◽  
pp. 211-234 ◽  
Author(s):  
Lewis Girod ◽  
Ramesh Govindan ◽  
Deepak Ganesan ◽  
Deborah Estrin ◽  
Yan Yu

2021 ◽  
Author(s):  
Maria Lyssenko ◽  
Christoph Gladisch ◽  
Christian Heinzemann ◽  
Matthias Woehrle ◽  
Rudolph Triebel

Author(s):  
Fauzan Anggi Prasatya ◽  
Tjahja Muhandri ◽  
Eko Ruddy Cahyadi

Competition in the food business is currently very intense, with diverse product innovations. To gain market share and win this competition, entrepreneurs need to know the factors that affect success. This study has two main objectives: (1) to map the characteristics of non-traditional street food entrepreneurs in Serang City, and (2) to identify the factors that most affect the success of non-traditional street food businesses. The sample consisted of 100 respondents selected by purposive sampling. The data were analyzed using descriptive analysis and binary logistic regression. The results showed that most successful vendors are women, which the study attributes to their being more conscientious than men and tending to avoid risk. The factors affecting the success of non-traditional street food businesses were product price, business name, and start-up capital.


Author(s):  
Daniel Jeske ◽  
Pengyue Lin ◽  
Carlos Rendon ◽  
Rui Xiao ◽  
Behrokh Samadi

2019 ◽  
Vol 30 (3) ◽  
pp. 627-648 ◽  
Author(s):  
Evelyn Buckwar ◽  
Massimiliano Tamborrino ◽  
Irene Tubikanec

Approximate Bayesian computation (ABC) has become one of the major tools of likelihood-free statistical inference in complex mathematical models. Simultaneously, stochastic differential equations (SDEs) have developed into an established tool for modelling time-dependent, real-world phenomena with underlying random effects. When applying ABC to stochastic models, two major difficulties arise: First, the derivation of effective summary statistics and proper distances is particularly challenging, since simulations from the stochastic process under the same parameter configuration result in different trajectories. Second, exact simulation schemes to generate trajectories from the stochastic model are rarely available, requiring the derivation of suitable numerical methods for the synthetic data generation. To obtain summaries that are less sensitive to the intrinsic stochasticity of the model, we propose to build up the statistical method (e.g. the choice of the summary statistics) on the underlying structural properties of the model. Here, we focus on the existence of an invariant measure and we map the data to their estimated invariant density and invariant spectral density. Then, to ensure that these model properties are kept in the synthetic data generation, we adopt measure-preserving numerical splitting schemes. The derived property-based and measure-preserving ABC method is illustrated on the broad class of partially observed Hamiltonian-type SDEs, both with simulated data and with real electroencephalography data. The derived summaries are particularly robust to the model simulation, and this fact, combined with the proposed reliable numerical scheme, yields accurate ABC inference. In contrast, the inference returned using standard numerical methods (Euler–Maruyama discretisation) fails. The proposed ingredients can be incorporated into any type of ABC algorithm and directly applied to all SDEs that are characterised by an invariant distribution and for which a measure-preserving numerical method can be derived.
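For readers unfamiliar with ABC, the following is a minimal sketch of the basic acceptance-rejection variant on a toy Gaussian model. It does not implement the paper's invariant-density summaries or measure-preserving splitting schemes, and it does not simulate an SDE. All names (`abc_rejection`, `simulate`, `prior_sample`) and the toy settings are assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, simulate, summary, prior_sample,
                  n_draws, tol, seed=0):
    """Basic rejection ABC: keep prior draws whose simulated summary
    lies within `tol` of the observed summary."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)              # draw a parameter from the prior
        s_sim = summary(simulate(theta, rng))  # simulate synthetic data, summarise
        if abs(s_sim - s_obs) <= tol:          # distance between summaries
            accepted.append(theta)
    return accepted


# Toy model (assumption): i.i.d. Gaussian data with unknown mean, unit variance.
def simulate(theta, rng, n=50):
    return [rng.gauss(theta, 1.0) for _ in range(n)]


# Uniform prior on the mean (assumption).
def prior_sample(rng):
    return rng.uniform(-5.0, 5.0)


observed = simulate(2.0, random.Random(1))  # "data" generated with mean 2.0
posterior = abc_rejection(observed, simulate, statistics.mean,
                          prior_sample, n_draws=5000, tol=0.1)
print(len(posterior), statistics.mean(posterior))
```

In this sketch the summary statistic is simply the sample mean. The abstract's point is that for SDEs the choice of summary and of the simulation scheme inside `simulate` largely determines whether the accepted draws are reliable.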

