MWMOTE optimization for imbalanced data using complete linkage

Imbalanced data causes classification errors and can degrade the performance and accuracy of oversampling methods such as MWMOTE. Clustering in MWMOTE can be optimized to improve synthetic data generation and, in turn, MWMOTE performance. This study aims to optimize the clustering step that MWMOTE uses when generating synthetic data by applying complete linkage (CL). The datasets cover a variety of class ratios to test the handling of imbalanced data. A decision tree is used to measure the performance of MWMOTE and CL-MWMOTE oversampling. In the evaluation, CL-MWMOTE performs well and increases precision by 0.53 %, recall by 0.66 %, accuracy by 0.67 %, and F-measure by 0.65 %.
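The idea of clustering minority samples with complete linkage before generating synthetic points can be illustrated with a minimal sketch. This is not the authors' CL-MWMOTE implementation: it omits MWMOTE's weighting of informative minority samples and picks seed points uniformly at random. The function names `complete_linkage` and `oversample`, the distance `threshold`, and the toy data are all assumptions made for illustration.

```python
import random
from itertools import combinations
from math import dist


def complete_linkage(points, threshold):
    """Agglomerative clustering where the distance between two clusters
    is the LARGEST pairwise distance between their members."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters under complete linkage.
        (a, b), d = min(
            (((a, b), linkage(clusters[a], clusters[b]))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:  # hypothetical stopping rule
            break
        clusters[a] += clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


def oversample(points, clusters, n_new, rng):
    """Create synthetic points by interpolating between two minority
    points drawn from the same cluster (uniform choice, no weighting)."""
    synthetic = []
    for _ in range(n_new):
        members = rng.choice(clusters)
        p = points[rng.choice(members)]
        q = points[rng.choice(members)]
        alpha = rng.random()
        synthetic.append(tuple(x + alpha * (y - x) for x, y in zip(p, q)))
    return synthetic


# Toy minority class with two well-separated groups.
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
clusters = complete_linkage(minority, threshold=2.0)
synthetic = oversample(minority, clusters, n_new=4, rng=random.Random(0))
```

Because both endpoints of each interpolation come from the same cluster, no synthetic point falls in the empty region between the two groups, which is the motivation for clustering before oversampling.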

Machine learning based Synthetic Data Generation using Iterative Regression Analysis

2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA) ◽

10.1109/iceca49313.2020.9297491 ◽

2020 ◽

Author(s):

Sanskar Shah ◽

Darshan Gandhi ◽

Jil Kothari

Keyword(s):

Machine Learning ◽

Regression Analysis ◽

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation

Synthetic data generation of high-resolution hyperspectral data using DIRSIG

10.1117/12.735264 ◽

2007 ◽

Cited By ~ 2

Author(s):

Marek K. Jakubowski ◽

David Pogorzala ◽

Timothy J. Hattenberger ◽

Scott D. Brown ◽

John R. Schott

Keyword(s):

High Resolution ◽

Synthetic Data ◽

Hyperspectral Data ◽

Data Generation ◽

Synthetic Data Generation

Synthetic Data Generation to Support Irregular Sampling in Sensor Networks

GeoSensor Networks ◽

10.1201/9780203356869.ch12 ◽

2004 ◽

pp. 211-234 ◽

Cited By ~ 2

Author(s):

Lewis Girod ◽

Ramesh Govindan ◽

Deepak Ganesan ◽

Deborah Estrin ◽

Yan Yu

Keyword(s):

Sensor Networks ◽

Synthetic Data ◽

Irregular Sampling ◽

Data Generation ◽

Synthetic Data Generation

Instance Segmentation in CARLA: Methodology and Analysis for Pedestrian-oriented Synthetic Data Generation in Crowded Scenes

10.1109/iccvw54120.2021.00115 ◽

2021 ◽

Author(s):

Maria Lyssenko ◽

Christoph Gladisch ◽

Christian Heinzemann ◽

Matthias Woehrle ◽

Rudolph Triebel

Keyword(s):

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation ◽

Crowded Scenes ◽

Instance Segmentation

Synthetic Data Generation Capabilties for Testing Data Mining Tools

MILCOM 2006 ◽

10.1109/milcom.2006.302440 ◽

2006 ◽

Cited By ~ 7

Author(s):

Daniel Jeske ◽

Pengyue Lin ◽

Carlos Rendon ◽

Rui Xiao ◽

Behrokh Samadi

Keyword(s):

Data Mining ◽

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation ◽

Testing Data ◽

Mining Tools

When does Synthetic Data Generation Work?

2021 29th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu53274.2021.9477956 ◽

2021 ◽

Author(s):

Ahmet Topal ◽

Mehmet Fatih Amasyali

Keyword(s):

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation

A Synthetic Data Generation Model for Diabetic Foot Treatment

Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications - Communications in Computer and Information Science ◽

10.1007/978-981-33-4370-2_18 ◽

2020 ◽

pp. 249-264

Author(s):

Jayun Hyun ◽

Seo Hu Lee ◽

Ha Min Son ◽

Ji-Ung Park ◽

Tai-Myoung Chung

Keyword(s):

Diabetic Foot ◽

Synthetic Data ◽

Generation Model ◽

Data Generation ◽

Synthetic Data Generation

Ground penetrating radar measurements: Applications to synthetic data generation and target characterization

2010 IEEE International Geoscience and Remote Sensing Symposium ◽

10.1109/igarss.2010.5650683 ◽

2010 ◽

Cited By ~ 1

Author(s):

Naomi R. Schwartz ◽

Amir I. Zaghloul

Keyword(s):

Ground Penetrating Radar ◽

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation ◽

Target Characterization ◽

Radar Measurements ◽

Ground Penetrating

Spectral density-based and measure-preserving ABC for partially observed diffusion processes. An illustration on Hamiltonian SDEs

Statistics and Computing ◽

10.1007/s11222-019-09909-6 ◽

2019 ◽

Vol 30 (3) ◽

pp. 627-648 ◽

Cited By ~ 1

Author(s):

Evelyn Buckwar ◽

Massimiliano Tamborrino ◽

Irene Tubikanec

Keyword(s):

Numerical Methods ◽

Spectral Density ◽

Diffusion Processes ◽

Model Simulation ◽

Broad Class ◽

Synthetic Data ◽

Summary Statistics ◽

Data Generation ◽

Synthetic Data Generation ◽

Partially Observed

Abstract: Approximate Bayesian computation (ABC) has become one of the major tools of likelihood-free statistical inference in complex mathematical models. Simultaneously, stochastic differential equations (SDEs) have developed into an established tool for modelling time-dependent, real-world phenomena with underlying random effects. When applying ABC to stochastic models, two major difficulties arise. First, the derivation of effective summary statistics and proper distances is particularly challenging, since simulations from the stochastic process under the same parameter configuration result in different trajectories. Second, exact simulation schemes to generate trajectories from the stochastic model are rarely available, requiring the derivation of suitable numerical methods for the synthetic data generation. To obtain summaries that are less sensitive to the intrinsic stochasticity of the model, we propose to build the statistical method (e.g. the choice of the summary statistics) on the underlying structural properties of the model. Here, we focus on the existence of an invariant measure and we map the data to their estimated invariant density and invariant spectral density. Then, to ensure that these model properties are kept in the synthetic data generation, we adopt measure-preserving numerical splitting schemes. The derived property-based and measure-preserving ABC method is illustrated on the broad class of partially observed Hamiltonian type SDEs, both with simulated data and with real electroencephalography data. The derived summaries are particularly robust to the model simulation, and this fact, combined with the proposed reliable numerical scheme, yields accurate ABC inference. In contrast, the inference returned using standard numerical methods (Euler–Maruyama discretisation) fails.
The proposed ingredients can be incorporated into any type of ABC algorithm and directly applied to all SDEs that are characterised by an invariant distribution and for which a measure-preserving numerical method can be derived.
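The general pattern of ABC with a summary tied to the invariant distribution can be sketched on a toy model. This is not the method of the paper: it swaps in a one-dimensional Ornstein-Uhlenbeck process (not a Hamiltonian SDE), uses its exact transition in place of a splitting scheme, and reduces the invariant-density summary to a single sample variance. The names `simulate_ou` and `abc_rejection`, the uniform prior, and the tolerance `eps` are assumptions chosen for illustration.

```python
import math
import random


def simulate_ou(theta, sigma, n_steps, h, rng):
    """Simulate dX = -theta * X dt + sigma dW with the exact transition,
    which preserves the invariant law N(0, sigma^2 / (2 * theta))."""
    stat_sd = sigma / math.sqrt(2.0 * theta)
    decay = math.exp(-theta * h)
    noise_sd = stat_sd * math.sqrt(1.0 - decay * decay)
    x = rng.gauss(0.0, stat_sd)  # start in the invariant distribution
    path = []
    for _ in range(n_steps):
        x = decay * x + noise_sd * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def variance(xs):
    """Sample variance, used as a crude summary of the invariant density."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def abc_rejection(observed, sigma, h, n_draws, eps, rng):
    """Keep prior draws of theta whose simulated summary is within eps
    of the observed summary."""
    s_obs = variance(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.2, 3.0)  # hypothetical prior
        s_sim = variance(simulate_ou(theta, sigma, len(observed), h, rng))
        if abs(s_sim - s_obs) < eps:
            accepted.append(theta)
    return accepted


rng = random.Random(1)
observed = simulate_ou(theta=1.0, sigma=1.0, n_steps=2000, h=0.1, rng=rng)
posterior = abc_rejection(observed, sigma=1.0, h=0.1, n_draws=300, eps=0.05, rng=rng)
posterior_mean = sum(posterior) / len(posterior)
```

The summary here depends only on the long-run distribution of the path, not on the particular trajectory, which is why two simulations under the same parameter can still be compared meaningfully.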

Synthetic Data Generation to Mitigate the Low/No-Shot Problem in Machine Learning

2019 IEEE Applied Imagery Pattern Recognition Workshop (AIPR) ◽

10.1109/aipr47015.2019.9174596 ◽

2019 ◽

Cited By ~ 1

Author(s):

Emily E. Berkson ◽

Jared D. VanCor ◽

Steven Esposito ◽

Gary Chern ◽

Mark Pritt

Keyword(s):

Machine Learning ◽

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation
