COMBINATION OF SYNTHETIC MINORITY OVERSAMPLING TECHNIQUE (SMOTE) AND BACKPROPAGATION NEURAL NETWORK TO CONTRACEPTIVE IUD PREDICTION

2020, Vol 13 (1), pp. 36-46
Author(s): Mustaqim Mustaqim, Budi Warsito, Bayu Surarso

Data imbalance occurs when one class contains far more samples than another: the majority class has more data and the minority class has fewer. Class imbalance degrades the performance of classification algorithms. Data on IUD contraceptive use are imbalanced: nationally, 959 of 27,400 IUD users (3.5%) experienced IUD failure in 2018. The synthetic minority oversampling technique (SMOTE) is used to balance the IUD failure data, and the balanced data are then classified with a neural network. The system predicts whether an IUD user becomes pregnant. The study uses 250 records: 235 in the majority class (not pregnant) and 15 in the minority class (pregnant). The 250 records are split into 225 training records and 25 testing records. The minority class in the training data is oversampled by 1524% so that it balances the majority class. The prediction accuracy reaches 99.9% at 1,000 epochs.
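
The oversampling step can be sketched as follows. This is a minimal illustration of the core SMOTE idea (a synthetic point is interpolated between a minority sample and one of its nearest minority neighbours), not the authors' implementation. The function name `smote_sketch`, the parameter `k`, and the toy feature vectors are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats) from the minority class.
    k: number of nearest minority neighbours considered (hypothetical default).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (values are made up).
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 3.0]]
new_points = smote_sketch(minority, n_synthetic=6)
print(len(new_points))
```

Because every synthetic point lies on a segment between two real minority samples, the new data stay inside the region the minority class already occupies instead of repeating identical records.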

2019, Vol 5 (2), pp. 128
Author(s): Mustaqim Mustaqim, Budi Warsito, Bayu Surarso

Combination of Synthetic Minority Oversampling Technique (SMOTE) and Backpropagation Neural Network to handle imbalanced class in predicting the use of contraceptive implants

Contraceptive implant failure is a pregnancy that occurs while a woman is using the implant correctly. In 2018, 1,852 of 41,947 implant users nationally (4%) experienced such a failure. Because failures are far rarer than successes, the classes are imbalanced, which makes failure difficult to predict. Class imbalance occurs when one class contains more data than another: the majority class has more samples and the minority class has fewer. Classification algorithms perform worse when the classes are imbalanced. The Synthetic Minority Oversampling Technique (SMOTE) was used to balance the implant failure data. SMOTE handles imbalanced classes with better accuracy than other oversampling methods because it reduces overfitting. The balanced data were then classified with a backpropagation neural network. The prediction system detects whether a woman using a contraceptive implant becomes pregnant. The study used 300 records: 285 in the majority class (not pregnant) and 15 in the minority class (pregnant). The records were split into 270 training records and 30 testing records. The training set contained 257 majority records and 13 minority records. The minority training records were oversampled to match the majority class, giving 514 training records: 257 majority, 13 original minority, and 244 synthetic minority. The prediction system reached an accuracy of 96.1% at both the 500th and 1,000th epochs. The combination of SMOTE and a backpropagation neural network was shown to predict well on imbalanced classes.
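
The balancing arithmetic reported in this abstract can be checked with a short sketch. The helper `balancing_plan` and its field names are hypothetical; only the counts 257 and 13 come from the abstract.

```python
def balancing_plan(n_major, n_minor):
    """Return how many synthetic minority samples are needed to match the majority class."""
    n_synthetic = n_major - n_minor
    return {
        "synthetic_minor": n_synthetic,
        "total_training": 2 * n_major,
        # Synthetic samples generated per original minority sample, in percent.
        "oversampling_percent": round(100 * n_synthetic / n_minor, 1),
    }


plan = balancing_plan(n_major=257, n_minor=13)
print(plan)
```

With 257 majority and 13 minority training records, the sketch yields 244 synthetic minority records and 514 training records in total, which agrees with the figures stated above.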


Author(s): S. K. Gupta, M. Jhunjhunwalla, A. Bhardwaj, D. P. Shukla

Abstract. Machine learning methods such as artificial neural networks and support vector machines require a large amount of training data; however, the number of landslide occurrences in a study area is limited. This leads to a small number of positive-class pixels in the training data, whereas the number of non-landslide (negative-class) pixels is enormous. The under-represented data and the severe skew in class distribution create a data imbalance for learning algorithms and yield suboptimal models that are biased towards the majority class (non-landslide pixels) and perform poorly on the minority class (landslide pixels). In this work, we used two algorithms, EasyEnsemble and BalanceCascade, to balance the data. The balanced data are used with feature selection methods such as Fisher discriminant analysis (FDA), logistic regression (LR) and artificial neural network (ANN) to generate LSZ maps. The results show that ANN with balanced data yields major improvements in the preparation of susceptibility maps over imbalanced data, whereas the LR method is adversely affected by the data balancing algorithms. The FDA does not show significant changes between balanced and imbalanced data.
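
The sampling idea behind EasyEnsemble can be sketched as below: several balanced subsets are built, each pairing all minority samples with an equally sized random draw from the majority class, and one classifier is then trained per subset. The sketch covers only the subset construction; the function name, the number of subsets, and the pixel identifiers are assumptions, not details taken from the study.

```python
import random


def easy_ensemble_subsets(majority, minority, n_subsets=4, seed=0):
    """Build balanced training subsets: all minority samples plus an equally
    sized random draw (without replacement) from the majority class."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, len(minority))
        subsets.append({"negative": drawn, "positive": list(minority)})
    return subsets


# Hypothetical pixel IDs: 1000 non-landslide pixels, 20 landslide pixels.
non_landslide = [f"n{i}" for i in range(1000)]
landslide = [f"p{i}" for i in range(20)]
subsets = easy_ensemble_subsets(non_landslide, landslide)
print([(len(s["negative"]), len(s["positive"])) for s in subsets])
```

BalanceCascade follows a similar scheme but, as usually described, removes majority samples that earlier classifiers already handle correctly before drawing the next subset; that step is not shown here.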


Mekatronika, 2019, Vol 1 (1), pp. 80-86
Author(s): Ooi Peng Toon, Muhammad Aizzat Zakaria, Ahmad Fakhri Ab. Nasir, Anwar P.P. Abdul Majeed, Chung Young Tan, ...

Solanum lycopersicum, commonly known as tomato, originated in South America and is now grown in many tropical countries. Its nutritional value has made it one of the foods in demand among people in Malaysia as their lifestyle has shifted toward healthier eating. Because export value and production have increased over the past few years, the fruit-picking process requires a large labour force. Farmers therefore increasingly look to automation to address the labour problems and high costs they face. To pick the correct fruit within a cluster, a harvesting robot needs guidance so that it can detect the fruit accurately. This study proposes a new classification algorithm based on deep learning, specifically a convolutional neural network, that first classifies an image as tomato or not tomato and then classifies a tomato image as ripe or unripe. Two classification networks are therefore used: one for tomato versus not tomato and one for ripe versus unripe tomato. Each network uses 600 training data and 33 testing data. The accuracies obtained by network 1 (tomato or not tomato) and network 2 (ripe or unripe tomato) are 76.366% and 98.788%, respectively.


Author(s): Widya Tri Charisma Gultom, Anjar Wanto, Indra Gunawan, Muhammad Ridwan Lubis, Ika Okta Kirana

Criminality is an act that violates the law and can disturb and harm society, both economically and psychologically. The number of crimes varies over time and cannot be known in advance, which makes it difficult for the police to tackle criminal acts. With this research, the police can estimate the number of criminals expected from the predictions made, and so can work to prevent crime and increase security in Pematangsiantar City. This study uses an artificial neural network with the Levenberg-Marquardt method. The research data come from the Pematangsiantar Police Criminal Investigation Agency (Reskrim) for 2014-2019. The data are divided into two parts: training data and testing data. Five architectural models are used in this study: 3-30-1, 3-31-1, 3-32-1, 3-36-1 and 3-38-1. Of these, the best architecture is 3-36-1, with an accuracy of 85%, an MSE of 0.1465119, and a maximum of 10,000 iterations. The results from the best architecture are 85% with 394 criminals for 2020, 62% with 238 criminals for 2021, and 69% with 170 criminals for 2022, so this model is good for predicting the number of crimes in Pematangsiantar City.
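
To make the architecture labels concrete, the sketch below counts the trainable parameters of each model. It assumes that a label such as 3-36-1 means 3 input neurons, 36 hidden neurons and 1 output neuron in a fully connected network with one bias per neuron; the abstract does not define the notation, so this reading and the helper `count_parameters` are assumptions.

```python
def count_parameters(architecture):
    """Count weights and biases of a fully connected network given as 'in-hidden-out'."""
    sizes = [int(n) for n in architecture.split("-")]
    # Each layer contributes (inputs * outputs) weights plus one bias per output neuron.
    return sum(a * b + b for a, b in zip(sizes, sizes[1:]))


for arch in ["3-30-1", "3-31-1", "3-32-1", "3-36-1", "3-38-1"]:
    print(arch, count_parameters(arch))
```

Under this reading, the five candidates differ only in the width of the single hidden layer, so comparing them amounts to a small search over hidden-layer size.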


Author(s): Yuli Andriani, Anjar Wanto, Handrizal Handrizal

Predictions are used to determine the rate of increase or decrease in oil palm production at PT. Kerasaan Indonesia (KRE) in the future. This study uses Artificial Neural Networks (ANN) with the Levenberg-Marquardt method. The research data are secondary data from PT. Kerasaan Indonesia covering 2002 to 2017. The data are divided into two parts: training data and testing data. Five architectural models are used in this study: 7-10-1, 7-20-1, 7-30-1, 7-40-1 and 7-50-1. Of these, the best architecture is 7-50-1, with an accuracy of 83%, an MSE of 1.1471332321 and a maximum of 1,000 iterations. This model is therefore good for predicting oil palm production at PT. Kerasaan Indonesia, because its accuracy lies between 80% and 90%.


Author(s): Zulfikar Zulfikar, Anjar Wanto, Zulaini Masruro Nasution

The Wholesale Price Index (IHPB) is an economic indicator that contains index numbers and shows changes in the price of goods purchased by traders from consumers. This study uses Artificial Neural Networks (ANN) with the Backpropagation method. Artificial neural networks are a branch of artificial intelligence that imitates the workings of the human brain. The data are secondary data from the Central Statistics Agency (BPS) covering 2000 to 2017. The data are divided into two parts: training data and testing data. Five architectural models are used in this study: 8-15-1, 8-25-1, 8-26-1, 8-30-1 and 8-40-1. Of these, the best model is 8-25-1, with an accuracy of 85%, an MSE of 0.00100074 and 10,000 iterations. This model is therefore good for predicting future wholesale price indexes by sector in Indonesia.


1992, Vol 26 (9-11), pp. 2461-2464
Author(s): R. D. Tyagi, Y. G. Du

A steady-state mathematical model of an activated sludge process with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for operational prediction and determination.


2011, Vol 189-193, pp. 2042-2045
Author(s): Shang Jen Chuang, Chiung Hsing Chen, Chien Chih Kao, Fang Tsung Liu

The Hopfield Neural Network cannot recognize English letters that contain more than 50% noise. This paper proposes a new method to improve the recognition rate of the Hopfield Neural Network by adding a Gaussian distribution feature to it: a Gaussian filter eliminates noise and thereby improves the network's recognition rate. We use the English letters 'A' to 'Z' as training data. Noise levels from 0% to 100% were generated randomly for the testing data. We first apply the Gaussian filter to eliminate noise and then recognize the test pattern with the Hopfield Neural Network. We found that letters containing between 50% and 53% noise are either recognized as the reversed pattern or cannot be recognized [6]. We therefore propose using multiple filters to improve the recognition rate when letters contain between 50% and 53% noise.
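
The reversal effect around 50% noise can be reproduced with a very small Hopfield model. The sketch below is a simplification under stated assumptions: it stores a single made-up 16-element bipolar pattern (the paper stores the letters 'A' to 'Z'), uses Hebbian weights with synchronous updates, flips a fixed set of elements instead of random noise, and applies no Gaussian filter. All names are hypothetical.

```python
def recall(pattern, state, steps=5):
    """Synchronous Hopfield recall with Hebbian weights from one stored bipolar pattern."""
    n = len(pattern)
    for _ in range(steps):
        new_state = []
        for i in range(n):
            # Weight w_ij = pattern[i] * pattern[j], with no self-connection (j != i).
            field = sum(pattern[i] * pattern[j] * state[j] for j in range(n) if j != i)
            new_state.append(1 if field >= 0 else -1)
        state = new_state
    return state


def flip_first(pattern, k):
    """Return a copy of the pattern with its first k elements inverted (deterministic noise)."""
    return [-v if i < k else v for i, v in enumerate(pattern)]


stored = [1, -1, 1, 1, -1, -1, 1, -1, 1, 1, -1, 1, -1, -1, 1, -1]  # 16 bipolar "pixels"
below_half = recall(stored, flip_first(stored, 7))  # 7/16 flipped, about 44% noise
above_half = recall(stored, flip_first(stored, 9))  # 9/16 flipped, about 56% noise
print(below_half == stored)
print(above_half == [-v for v in stored])
```

In this toy setting, a probe with fewer than half of its elements flipped settles on the stored pattern, while a probe with more than half flipped settles on the inverted pattern, because the inverse of a stored pattern is also a stable state of the network.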


2020, Vol 10 (6), pp. 2104
Author(s): Michał Tomaszewski, Paweł Michalski, Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images with the application of a limited quantity of input. The use of a limited set of learning data was made possible by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article compares the detection results of the most popular deep neural networks while maintaining a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) could be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The research conducted suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model achieved a worse AP result; however, the number of input samples has a significantly lower influence on its results than in the case of other CNN models, which, in the authors' assessment, is a desired feature in the case of a limited training set.
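
For readers unfamiliar with the AP metric, the sketch below computes a non-interpolated average precision from ranked detections. The abstract does not state which AP protocol was used (for example, which interpolation or overlap threshold), so this is one common variant shown under that assumption; the detections are made up.

```python
def average_precision(detections, n_ground_truth):
    """Non-interpolated AP.

    detections: list of (confidence, is_true_positive) pairs, one per detected box.
    n_ground_truth: number of annotated objects in the evaluation set.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision measured at the rank where this object was found.
            precision_sum += true_positives / rank
    return precision_sum / n_ground_truth


# Made-up detections for illustration: 3 annotated insulators, 4 detections.
detections = [(0.95, True), (0.90, False), (0.80, True), (0.60, True)]
print(round(average_precision(detections, n_ground_truth=3), 3))
```

A detector scores high on this metric only when its most confident detections are correct and it finds most of the annotated objects, which is why AP is a convenient single number for comparing networks trained on the same small set.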


Author(s): Brian Bucci, Jeffrey Vipperman

Extending previous methods that identify military impulse noise in civilian environmental noise monitoring by feeding a set of computed scalar metrics to artificial neural network structures, this work investigates Bayesian methods for classifying the same dataset. Four interesting cases are identified and analyzed: A) maximum accuracy achieved on training data, B) maximum overall accuracy on blind testing data, C) maximum accuracy on testing data with zero false positive detections, and D) maximum accuracy on testing data with zero false negative rejections. The first case serves as an illustrative example, and the latter three represent actual monitoring modes. All of the cases are compared and contrasted to illuminate their respective strengths and weaknesses. Overall accuracies of up to 99.8% are observed with no false negative rejections, and accuracies of up to 98.4% are achieved with no false positive detections.
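
Cases C and D can be illustrated with a simple threshold on a classifier score. The sketch assumes the classifier outputs one score per event (for example a posterior probability) and that an event is labelled as military impulse noise when its score reaches a threshold; the scores and function names are invented for illustration and are not the study's data or method.

```python
def accuracy_at(threshold, positives, negatives):
    """Accuracy when a score >= threshold is labelled as military impulse noise."""
    tp = sum(s >= threshold for s in positives)
    tn = sum(s < threshold for s in negatives)
    return (tp + tn) / (len(positives) + len(negatives))


def zero_false_positive_threshold(positives, negatives):
    """Lowest positive score that still lies above every negative score."""
    candidates = [s for s in positives if s > max(negatives)]
    return min(candidates) if candidates else float("inf")


def zero_false_negative_threshold(positives):
    """Highest threshold that still accepts every positive sample."""
    return min(positives)


# Made-up classifier scores, not data from the study.
impulse = [0.9, 0.8, 0.4]
other = [0.5, 0.2, 0.1]
t_fp = zero_false_positive_threshold(impulse, other)
t_fn = zero_false_negative_threshold(impulse)
print(t_fp, accuracy_at(t_fp, impulse, other))
print(t_fn, accuracy_at(t_fn, impulse, other))
```

The two thresholds trade one error type for the other: the stricter one rejects some true impulse events to avoid any false alarm, and the looser one accepts some false alarms to avoid missing any impulse event.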

