Penerapan PSO Over Sampling Dan Adaboost Random Forest Untuk Memprediksi Cacat Software (Application of PSO, Over Sampling, and AdaBoost Random Forest to Predict Software Defects)

2020 ◽  
Vol 6 (2) ◽  
pp. 230-239
Author(s):  
Richky Faizal Amir ◽  
Irwan Agus Sobari ◽  
Rousyati Rousyati

Abstract: Software metrics datasets are generally imbalanced. Class imbalance can reduce the performance of software defect prediction models, because classifiers tend to assign minority-class instances to the majority class. This study uses the National Aeronautics and Space Administration (NASA) Metrics Data Program (MDP) datasets. In the pre-processing stage, Particle Swarm Optimization (PSO) is proposed to address attribute problems in the training data, and Random Over Sampling (ROS) resampling is used to deal with class imbalance. The study proposes that Random Forest combined with AdaBoost can estimate the defect-proneness of software from training data. The results indicate that the Resampling + AdaBoost + Random Forest algorithm can predict software defects with an average accuracy of 94.70% and an AUC of 0.939, while the PSO + Random Forest algorithm achieves an average accuracy of only 89.60% and an AUC of 0.636. The two models therefore differ by 5.10 percentage points in accuracy and by 0.303 in AUC. A statistical test gives a p-value of 0.036, smaller than the alpha value of 0.05, which means there is a significant difference between the proposed model and the Random Forest model.

Keywords: Imbalanced Class, Resample, Particle Swarm Optimization, Random Forest, Adaboost, Software Defect
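
The Random Over Sampling step named in this abstract can be sketched in plain Python. This is a minimal illustration under assumptions, not the authors' implementation: the function name `random_over_sample`, the toy metric values, and the fixed seed are hypothetical, and the abstract does not say how the resampling was configured.

```python
import random
from collections import Counter


def random_over_sample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class that are available for duplication.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 6 non-defective modules (0) and 2 defective ones (1).
rows = [[10, 1], [12, 2], [9, 1], [11, 1], [13, 2], [8, 1], [40, 9], [35, 7]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_over_sample(rows, labels)
print(Counter(balanced_labels))
```

In practice only the training split would normally be resampled, so that duplicated rows do not leak into the data used for evaluation.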

2021 ◽  
pp. 1-15
Author(s):  
Zhaozhao Xu ◽  
Derong Shen ◽  
Yue Kou ◽  
Tiezheng Nie

Because medical data are high-dimensional and their features are strongly correlated, classification accuracy on such data is often lower than expected. Feature selection is a common approach to this problem: it selects effective features and thereby reduces the dimensionality of the data. However, traditional feature selection algorithms set thresholds blindly, and their search procedures are liable to fall into local optima. To address this, the paper proposes a hybrid feature selection algorithm that combines ReliefF and Particle Swarm Optimization. The algorithm has three parts. First, ReliefF calculates a weight for each feature, and the features are ranked by weight. Then the ranked features are grouped by density equalization, so that the density of features in each group is the same. Finally, Particle Swarm Optimization searches the ranked feature groups, and features are selected according to a new fitness function. Experimental results show that random forest achieves the highest classification accuracy on the selected features and, more importantly, does so with the fewest features. In addition, experimental results on 2 medical datasets show that the average accuracy of random forest reaches 90.20%, which indicates that the hybrid algorithm has practical application value.
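
The first step, weighting and ranking features, can be illustrated with a simplified sketch. Note the substitution: the code below is basic Relief with a single nearest hit and a single nearest miss, not the full ReliefF used in the paper, and it does not cover the density-equalization grouping or the PSO search. The function name `relief_weights` and the toy data are hypothetical.

```python
def relief_weights(X, y):
    """Basic Relief: reward features that differ on the nearest miss and
    penalise features that differ on the nearest hit."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to scale differences into [0, 1].
    span = []
    for j in range(d):
        column = [row[j] for row in X]
        span.append((max(column) - min(column)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / span[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            weights[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return weights


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.2], [0.9, 0.8], [1.0, 0.4]]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
ranking = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
print(weights, ranking)
```

On this toy data the informative feature receives the larger weight and is ranked first, which is the ordering the later grouping and search steps would start from.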


2018 ◽  
Vol 18 (2) ◽  
pp. 36-50
Author(s):  
Samira Bordbar ◽  
Pirooz Shamsinejad

Opinion Mining, or Sentiment Analysis, is the task of extracting people's final opinion about something from their unstructured sentiments. The Opinion Mining process is as follows: first, the product features that are most important to a user are extracted from his or her comments. Then the sentiments are classified according to their emotional implications. In this paper we propose an opinion classification method based on Fuzzy Logic. Up to now, only a few methods have taken advantage of fuzzy logic in opinion classification, and all of them have imported fuzzy rules into the system as background knowledge. The main challenge, however, is finding the fuzzy rules. Our contribution is to extract the fuzzy rules and their parameters automatically from training data, using the Particle Swarm Optimization (PSO) algorithm. For better results we have also devised a mutation-based PSO. All proposed methods have been implemented and tested on relevant data. The results confirm that our method can reach better accuracy than current state-of-the-art methods in this domain.


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3028 ◽  
Author(s):  
Zina Li ◽  
Lina Qiu ◽  
Ruixin Li ◽  
Zhipeng He ◽  
Jun Xiao ◽  
...  

Electroencephalogram (EEG) signals have been widely used in emotion recognition. However, current EEG-based emotion recognition has low classification accuracy, and its real-time application is limited. To address these issues, we propose an improved feature selection algorithm to recognize subjects' emotional states from EEG signals, and we use this feature selection method to design an online emotion recognition brain-computer interface (BCI) system. Specifically, features of different dimensions were first extracted from the time domain, frequency domain, and time-frequency domain. Then a modified particle swarm optimization (PSO) method with multi-stage linearly-decreasing inertia weight (MLDW) was proposed for feature selection. The MLDW algorithm makes it easy to refine how the inertia weight decreases. Finally, the emotion types were classified by a support vector machine classifier. We extracted different features from the EEG data in the DEAP data set, collected from 32 subjects, to perform two offline experiments. Our results showed that the average accuracy of four-class emotion recognition reached 76.67%. Compared with the latest benchmark, our proposed MLDW-PSO feature selection improves the accuracy of EEG-based emotion recognition. To further validate the MLDW-PSO feature selection method, we developed an online two-class emotion recognition system in which emotions were evoked by Chinese videos; it achieved good performance for 10 healthy subjects, with an average accuracy of 89.5%. The effectiveness of our method was thus demonstrated.
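
The idea of a multi-stage linearly-decreasing inertia weight can be sketched as a piecewise-linear schedule. The abstract does not give the number of stages or their boundaries, so the two stages used here (0.9 to 0.6 over the first half of the run, then 0.6 to 0.4) and the function name `mldw_inertia` are assumptions chosen only for illustration.

```python
def mldw_inertia(t, t_max, stages=((0.5, 0.9, 0.6), (1.0, 0.6, 0.4))):
    """Piecewise-linear inertia weight for PSO iteration t of t_max.

    Each stage is (end_fraction, w_start, w_end): within a stage the weight
    falls linearly from w_start to w_end. The stage values are hypothetical.
    """
    frac = t / t_max
    start = 0.0
    for end, w_start, w_end in stages:
        if frac <= end:
            local = (frac - start) / (end - start)  # progress inside the stage
            return w_start + (w_end - w_start) * local
        start = end
    return stages[-1][2]


# Inertia weight at a few points of a hypothetical 100-iteration run.
for t in (0, 25, 50, 75, 100):
    print(t, round(mldw_inertia(t, 100), 3))
```

A single-stage schedule is the special case `stages=((1.0, w_start, w_end),)`. Adding stages allows a different rate of decrease in different phases of the search.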


2017 ◽  
Vol 16 (03) ◽  
pp. 205-226 ◽  
Author(s):  
Vimal Kumar Pathak ◽  
Amit Kumar Singh

Form error evaluation of manufactured parts is one of the crucial aspects of precision coordinate metrology. As technology advances, noncontact data acquisition techniques are replacing conventional machines such as the coordinate measuring machine (CMM). This paper presents an optimization technique to evaluate minimum zone form errors, namely straightness, circularity, flatness and cylindricity, using a constriction factor-based particle swarm optimization (CFPSO) algorithm. Adding the constriction factor accelerates the convergence of the algorithm. Initially, a simple minimum zone objective function is formulated mathematically for each form error and then optimized using the proposed CFPSO. First, the results of the proposed method for form error evaluation are compared with results from the literature. Next, data obtained from a noncontact 3D scanner are processed, and the results of form error evaluation using the CFPSO algorithm are compared with the results of Steinbichler's INSPECT PLUS software. The proposed CFPSO algorithm was found to be faster and to give better results than other evolutionary techniques such as the genetic algorithm (GA), the previous literature, and the software. Furthermore, to confirm the effectiveness of the proposed method, statistical analysis ([Formula: see text]-test) was performed. For problems of large dimension, CFPSO shows a significant difference in computation time compared with GA. The CFPSO algorithm provides 27.25%, 7.5% and 6.38% improvements in circularity, flatness and cylindricity, respectively, in comparison to RE software results, for determination of minimum zone error. Thus, the methodology presented helps to improve the accuracy and speed of the automated inspection process generally performed by CMMs in industry.
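
A constriction-factor PSO applied to a minimum zone circularity objective can be sketched as follows. The constriction factor is the standard Clerc-Kennedy expression, which requires c1 + c2 > 4. Everything else is assumed for illustration and is not taken from the paper: the objective as the difference between the largest and smallest radius about a candidate centre, the swarm size, the iteration count, the centroid-based initialisation, and the synthetic points.

```python
import math
import random


def constriction_factor(c1=2.05, c2=2.05):
    """Clerc-Kennedy constriction factor; defined for phi = c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4.0:
        raise ValueError("c1 + c2 must exceed 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def circularity_error(center, points):
    """Radial width of the zone about `center` that contains all points."""
    radii = [math.hypot(x - center[0], y - center[1]) for x, y in points]
    return max(radii) - min(radii)


def cfpso_circularity(points, n_particles=20, iters=200,
                      c1=2.05, c2=2.05, seed=0):
    rng = random.Random(seed)
    chi = constriction_factor(c1, c2)
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    spread = max(math.hypot(x - cx, y - cy) for x, y in points)
    # First particle sits on the centroid; the rest are scattered around it.
    pos = [[cx, cy]] + [[cx + 0.1 * spread * rng.uniform(-1, 1),
                         cy + 0.1 * spread * rng.uniform(-1, 1)]
                        for _ in range(n_particles - 1)]
    vel = [[0.0, 0.0] for _ in pos]
    pbest = [p[:] for p in pos]
    pbest_val = [circularity_error(p, points) for p in pos]
    g = min(range(len(pos)), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i, p in enumerate(pos):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = chi * (vel[i][d]
                                   + c1 * r1 * (pbest[i][d] - p[d])
                                   + c2 * r2 * (gbest[d] - p[d]))
                p[d] += vel[i][d]
            val = circularity_error(p, points)
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = p[:], val
                if val < gbest_val:
                    gbest, gbest_val = p[:], val
    return gbest, gbest_val


# Synthetic, hypothetical measurement: a circle of radius 5 about (1, 2)
# with a small periodic form deviation.
angles = [2.0 * math.pi * k / 24 for k in range(24)]
points = [(1.0 + (5.0 + 0.02 * math.sin(7 * a)) * math.cos(a),
           2.0 + (5.0 + 0.02 * math.sin(7 * a)) * math.sin(a)) for a in angles]

center, error = cfpso_circularity(points)
print(round(constriction_factor(), 4), center, error)
```

Because one particle starts on the centroid of the points, the returned error can never be worse than the centroid-based estimate. The swarm only refines it.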


2020 ◽  
Vol 12 (2) ◽  
pp. 168-175
Author(s):  
Sumarni Sumarni ◽  
Suhardi Rustam

The final project is a form of scientific writing that reports the results of a study of a problem, using methods from a particular field of science. Every student in every study program must complete a final project. Before starting to write, each student must choose a topic area, so topic selection is the initial step of the final project. One way to choose a topic is to look at the student's grades in general courses and in concentration courses: the area in which the grades are strongest is a suitable scope for the research topic. This research applies the K-Nearest Neighbor (KNN) method to categorize concentration course grades into research topic areas. All grades in the dataset are classified by KNN, which is then optimized with the Particle Swarm Optimization (PSO) algorithm. The experiment uses training data from students of Universitas Ichsan Gorontalo that have already been classified, and test data consisting of course grades whose categories are not yet known. In the experiments, KNN alone achieved its best accuracy of 72.46% with K=3 and 10 folds, while KNN-PSO achieved its best accuracy of 89.86% with K=3 and 10 folds, which shows that accuracy improves when the optimization algorithm is used.
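
The KNN classification step with K=3 can be sketched in a few lines. This is only an illustration under assumptions: the grade values, the two topic labels, and the function name `knn_predict` are hypothetical, and the PSO stage is omitted because the abstract does not say what PSO optimizes.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows nearest to query
    (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical grades in two concentration courses, with known topic areas.
train_X = [[85, 60], [90, 55], [80, 65], [60, 88], [55, 92], [62, 85]]
train_y = ["software", "software", "software",
           "networking", "networking", "networking"]

print(knn_predict(train_X, train_y, [82, 62]))
print(knn_predict(train_X, train_y, [58, 90]))
```

With two classes, an odd K such as 3 avoids tied votes, which may be one reason small odd values are commonly tried first.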

