A novel hybrid surrogate intelligent model for creep index prediction based on particle swarm optimization and random forest

2020, Vol 265, pp. 105328
Author(s): Pin Zhang, Zhen-Yu Yin, Yin-Fu Jin, Tommy H.T. Chan

2021, pp. 1-15
Author(s): Zhaozhao Xu, Derong Shen, Yue Kou, Tiezheng Nie

Because medical data are high-dimensional and their features are strongly correlated, classification accuracy is often lower than expected. Feature selection is a common way to address this problem: it selects effective features and thereby reduces the dimensionality of the data. However, traditional feature selection algorithms set their thresholds blindly, and their search algorithms are liable to fall into local optima. To address this, this paper proposes a hybrid feature selection algorithm that combines ReliefF and particle swarm optimization (PSO). The algorithm has three parts. First, ReliefF calculates the feature weights, and the features are ranked by weight. Next, the ranked features are grouped by density equalization, so that the density of features in each group is the same. Finally, PSO searches the ranked feature groups, and feature selection is performed according to a new fitness function. Experimental results show that random forest achieves the highest classification accuracy on the selected features and, more importantly, does so with the fewest features. In addition, experimental results on 2 medical datasets show that the average accuracy of random forest reaches 90.20%, which suggests that the hybrid algorithm has practical application value.
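
The final search step described above can be illustrated with a minimal binary PSO sketch, in which each bit of a particle's position marks one ranked feature group as selected or not. This is not the authors' implementation: the abstract does not specify the fitness function, the PSO variant, or any parameter values, so `binary_pso`, `toy_fitness`, `GROUP_GAIN`, `PENALTY`, and the constants `w`, `c1`, `c2` are all assumptions made for illustration. ReliefF weighting and density-equalized grouping are not implemented here; the made-up `GROUP_GAIN` values merely stand in for how much each group would help a classifier.

```python
import math
import random


def binary_pso(fitness, n_bits, n_particles=12, n_iters=40, seed=0):
    """Maximize fitness(mask) over 0/1 masks of length n_bits with binary PSO."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration constants (assumed values)

    # Random initial positions (bit masks) and zero velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid never saturates
                vel[i][d] = v
                # Sigmoid transfer: the velocity sets the probability that the bit is 1.
                pos[i][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical stand-in for a classifier-based fitness: made-up usefulness per
# feature group, minus a penalty for every group that is kept.
GROUP_GAIN = [0.30, 0.25, 0.02, 0.01, 0.20, 0.00]
PENALTY = 0.05


def toy_fitness(mask):
    return sum(g for g, m in zip(GROUP_GAIN, mask) if m) - PENALTY * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_bits=len(GROUP_GAIN))
print(best_mask, round(best_fit, 3))
```

In a real experiment the toy fitness would be replaced by something like cross-validated classifier accuracy on the selected groups, possibly combined with a term that rewards smaller feature sets; the penalty form used here is only one plausible choice.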


2020, Vol 6 (2), pp. 230-239
Author(s): Richky Faizal Amir, Irwan Agus Sobari, Rousyati Rousyati

Abstract: Software metrics datasets are generally imbalanced. Class imbalance in a dataset can reduce the performance of software defect prediction models, because such models tend to predict the majority class for minority-class instances. This study uses the National Aeronautics and Space Administration (NASA) Metrics Data Program (MDP) dataset. In the pre-processing stage, the Particle Swarm Optimization (PSO) method is proposed to address the attribute problem in the training data, and the Random Over Sampling (ROS) resampling method is proposed to deal with class imbalance. This study proposes that the Random Forest method combined with Adaboost can estimate the defect level of software from training data. The results indicate that the Resampling + Adaboost + Random Forest algorithm can predict software defects with an average accuracy of 94.70% and an AUC of 0.939, while the PSO + Random Forest algorithm reaches an average accuracy of only 89.60% and an AUC of 0.636; the two models differ by 5.10 percentage points in accuracy and by 0.303 in AUC. Statistical tests show a significant difference between the proposed model and the Random Forest model, with a p-value (0.036) smaller than the alpha value (0.05).

Keywords: Imbalanced Class, Resample, Particle Swarm Optimization, Random Forest, Adaboost, Software Defect
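
The Random Over Sampling (ROS) step mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `random_over_sample`, the two-column toy rows, and the class labels are all hypothetical, and the abstract does not say which implementation or sampling ratio the study used. The sketch duplicates randomly chosen minority-class rows until every class has as many rows as the largest class.

```python
import random
from collections import Counter


def random_over_sample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes match the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows of this class to draw duplicates from.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made up): five "clean" modules and one "defective" module.
rows = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50], [9, 90]]
labels = ["clean", "clean", "clean", "clean", "clean", "defective"]

bal_rows, bal_labels = random_over_sample(rows, labels)
print(Counter(bal_labels))  # both classes now have 5 rows
```

Oversampling is normally applied to the training split only, so that duplicated rows cannot leak into the data used for evaluation.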

