Software Defect Prediction via Attention-Based Recurrent Neural Network

To improve software reliability, software defect prediction is applied during software maintenance to identify potential bugs. Traditional methods of software defect prediction mainly focus on designing static code metrics, which are fed into machine learning classifiers to predict the defect probability of the code. However, these hand-crafted metrics do not capture the syntactic structure and semantic information of programs. Such information is more significant than manual metrics and can yield a more accurate predictive model. In this paper, we propose a framework called defect prediction via attention-based recurrent neural network (DP-ARNN). More specifically, DP-ARNN first parses the abstract syntax trees (ASTs) of programs and extracts them as vectors. It then encodes these vectors by dictionary mapping and word embedding to form the inputs of DP-ARNN. After that, it automatically learns syntactic and semantic features. Furthermore, it employs the attention mechanism to generate significant features for accurate defect prediction. To validate our method, we choose seven open-source Java projects from Apache and use F1-measure and area under the curve (AUC) as evaluation criteria. The experimental results show that, on average, DP-ARNN improves the F1-measure by 14% and AUC by 7% compared with the state-of-the-art methods.
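
The dictionary-mapping and attention steps named in the abstract can be illustrated with a minimal sketch. This is not the DP-ARNN implementation: the token names, the fixed length, the `query` vector, and the hand-written hidden states are all assumptions made for illustration, and the recurrent layer that would produce the hidden states is omitted.

```python
import math

def build_vocab(sequences):
    """Map each distinct token to a positive integer; 0 is reserved for padding."""
    vocab = {}
    for seq in sequences:
        for tok in seq:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 1
    return vocab

def encode(seq, vocab, length):
    """Dictionary mapping, then truncation or right-padding to a fixed length.
    Simplification: unknown tokens share id 0 with padding."""
    ids = [vocab.get(tok, 0) for tok in seq][:length]
    return ids + [0] * (length - len(ids))

def attention_pool(hidden_states, query):
    """Score each hidden state by dot product with `query`, softmax the scores,
    and return the weights together with the weighted sum of the states."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled

# Hypothetical AST node tokens for two source files (assumed, not from the paper).
files = [
    ["MethodDeclaration", "IfStatement", "MethodInvocation"],
    ["MethodDeclaration", "ForStatement"],
]
vocab = build_vocab(files)
encoded = [encode(f, vocab, 4) for f in files]

# Placeholder hidden states standing in for the output of a recurrent layer.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, pooled = attention_pool(hidden, [1.0, 1.0])
```

In a full model the integer ids would index a learned embedding table, and the query vector would be a trained parameter rather than a constant.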

Software Defect Prediction Using SMOTE and Artificial Neural Network

10.1109/icodse53690.2021.9648476 ◽

2021 ◽

Author(s):

Wisnu Arya Dipa ◽

Wikan Danar Sunindyo

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Artificial Neural

Software Defect Prediction Using a Hybrid Model Based on Semantic Features Learned from the Source Code

Knowledge Science, Engineering and Management - Lecture Notes in Computer Science ◽

10.1007/978-3-030-29551-6_23 ◽

2019 ◽

pp. 262-274

Author(s):

Diana-Lucia Miholca ◽

Gabriela Czibula

Keyword(s):

Hybrid Model ◽

Source Code ◽

Defect Prediction ◽

Semantic Features ◽

Software Defect Prediction ◽

Model Based ◽

Software Defect

Software Defect Prediction Tool based on Neural Network

International Journal of Computer Applications ◽

10.5120/12200-8368 ◽

2013 ◽

Vol 70 (22) ◽

pp. 22-28 ◽

Cited By ~ 7

Author(s):

Malkit Singh ◽

Dalwinder Singh Salaria

Keyword(s):

Neural Network ◽

Defect Prediction ◽

Prediction Tool ◽

Software Defect Prediction ◽

Software Defect

Learning Semantic Features for Software Defect Prediction by Code Comments Embedding

2018 IEEE International Conference on Data Mining (ICDM) ◽

10.1109/icdm.2018.00133 ◽

2018 ◽

Cited By ~ 1

Author(s):

Xuan Huo ◽

Yang Yang ◽

Ming Li ◽

De-Chuan Zhan

Keyword(s):

Defect Prediction ◽

Semantic Features ◽

Software Defect Prediction ◽

Software Defect

Using the Support Vector Machine as a Classification Method for Software Defect Prediction with Static Code Metrics

Engineering Applications of Neural Networks - Communications in Computer and Information Science ◽

10.1007/978-3-642-03969-0_21 ◽

2009 ◽

pp. 223-234 ◽

Cited By ~ 33

Author(s):

David Gray ◽

David Bowes ◽

Neil Davey ◽

Yi Sun ◽

Bruce Christianson

Keyword(s):

Support Vector Machine ◽

Classification Method ◽

Defect Prediction ◽

Support Vector ◽

Software Defect Prediction ◽

Software Defect ◽

Code Metrics

Mengatasi Imbalanced Class Pada Software Defect Prediction Menggunakan Two-Step Clustering-Based Undersampling dan Bagging Tehcnique

Jurnal Informatika ◽

10.31311/ji.v6i1.5448 ◽

2019 ◽

Vol 6 (1) ◽

pp. 107-113

Author(s):

Muhammad Faittullah Akbar ◽

Ilham Kurniawan ◽

Ahmad Fauzi

Keyword(s):

Machine Learning ◽

Data Mining ◽

Area Under The Curve ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Random Undersampling ◽

Imbalanced Class ◽

Program Data

Class imbalance is often a problem in real-world datasets, where one class (the minority class) contains a small number of data points and the other (the majority class) contains a large number. It is very difficult to develop an effective model with data mining and machine learning algorithms without preprocessing the data to balance an imbalanced dataset. Random undersampling and oversampling have been used in many studies to ensure that the different classes contain the same number of data points. In this study, we propose a combination of two-step clustering-based random undersampling and the bagging technique to improve the accuracy of software defect prediction. The proposed method is evaluated on five datasets from the NASA metrics data program repository, with area under the curve (AUC) as the main evaluation measure. The results show that the proposed method performs very well on all datasets (AUC > 0.9). In terms of SN, the second experiment outperforms the first on almost all datasets (3 of 5 datasets). Meanwhile, in terms of SP, the first experiment does not outperform the second on all datasets. Overall, the second experiment outperforms the first, because the main evaluation measure in imbalanced classification problems such as SDP is AUC. It can therefore be concluded that the proposed method yields optimal performance on both small-scale and large-scale datasets.
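
The balancing idea mentioned in this abstract can be sketched with plain random undersampling. This is a simplified stand-in under stated assumptions, not the paper's two-step clustering-based variant and without the bagging step; the function name `random_undersample`, the seed, and the toy data are hypothetical.

```python
import random

def random_undersample(X, y, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(indices) for indices in by_class.values())
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, n_min))
    keep.sort()  # preserve the original sample order
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy data: 2 defective modules (label 1) against 8 clean ones (label 0).
X = [[i] for i in range(10)]
y = [1, 1] + [0] * 8
X_bal, y_bal = random_undersample(X, y)
```

A bagging ensemble would typically repeat this draw with different seeds and train one classifier per balanced sample.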

COMPARATIVE ANALYSIS OF CLASSIFICATION METHODS FOR PREDICTION SOFTWARE FAULT PRONENESS USING PROCESS METRICS

10.36227/techrxiv.16586354.v1 ◽

2021 ◽

Author(s):

Anjali Bansal

Keyword(s):

Prediction Performance ◽

Defect Prediction ◽

Software Defect Prediction ◽

Classification Methods ◽

Software Defect ◽

Ensemble Techniques ◽

Code Metrics ◽

Independent Variable ◽

Software Fault ◽

Process Metrics

A great deal of research has been done in the field of software defect prediction, but most of it uses static code metrics as the independent variables. The main objective of this paper is to analyze the effect of process metrics on prediction performance using various classification and ensemble techniques. Both the AUC and MCC measures are used to analyze the results. We conclude that process metrics are as effective as static code metrics.
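
For reference, the two measures named in this abstract can be computed as in the sketch below. It assumes binary labels with 1 for fault-prone and 0 for clean; the sample labels and scores are made up for illustration and are not results from the paper.

```python
import math

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as half. Requires at least one sample of each class."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels, hard predictions, and scores for four modules.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
mcc_value = mcc(y_true, y_pred)
auc_value = auc(y_true, scores)
```

MCC is computed from hard predictions while AUC is computed from ranking scores, which is one reason the two are often reported together.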
