Software Defect Fault Intelligent Location and Identification Method Based on Data Mining

Abstract: As computer technology advances, user requirements for software functionality keep rising. As software grows more complex, developers' technical limitations and poorly coordinated teamwork, among other factors, mean that errors introduced during development inevitably lead to software defects. The purpose of this paper is to study intelligent location and identification methods for software defects based on data mining. The paper first surveys domestic and foreign techniques for intelligent software defect and fault location and analyzes the shortcomings of traditional software defect detection and fault detection. It then introduces data mining technology in detail and finally conducts in-depth research on software defect prediction technology. According to the authors, this research helps reduce failures of software equipment and extend its service life. In the experiments, the proposed software defect location approach is evaluated by comparing two methods, with the first error set used as the unit for measuring the software error location cost of subsequent error sets. The first error set, 1F, contains 19 programs with manually injected errors, and the average positioning cost obtained is 3.75%.

Software Defect Distribution Prediction for BOSS System

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.701-702.67 ◽ 2014 ◽ Vol 701-702 ◽ pp. 67-70

Author(s): Wan Jiang Han ◽ He Yang Jiang ◽ Yi Sun ◽ Tian Bo Lu

Keyword(s): Software Development ◽ Model Experiment ◽ Development Process ◽ Distribution Model ◽ Defect Prediction ◽ Software Development Process ◽ Software Defects ◽ Defect Distribution ◽ Software Defect ◽ Important Activity

Effective detection of software defects is an important activity in the software development process. In this paper, we propose an approach for predicting residual defects in a BOSS project by applying a defect distribution model. Experimental results show that this approach can effectively improve the accuracy of defect prediction.

DATA MINING FOR THE MANAGEMENT OF SOFTWARE DEVELOPMENT PROCESS

International Journal of Software Engineering and Knowledge Engineering ◽ 10.1142/s0218194004001841 ◽ 2004 ◽ Vol 14 (06) ◽ pp. 665-695 ◽ Cited By ~ 6

Author(s): J. L. ÁLVAREZ-MACÍAS ◽ J. MATA-VÁZQUEZ ◽ J. C. RIQUELME-SANTOS

Keyword(s): Data Mining ◽ Software Development ◽ Unsupervised Learning ◽ Supervised Learning ◽ Development Process ◽ A Priori ◽ Post Mortem ◽ Software Development Process ◽ A Priori Analysis ◽ Mining Tools

In this paper we present a new method for applying data mining tools to the management phase of the software development process. Specifically, we describe two tools, the first based on supervised learning and the second on unsupervised learning. The goal of the method is to induce a set of management rules that ease the development process for managers. Depending on how and to what the method is applied, it permits an a priori analysis, monitoring of the project, or a post-mortem analysis.

Application of Data Mining Techniques for Improving Continuous Integration

International Journal of Engineering and Management Research ◽ 10.31033/ijemr.8.5.4 ◽ 2018 ◽ Vol 8 (5)

Author(s): Meenakshi Kathayat

Keyword(s): Data Mining ◽ Development Process ◽ Software Development Process ◽ Data Mining Techniques ◽ Useful Knowledge ◽ Continuous Integration ◽ Integration Data ◽ Integration Problems ◽ Work Done ◽ Integration Errors

Continuous integration is a software development process in which members of a team frequently integrate their work. Generally each person integrates at least daily, leading to multiple integrations per day. Each developer's integration is verified by an automated build (including tests) to detect integration errors as quickly as possible. Many teams find that this approach reduces integration problems and allows a team to develop cohesive software rapidly. Continuous integration doesn't remove bugs, but it does make them dramatically easier to find and remove. This paper provides an overview of various issues regarding continuous integration and of how various data mining techniques can be applied to continuous integration data to extract useful knowledge and solve continuous integration problems.

Software Defect Prediction in Class Level Metric Aggregation Using Data Mining Techniques

Research Journal of Applied Sciences Engineering and Technology ◽ 10.19026/rjaset.13.3014 ◽ 2016 ◽ Vol 13 (7) ◽ pp. 544-568

Author(s): Reddi Kiran Kumar ◽ S.V. Achuta Rao

Keyword(s): Data Mining ◽ Defect Prediction ◽ Software Defect Prediction ◽ Data Mining Techniques ◽ Software Defect ◽ Class Level ◽ Using Data

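Mengatasi Imbalanced Class Pada Software Defect Prediction Menggunakan Two-Step Clustering-Based Undersampling dan Bagging Tehcnique (Handling Imbalanced Classes in Software Defect Prediction Using Two-Step Clustering-Based Undersampling and the Bagging Technique)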

Jurnal Informatika ◽ 10.31311/ji.v6i1.5448 ◽ 2019 ◽ Vol 6 (1) ◽ pp. 107-113

Author(s): Muhammad Faittullah Akbar ◽ Ilham Kurniawan ◽ Ahmad Fauzi

Keyword(s): Machine Learning ◽ Data Mining ◽ Area Under The Curve ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Random Undersampling ◽ Imbalanced Class ◽ Program Data

Class imbalance is a frequent problem in real-world datasets, where one class (the minority class) contains a small number of data points and the other (the majority class) contains a large number. It is very difficult to develop an effective model with data mining and machine learning algorithms without preprocessing the data to balance the imbalanced dataset. Random undersampling and oversampling have been used in many studies to ensure that the different classes contain the same number of data points. In this study, we propose a combination of two-step clustering-based random undersampling and the bagging technique to improve the accuracy of software defect prediction. The proposed method is evaluated on five datasets from the NASA Metrics Data Program repository, with area under the curve (AUC) as the main evaluation measure. The results show that the proposed method performs very well on all datasets (AUC > 0.9). In terms of SN, the second experiment outperforms the first on almost all datasets (3 of 5). In terms of SP, meanwhile, the first experiment does not outperform the second on all datasets. Overall, the second experiment outperforms the first, because the main evaluation measure for imbalanced classification tasks such as SDP is AUC. It can therefore be concluded that the proposed method yields optimal performance on both small-scale and large-scale datasets.
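As a rough illustration of two ideas named in the abstract above, the following sketch shows plain random undersampling and a rank-based AUC computation. It is not the authors' method: the two-step clustering step and the bagging ensemble are not reproduced, and the function names `random_undersample` and `auc` are hypothetical. It assumes binary labels with 1 for defective and 0 for clean modules.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Plain random undersampling only; the clustering-based variant from the
    abstract is NOT implemented here.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority sample.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def auc(scores, labels):
    """AUC as the fraction of (defective, clean) pairs ranked correctly.

    Ties between a defective and a clean score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a small illustration but would normally be replaced by a sort-based computation on larger datasets.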

Investigation of Software Defect Prediction Using Data Mining Framework

Research Journal of Applied Sciences Engineering and Technology ◽ 10.19026/rjaset.11.1676 ◽ 2015 ◽ Vol 11 (1) ◽ pp. 63-69 ◽ Cited By ~ 1

Author(s): M. Anbu ◽ G.S. Anandha Mala

Keyword(s): Data Mining ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Using Data

SOFTWARE WAREHOUSE: ITS DESIGN, MANAGEMENT AND APPLICATION

International Journal of Software Engineering and Knowledge Engineering ◽ 10.1142/s0218194004001683 ◽ 2004 ◽ Vol 14 (04) ◽ pp. 395-406 ◽ Cited By ~ 2

Author(s): HONGHUA DAI ◽ WEI DAI ◽ GANG LI

Keyword(s): Data Mining ◽ Software Engineering ◽ Software Development ◽ Development Process ◽ Innovative Approach ◽ Design Management ◽ Software Development Process ◽ Software Applications ◽ High Efficient ◽ Efficient Software

An effective and efficient mechanism to store, manage, and utilize software resources is essential to the automation of software engineering. The paper presents an innovative approach to managing software resources using a software warehouse, in which software assets are systematically accumulated, deposited, retrieved, packaged, managed, and utilized, driven by data mining and OLAP technologies. The results lead to a streamlined, highly efficient software development process and enhanced productivity in response to the modern challenges of designing and developing software applications.

A Comparison of Software Defect Prediction Metrics Using Data Mining Algorithms

Journal of Innovative Science and Engineering (JISE) ◽ 10.38088/jise.693098 ◽ 2020 ◽ pp. 11-21

Author(s): Zeynep Behrin GÜVEN AYDIN ◽ Rüya ŞAMLI

Keyword(s): Data Mining ◽ Defect Prediction ◽ Software Defect Prediction ◽ Data Mining Algorithms ◽ Software Defect ◽ Using Data ◽ Mining Algorithms

An Empirical Study of the Software Development Process, Including Its Requirements Engineering, at Very Large Organization: How to Use Data Mining in Such a Study

Communications in Computer and Information Science - Requirements Engineering for Internet of Things ◽ 10.1007/978-981-10-7796-8_2 ◽ 2018 ◽ pp. 15-25

Author(s): Colin M. Werner ◽ Daniel M. Berry

Keyword(s): Data Mining ◽ Empirical Study ◽ Software Development ◽ Requirements Engineering ◽ Development Process ◽ Software Development Process ◽ Large Organization

Software Defect Prediction Using AWEIG+ADACOST Bayesian Algorithm for Handling High Dimensional Data and Class Imbalance Problem

International Journal of Information Technology and Business ◽ 10.24246/ijiteb.112018.36-41 ◽ 2018 ◽ Vol 1 (1) ◽ pp. 36-41

Author(s): Joko Suntoro ◽ Febrian Wahyu Christanto ◽ Henny Indriyawati

Keyword(s): Naive Bayes ◽ High Dimensional Data ◽ Naïve Bayes ◽ High Dimensional ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defects ◽ Bayesian Algorithm ◽ Software Defect ◽ Bayes Algorithm

The most important part of software engineering is software defect prediction. Software defect prediction is defined as the process of predicting errors, failures, and faults in software. Researchers use machine learning methods to predict software defects, including estimation, association, classification, clustering, and dataset analysis. The NASA Metrics Data Program (NASA MDP) datasets are among the software metric datasets that researchers use to predict software defects. NASA MDP datasets contain imbalanced classes and high-dimensional data, which lower classification evaluation results. In this research, class imbalance is handled with the AdaCost method and high dimensionality with the Average Weight Information Gain (AWEIG) method, while classification uses the Naïve Bayes algorithm. The proposed method is named AWEIG + AdaCost Bayesian. In the experiment, the AWEIG + AdaCost Bayesian algorithm is compared with the Naïve Bayes algorithm. The results show that AWEIG + AdaCost Bayesian yields a better mean Area Under the Curve (AUC) than Naïve Bayes alone, with mean AUC values of 0.752 and 0.696, respectively.
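The information gain measure that underlies the feature weighting mentioned in the abstract above can be sketched as follows. This shows only standard information gain for a discrete feature; the averaging and weighting steps specific to AWEIG, as well as AdaCost and the Naïve Bayes classifier, are not reproduced, and the function names `entropy` and `information_gain` are illustrative assumptions rather than anything taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    # Group the labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

A feature that separates defective from clean modules perfectly has a gain equal to the label entropy, while a feature independent of the label has a gain of zero. Software metrics are usually continuous, so they would need to be discretized before this function applies.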
