Feature Representation and Feature Matching for Heterogeneous Defect Prediction

Heterogeneous defect prediction (HDP) predicts defects in a target company using heterogeneous metric data from an external company, and it has received substantial research attention. However, existing HDP methods assume that the source data is labeled, and labeling data is expensive. Semi-supervised defect prediction can perform defect prediction with only a few labeled data. In this paper, we investigate a new problem: semi-supervised HDP (SHDP). To solve it, we propose a new approach named cost-sensitive kernel semi-supervised correlation analysis (CKSCA). It introduces a unified metric representation and canonical correlation analysis to make the data distributions of different company projects more similar. CKSCA also designs a cost-sensitive kernel semi-supervised discriminant analysis mechanism to utilize the limited labeled data and the abundant real-life unlabeled data from different companies. In addition, we collect a large number of open-source projects from GitHub to construct a new large-scale unlabeled dataset called the GITHUB dataset. It contains 26,407 modules and is larger than each public project dataset. It has been made public online and can be extended continuously. Experiments on the GITHUB dataset and other public datasets indicate that unlabeled GITHUB data can help a prediction model improve its prediction performance, and that CKSCA is effective and efficient for solving the SHDP problem.
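
The abstract does not spell out how the unified metric representation is built. As a rough illustration only, the sketch below assumes one common reading from the HDP literature: modules from both companies are projected onto the union of their metric names, and metrics a company does not collect are filled with zeros. The function name `unify_metrics`, the metric names, and the values are all hypothetical, and the correlation analysis step that would follow is not shown.

```python
from typing import Dict, List, Sequence


def unify_metrics(
    rows: Sequence[Dict[str, float]],
    all_metrics: Sequence[str],
) -> List[List[float]]:
    """Project each module onto a shared metric list, zero-filling absent metrics.

    Assumption: zero-padding is only one possible way to build a shared
    representation; the abstract does not confirm this detail.
    """
    return [[row.get(name, 0.0) for name in all_metrics] for row in rows]


# Hypothetical metric names and values, for illustration only.
source = [{"loc": 120.0, "cyclomatic": 7.0}, {"loc": 45.0, "cyclomatic": 2.0}]
target = [{"loc": 80.0, "num_methods": 5.0}]

# Shared column order: the union of all metric names, sorted for determinism.
columns = sorted({name for row in source + target for name in row})

source_unified = unify_metrics(source, columns)
target_unified = unify_metrics(target, columns)

print(columns)
print(source_unified)
print(target_unified)
```

After this step both datasets have the same number of columns, which is the precondition for applying a correlation-based projection across the two companies.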

Heterogeneous Defect Prediction Using Ensemble Learning Technique

Advances in Intelligent Systems and Computing - Artificial Intelligence and Evolutionary Computations in Engineering Systems, 2020, pp. 283-293

DOI: 10.1007/978-981-15-0199-9_25

Author(s): Arsalan Ahmed Ansari, Amaan Iqbal, Bibhudatta Sahoo

Keyword(s): Ensemble Learning, Defect Prediction, Learning Technique, Heterogeneous Defect Prediction

Kernel Spectral Embedding Transfer Ensemble for Heterogeneous Defect Prediction

IEEE Transactions on Software Engineering, 2019, pp. 1-1

DOI: 10.1109/tse.2019.2939303

Cited by: ~2

Author(s): Haonan Tong, Bin Liu, Shihai Wang

Keyword(s): Defect Prediction, Spectral Embedding, Heterogeneous Defect Prediction

Heterogeneous Defect Prediction

IEEE Transactions on Software Engineering, 2018, Vol 44 (9), pp. 874-896

DOI: 10.1109/tse.2017.2720603

Cited by: ~48

Author(s): Jaechang Nam, Wei Fu, Sunghun Kim, Tim Menzies, Lin Tan

Keyword(s): Defect Prediction, Heterogeneous Defect Prediction

Analysis of Feature Extraction and Anti-Interference of Face Image under Deep Reconstruction Network Algorithm

Complexity, 2021, Vol 2021, pp. 1-15

DOI: 10.1155/2021/8391973

Author(s): Jin Yang, Yuxuan Zhao, Shihao Yang, Xinxin Kang, Xinyan Cao, ...

Keyword(s): Feature Extraction, Face Recognition, Bayesian Method, Feature Matching, Facial Feature, Face Image, Feature Representation, Image Feature, Training Time, Network Algorithm

In face recognition systems, highly robust facial feature representation and good classification algorithm performance affect the quality of face recognition under unrestricted conditions. To explore the anti-interference performance of a convolutional neural network (CNN) reconstructed with a deep learning (DL) framework in face image feature extraction (FE) and recognition, this paper first combines the inception structure of the GoogLeNet network with the residual structure of the ResNet network to construct a new deep reconstruction network algorithm, with stochastic gradient descent (SGD) as the model optimizer and the triplet loss function as the classifier, and applies it to face recognition on the Labeled Faces in the Wild (LFW) face database. Then, portrait pyramid segmentation and local feature point segmentation are applied to extract the features of face images, and face feature points are matched using Euclidean distance and the joint Bayesian method. Finally, Matlab is used to simulate the proposed algorithm and compare it with other algorithms. The results show that the proposed algorithm achieves its best face recognition performance when the learning rate is 0.0004, the attenuation coefficient is 0.0001, the training method is SGD, and dropout is 0.1 (accuracy: 99.03%, loss: 0.0047, training time: 352 s, and overfitting rate: 1.006), and that it has the largest mean average precision among the compared CNN algorithms. The correct rate of face feature matching of the proposed algorithm is 84.72%, which is higher than those of the LeNet-5, VGG-16, and VGG-19 algorithms (reported as 6.94%, 2.5%, and 1.11%, respectively) but lower than those of the GoogLeNet, AlexNet, and ResNet algorithms. At the same time, the proposed algorithm has a faster matching time (206.44 s) and a higher correct matching rate (88.75%) than the joint Bayesian method, indicating that the proposed deep reconstruction network algorithm can be used in face image recognition, FE, and matching, and that it has strong anti-interference.
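
Two building blocks named in this abstract, the triplet loss and feature matching by Euclidean distance, can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation (which the abstract says was simulated in Matlab). The standard hinge form of the triplet loss on squared distances is assumed, and the margin of 0.2, the function names, and the toy vectors are hypothetical choices.

```python
import math
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 0.2,  # assumed value, not taken from the abstract
) -> float:
    """Hinge form: max(0, d(a, p)^2 - d(a, n)^2 + margin)."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def match_features(
    queries: Sequence[Sequence[float]],
    gallery: Sequence[Sequence[float]],
) -> List[int]:
    """For each query vector, return the index of the nearest gallery vector."""
    return [
        min(
            range(len(gallery)),
            key=lambda j: math.sqrt(squared_distance(q, gallery[j])),
        )
        for q in queries
    ]


# Toy 2-D embeddings, for illustration only.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # negative is far away
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])  # negative is closer
matches = match_features([[1.0, 1.0], [9.0, 8.0]], [[0.0, 0.0], [10.0, 10.0]])

print(easy, hard, matches)
```

The loss is zero once the negative sample is farther from the anchor than the positive by at least the margin, so training effort concentrates on triplets that still violate that ordering.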
