Search of Similar Programs Using Code Metrics and Big Data-Based Assessment of Software Reliability

Author(s):  
Svitlana Yaremchuck ◽  
Vyacheslav Kharchenko ◽  
Anatoliy Gorbenko
2021 ◽  
Vol 4 (4) ◽  
pp. 354-365
Author(s):  
Vitaliy S. Yakovyna ◽  
Ivan I. Symets

This article focuses on improving static models of software reliability by using machine learning methods to select the software code metrics that most strongly affect reliability. The study used a merged dataset from the PROMISE Software Engineering repository, which contained testing data for the software modules of five programs and twenty-one code metrics. For the prepared sample, the most important features affecting the quality of software code were selected using the following feature selection methods: Boruta, Stepwise selection, Exhaustive Feature Selection, Random Forest Importance, LightGBM Importance, Genetic Algorithms, Principal Component Analysis, and the Xverse Python package. Based on voting over the results of these feature selection methods, a static (deterministic) model of software reliability was built, which establishes the relationship between the probability of a defect in a software module and the metrics of its code. This model includes such code metrics as the branch count of a program, McCabe’s lines of code and cyclomatic complexity, and Halstead’s total number of operators and operands, intelligence, volume, and effort. The effectiveness of the different feature selection methods was compared, in particular by studying the effect of the feature selection method on classification accuracy with the following classifiers: Random Forest, Support Vector Machine, k-Nearest Neighbors, Decision Tree, AdaBoost, and Gradient Boosting. Using any of the feature selection methods increased classification accuracy by at least ten percent compared to the original dataset, which confirms the importance of this procedure for predicting software defects from metric datasets that contain a significant number of highly correlated software code metrics.
The best prediction accuracy for most classifiers was reached using the set of features obtained from the proposed static model of software reliability. In addition, individual methods such as Autoencoder, Exhaustive Feature Selection, and Principal Component Analysis can also be used with an insignificant loss of classification and prediction accuracy.
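
The voting step described in this abstract can be illustrated with a minimal sketch. It assumes each feature selection method has already returned its own list of selected metrics. The function name `vote_features`, the `min_votes` threshold, and the example selections are hypothetical and are not taken from the article; the metric labels only loosely follow PROMISE naming.

```python
from collections import Counter


def vote_features(selections, min_votes):
    """Return features chosen by at least `min_votes` selection methods.

    `selections` maps a method name to the features that method selected.
    This is an illustrative majority-style vote, not the authors' exact rule.
    """
    votes = Counter()
    for chosen in selections.values():
        votes.update(set(chosen))  # each method votes at most once per feature
    kept = [feature for feature, count in votes.items() if count >= min_votes]
    # Most-voted features first; ties broken alphabetically for stable output.
    return sorted(kept, key=lambda feature: (-votes[feature], feature))


# Hypothetical selector outputs (assumed for illustration only).
selections = {
    "boruta": ["branchCount", "loc", "v(g)", "e"],
    "random_forest": ["loc", "v(g)", "v", "e"],
    "stepwise": ["loc", "branchCount", "i"],
}

consensus = vote_features(selections, min_votes=2)
print(consensus)
```

Raising `min_votes` yields a smaller, more conservative feature set; lowering it keeps metrics that only a single method considered important.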


Author(s):  
Ranjan Kumar ◽  
Subhash Kumar ◽  
Sanjay K. Tiwari

Author(s):  
Yoshinobu Tamura ◽  
Tomoya Takeuchi ◽  
Shigeru Yamada

At present, cloud computing with big data is regarded as a next-generation software service paradigm. However, only a few effective methods of software reliability analysis that consider big data and cloud computing have been presented. In particular, it is important to consider optimal data partitioning for cloud computing with big data. In this setting, it is useful for software managers to estimate the total software cost in order to allocate the optimal data area to the cloud user. In this paper, we propose a method of component-oriented reliability assessment based on a neural network in order to achieve optimal data partitioning for cloud computing with big data. Moreover, we propose a method of system-wide reliability assessment based on a jump diffusion process model that considers big data on cloud computing. Furthermore, we formulate an optimal maintenance problem based on the jump diffusion model. Considering the contract cost for the maximum number of subscribers as cloud users, we find the optimum maintenance time by minimizing the total software cost.


ASHA Leader ◽  
2013 ◽  
Vol 18 (2) ◽  
pp. 59-59
Keyword(s):  

Find Out About 'Big Data' to Track Outcomes


2014 ◽  
Vol 35 (3) ◽  
pp. 158-165 ◽  
Author(s):  
Christian Montag ◽  
Konrad Błaszkiewicz ◽  
Bernd Lachmann ◽  
Ionut Andone ◽  
Rayna Sariyska ◽  
...  

In the present study we link self-report data on personality to behavior recorded on the mobile phone. This new approach from Psychoinformatics collects data from humans in everyday life. It demonstrates the fruitful collaboration between psychology and computer science, combining Big Data with psychological variables. Given the large number of variables that can be tracked on a smartphone, the present study focuses on the traditional features of mobile phones, namely incoming and outgoing calls and SMS. We observed N = 49 participants with respect to their telephone/SMS usage via our custom-developed mobile phone app for 5 weeks. Extraversion was positively associated with nearly all related telephone call variables. In particular, extraverts directly reach out to their social network via voice calls.

