Software maintainability prediction by data mining of software code metrics

Author(s):  
Arvinder Kaur ◽  
Kamaldeep Kaur ◽  
Kaushal Pathak
2021 ◽  
Author(s):  
Sedef Akinli Koçak

In recent years, the substantial energy consumption of ICT products has raised environmental concerns. Growing demand for mobile devices and personal computers, together with the widespread adoption of cloud computing and data centers, is the main driver of the energy consumption of ICT systems. Finding ways to improve the energy efficiency of these systems has become an important objective for both industry and academia. To address the increase in ICT energy consumption, hardware technology, such as the production of energy-efficient processors, has been substantially improved. However, demand for energy is growing faster than improvements are being made in these energy-aware technologies. Therefore, in addition to hardware, software technologies must also be a focus of research attention. Although software does not consume energy by itself, its characteristics determine which hardware resources are made available and how much electrical energy is used. The current literature on the energy efficiency of software highlights, in particular, a lack of measurements and models. In this dissertation, first, the relationship between software code properties and energy consumption is explored. Second, regression-based energy consumption prediction models that use static code metrics are investigated. Finally, the models' performance is assessed using within-product and cross-product energy consumption prediction approaches. For this purpose, a quantitative retrospective cohort study was employed. The research methods were observational data collection, mining of software repositories, and regression analysis. The results show inconsistent relationships between energy consumption and code size and complexity attributes across different types of software products. These results suggest that static code attributes may give some insight but would not be the sole predictors of the energy consumption of software products.
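
The difference between within-product and cross-product prediction can be illustrated with a minimal sketch. It assumes a single static metric as predictor and ordinary least squares as the regression method. The function names and every number below are made up for illustration and are not taken from the dissertation.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def mean_abs_error(model, xs, ys):
    """Mean absolute difference between predicted and observed energy."""
    intercept, slope = model
    return mean(abs(intercept + slope * x - y) for x, y in zip(xs, ys))

# Hypothetical toy data: (cyclomatic complexity, energy in joules) per release.
product_a = [(10, 21.0), (20, 41.0), (30, 61.0), (40, 81.0)]
product_b = [(10, 50.0), (20, 45.0), (30, 40.0), (40, 35.0)]

def columns(rows):
    """Split (metric, energy) pairs into two parallel lists."""
    return [r[0] for r in rows], [r[1] for r in rows]

# Within-product: train on earlier releases of A, test on A's latest release.
train_x, train_y = columns(product_a[:3])
test_x, test_y = columns(product_a[3:])
within_error = mean_abs_error(fit_line(train_x, train_y), test_x, test_y)

# Cross-product: train on all of A, test on a different product B.
a_x, a_y = columns(product_a)
b_x, b_y = columns(product_b)
cross_error = mean_abs_error(fit_line(a_x, a_y), b_x, b_y)

print(within_error, cross_error)
```

In this toy setup the metric relates to energy in opposite directions in the two products, so a model fitted on one product transfers poorly to the other. That is one way the inconsistent relationships described above could show up in practice.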


2013 ◽  
Vol 7 ◽  
pp. 336-343 ◽  
Author(s):  
Alberto Núñez-Varela ◽  
Hector G. Perez-Gonzalez ◽  
Juan Carlos Cuevas-Tello ◽  
Carlos Soubervielle-Montalvo

2020 ◽  
Vol 8 (6) ◽  
pp. 2403-2408

This paper develops a new software metric methodology for evaluating the analyzability indicator of software products. The proposed methodology provides an objective, quantitative assessment in accordance with the requirements, limitations, purpose and specific features of software products. Forty-one (41) Java programs were analyzed to extract and evaluate the software metrics described in the Halstead metrics. A mathematical classification model was developed to replace the expert output in the evaluation of the software metric indicators. The output of the algorithm was used to identify the metrics with the greatest influence on analyzability. The results indicated that, of the 13 measured metrics, seven (7) software code metrics account for 98% of “analyzability”, with the remaining six (6) metrics making up only ~5%. ROC curves were also computed to compare the performance of the proposed methodology with the experts’ metric evaluation. The ROC-curve indicator for the proposed methodology showed a score of ROC = 7.4, compared with 7.3 for the experts’ evaluation. The two methods correlated well, and the resulting performance showed that the proposed method outperforms the experts’ evaluation.
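
For readers unfamiliar with the Halstead metrics mentioned above, the following sketch computes the standard Halstead vocabulary, length, volume, difficulty and effort from token lists. The hand-made tokenization of a single statement is an assumption for illustration. The paper's own extraction tooling for Java programs is not described in the abstract.

```python
import math

def halstead(operators, operands):
    """Standard Halstead measures from lists of operator and operand tokens."""
    n1, n2 = len(set(operators)), len(set(operands))  # distinct counts
    N1, N2 = len(operators), len(operands)            # total counts
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }

# Hypothetical tokenization of the statement `x = a + b * a`
# (whether punctuation counts as an operator varies by convention).
metrics = halstead(operators=["=", "+", "*"], operands=["x", "a", "b", "a"])
print(metrics)
```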


2021 ◽  
Vol 4 (4) ◽  
pp. 354-365
Author(s):  
Vitaliy S. Yakovyna ◽  
Ivan I. Symets

This article focuses on improving static models of software reliability by using machine learning methods to select the software code metrics that most strongly affect reliability. The study used a merged dataset from the PROMISE Software Engineering repository, which contained testing data for the software modules of five programs and twenty-one code metrics. For the prepared sample, the most important features that affect the quality of software code were selected using the following feature selection methods: Boruta, Stepwise selection, Exhaustive Feature Selection, Random Forest Importance, LightGBM Importance, Genetic Algorithms, Principal Component Analysis, and the Xverse Python package. Based on voting across the results of these feature selection methods, a static (deterministic) model of software reliability was built, which establishes the relationship between the probability of a defect in a software module and the metrics of its code. The model includes such code metrics as the branch count of a program, McCabe’s lines of code and cyclomatic complexity, and Halstead’s total number of operators and operands, intelligence, volume, and effort value. The effectiveness of the different feature selection methods was compared, in particular by studying the effect of the feature selection method on classification accuracy with the following classifiers: Random Forest, Support Vector Machine, k-Nearest Neighbors, Decision Tree, AdaBoost, and Gradient Boosting. Using any of the feature selection methods increased classification accuracy by at least ten percent compared to the original dataset, which confirms the importance of this procedure for predicting software defects from metric datasets that contain a significant number of highly correlated software code metrics. The best forecast accuracy for most classifiers was reached using the set of features obtained from the proposed static model of software reliability. In addition, individual methods such as Autoencoder, Exhaustive Feature Selection and Principal Component Analysis can also be used with an insignificant loss of classification and prediction accuracy.
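
The idea of voting over the outputs of several feature selection methods can be sketched as follows. The abstract does not state how votes are tallied, so the minimum-vote threshold, the function name `vote_features`, and the per-method selections below are all assumptions for illustration rather than the authors' procedure or results.

```python
from collections import Counter

def vote_features(selections, min_votes):
    """Keep features chosen by at least `min_votes` selection methods."""
    votes = Counter(
        feature
        for selected in selections.values()
        for feature in set(selected)  # each method votes once per feature
    )
    return sorted(f for f, count in votes.items() if count >= min_votes)

# Hypothetical outputs of three selection methods (not results from the article).
selections = {
    "boruta": ["branch_count", "loc", "cyclomatic_complexity"],
    "random_forest_importance": ["branch_count", "loc", "halstead_volume"],
    "stepwise": ["loc", "halstead_effort", "cyclomatic_complexity"],
}

selected = vote_features(selections, min_votes=2)
print(selected)
```

A stricter threshold keeps fewer, more widely agreed-upon metrics, while `min_votes=1` reduces to the union of all methods.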


Software Engineering has its origins in tackling the development and maintenance of quality software. Software quality has been defined in multiple ways, but the broadest definition is that quality is the extent to which the customer is satisfied with the developed software. Data mining can be applied to many domains and has been used successfully to uncover solutions to complex, long-standing problems in them. The proposed research is a step in this direction. It attempts to apply existing data mining algorithms to data accumulated by software organizations in order to extract useful patterns that can go a long way toward addressing the issue of software quality. This work proposes the Spacious Virtue Suggestion (SVS) Model for analyzing code-based quality within a software quality model. The first layer of the model is the Extraction Layer, which extracts the various attributes of the software code. After extraction, the metric attributes are assembled into a vector, which serves as the feature vector for the second layer of the SVS Model. The second layer is the Selection Layer, which employs a feature selection strategy to obtain the significant metric attributes for software quality prediction by reducing the overlapping metric attributes in the vector... The third layer of the SVS Model is the Prediction Layer, which predicts the good class from the training set; the results show high accuracy for the proposed system.
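
The three-layer structure described above can be sketched as a small pipeline. The abstract does not specify the algorithms used in each layer, so this sketch makes loud assumptions: overlapping metrics are detected by Pearson correlation, prediction uses a nearest-centroid rule on unscaled features, and all function names, labels and data are hypothetical.

```python
from math import sqrt
from statistics import mean

def extract(modules, metric_names):
    """Extraction layer: turn each module's metric dict into a feature vector."""
    return [[module[name] for name in metric_names] for module in modules]

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either column is constant."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - x_bar) ** 2 for x in xs))
    sy = sqrt(sum((y - y_bar) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select(vectors, threshold=0.95):
    """Selection layer (assumed rule): drop a metric that overlaps a kept one."""
    cols = list(zip(*vectors))
    kept = []
    for j in range(len(cols)):
        if all(abs(pearson(cols[j], cols[k])) < threshold for k in kept):
            kept.append(j)
    return kept

def predict(train_vectors, labels, vector, kept):
    """Prediction layer (assumed rule): nearest class centroid on kept metrics."""
    centroids = {}
    for label in set(labels):
        rows = [v for v, l in zip(train_vectors, labels) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in kept]

    def squared_distance(label):
        return sum((vector[j] - c) ** 2 for j, c in zip(kept, centroids[label]))

    return min(centroids, key=squared_distance)

# Hypothetical training modules; "statements" duplicates the information in "loc".
names = ["loc", "statements", "complexity"]
modules = [
    {"loc": 100, "statements": 200, "complexity": 3},
    {"loc": 200, "statements": 400, "complexity": 9},
    {"loc": 300, "statements": 600, "complexity": 2},
    {"loc": 400, "statements": 800, "complexity": 8},
]
labels = ["good", "poor", "good", "poor"]

vectors = extract(modules, names)
kept = select(vectors)
prediction = predict(vectors, labels, [250, 500, 2], kept)
print([names[j] for j in kept], prediction)
```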



