Empirical Evaluation of Mimic Software Project Data Sets for Software Effort Estimation

2020 ◽  
Vol E103.D (10) ◽  
pp. 2094-2103
Author(s):  
Maohua GAN ◽  
Zeynep YÜCEL ◽  
Akito MONDEN ◽  
Kentaro SASAKI
Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1195
Author(s):  
Priya Varshini A G ◽  
Anitha Kumari K ◽  
Vijayakumar Varadarajan

Software project estimation is a challenging and important activity in software development. It includes software time estimation, resource estimation, cost estimation, and effort estimation. Software effort estimation focuses on predicting the amount of work (in person-hours or person-months) required to develop or maintain a software application. Effort is difficult to forecast during the initial stages of software development. Various machine learning and deep learning models have been developed to predict effort. In this paper, both single-model approaches and ensemble approaches were considered for estimation. Ensemble techniques combine several single models; those considered were averaging, weighted averaging, bagging, boosting, and stacking. The stacking models evaluated were stacking using a generalized linear model, a decision tree, a support vector machine, and a random forest. The datasets considered were Albrecht, China, Desharnais, Kemerer, Kitchenham, Maxwell, and Cocomo81. The evaluation measures were mean absolute error, root mean squared error, and R-squared. The results showed that the proposed stacking using random forest gives the best results compared with single-model approaches using machine or deep learning algorithms and with the other ensemble techniques.
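The three evaluation measures and the weighted-averaging ensemble named in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the effort values, the two "models", and the weights are hypothetical, and the paper's own implementation is not shown in the abstract.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def weighted_average(predictions, weights):
    """Combine several per-model prediction lists into one list.

    Equal weights give plain averaging; unequal weights give weighted averaging.
    """
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, column)) / total
            for column in zip(*predictions)]

# Hypothetical effort values in person-hours, for illustration only.
actual = [100.0, 200.0, 300.0]
model_a = [110.0, 190.0, 320.0]
model_b = [90.0, 220.0, 280.0]

averaged = weighted_average([model_a, model_b], [1.0, 1.0])
weighted = weighted_average([model_a, model_b], [3.0, 1.0])

print("MAE model_a:", mae(actual, model_a))
print("MAE averaged:", mae(actual, averaged))
print("RMSE model_a:", rmse(actual, model_a))
print("R^2 model_a:", r_squared(actual, model_a))
```

Bagging, boosting, and stacking need trained base learners and are left out of this sketch.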


2015 ◽  
Vol 2015 ◽  
pp. 1-5
Author(s):  
Senthil Kumar Murugesan ◽  
Chidhambara Rajan Balasubramanian

Software companies are now keen to provide secure software and to ensure the accuracy and reliability of their products, especially with respect to software effort estimation. Therefore, there is a need for a hybrid tool that provides all the necessary features. This paper proposes a hybrid estimator algorithm and model that incorporates quality metrics, a reliability factor, and a security factor with a fuzzy-based function point analysis. First, the method uses a fuzzy-based estimate, built on a triangular fuzzy set, to control the uncertainty in the software size at the early development stage. Second, the function point analysis is extended with the security and reliability factors in the calculation. Finally, the performance metrics are added to the effort estimation for accuracy. Experiments are run on the hybrid tool with different project data sets, and the results are compared with existing models. They show that the proposed method not only improves accuracy but also increases the reliability and security of the product.
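To make the triangular fuzzy set concrete, the sketch below shows the textbook triangular membership function and centroid defuzzification. The abstract does not give the paper's exact formulation, so the function names, the choice of the centroid, and the size values are assumptions for illustration.

```python
def triangular_membership(x, low, mode, high):
    """Degree (0.0 to 1.0) to which x belongs to the triangular set.

    Assumes low < mode < high.
    """
    if x <= low or x >= high:
        return 0.0
    if x <= mode:
        return (x - low) / (mode - low)   # rising edge
    return (high - x) / (high - mode)     # falling edge

def centroid(low, mode, high):
    """Crisp value of a triangular fuzzy number by centroid defuzzification."""
    return (low + mode + high) / 3.0

# Hypothetical size estimate in function points: at least 80, most likely 100, at most 150.
size = (80.0, 100.0, 150.0)

print(triangular_membership(90.0, *size))   # halfway up the rising edge
print(triangular_membership(125.0, *size))  # halfway down the falling edge
print(centroid(*size))                      # single size value passed on to estimation
```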


Author(s):  
Fatih Yücalar ◽  
Deniz Kilinc ◽  
Emin Borandag ◽  
Akin Ozcift

Estimating the development effort of a software project in the early stages of the software life cycle is a significant task. Accurate estimates help project managers avoid budget and time overruns. This paper proposes a new effort estimation method based on multiple linear regression analysis, which brings a different perspective to software effort estimation and improves its success. The proposed method is compared with the standard Use Case Point (UCP) method, which is well known in this area, and with the simple linear regression based effort estimation method developed by Nassif et al. To evaluate and compare the proposed method, data from 10 software projects developed by four well-established software companies in Turkey were collected and datasets were created. When the effort estimates obtained from the datasets were compared with the actual effort spent to complete the projects, the proposed method showed higher estimation accuracy than the other methods.
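For readers unfamiliar with the UCP baseline mentioned here, a minimal sketch of the standard Use Case Point calculation follows. It uses Karner's commonly cited constants (including 20 person-hours per UCP); the input values are hypothetical, and this is the baseline method only, not the regression method the paper proposes.

```python
def use_case_points(uucw, uaw, t_factor, e_factor):
    """Use Case Points from weighted use cases, actors, and adjustment factors.

    uucw: unadjusted use case weight
    uaw: unadjusted actor weight
    t_factor: weighted sum of technical factor ratings
    e_factor: weighted sum of environmental factor ratings
    """
    tcf = 0.6 + 0.01 * t_factor   # technical complexity factor
    ecf = 1.4 - 0.03 * e_factor   # environmental complexity factor
    return (uucw + uaw) * tcf * ecf

def effort_hours(ucp, hours_per_ucp=20.0):
    """Convert UCP to effort; 20 person-hours per UCP is Karner's default."""
    return ucp * hours_per_ucp

# Hypothetical project inputs, for illustration only.
ucp = use_case_points(uucw=100.0, uaw=10.0, t_factor=40.0, e_factor=20.0)
print("UCP:", ucp)
print("Effort (person-hours):", effort_hours(ucp))
```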


2018 ◽  
Vol 232 ◽  
pp. 03017
Author(s):  
Jie Zhang ◽  
Gang Wang ◽  
Haobo Jiang ◽  
Fangzheng Zhao ◽  
Guilin Tian

Software defect prediction has been an important part of software engineering research since the 1970s. The technique analyzes the metrics and defect information of historical software modules in order to predict defects in new software modules. Currently, most software defect prediction models are built on data from a single software project: the training data sets used to construct the model and the test data sets used to validate it come from the same project. In practice, however, traditional prediction methods perform poorly for new projects or for projects with little historical data. When historical data is insufficient, the defect prediction model cannot be trained adequately, and it is difficult to achieve high prediction accuracy. In cross-project prediction, the main problem is the difference in data distributions between projects. To address these problems, this paper presents a software defect prediction model that combines transfer (migration) learning with a traditional software defect prediction model. The model uses existing project data sets to predict software defects across projects. The main work of this article includes: 1) Data preprocessing. This step includes feature correlation analysis, noise reduction, and so on, which limits the effect of over-fitting and noisy data on the prediction results. 2) Transfer learning. This step analyzes two different but related project data sets and reduces the impact of data distribution differences. 3) Artificial neural networks. To address the class imbalance in the data sets, an artificial neural network with dynamic selection of training samples is used to reduce the influence of the imbalance between positive and negative samples on the prediction results. The model is evaluated on the ReLink and AEEEM data sets using the F-measure, the ROC curve, and AUC. Experiments show that the model has high predictive performance.
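The F-measure and AUC used in this evaluation can be computed as sketched below. The labels, scores, and the 0.5 threshold are hypothetical, and AUC is computed here by direct pairwise comparison of defective and clean modules, which is equivalent to the area under the ROC curve but is not necessarily how the paper calculated it.

```python
def f_measure(actual, predicted):
    """Harmonic mean of precision and recall for the positive (defective) class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc(actual, scores):
    """Probability that a random defective module scores above a random clean one.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = defective) and predicted defect scores.
actual = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.3]
predicted = [1 if s >= 0.5 else 0 for s in scores]

print("F-measure:", f_measure(actual, predicted))
print("AUC:", auc(actual, scores))
```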


Author(s):  
SUMANTH YENDURI ◽  
S. S. IYENGAR

In this study, we compare the performance of four imputation strategies, ranging from the commonly used Listwise Deletion to model-based approaches such as Maximum Likelihood, for improving the completeness of incomplete software project data sets. We evaluate the impact of each method by applying it to six real-time software project data sets, which are classified into categories based on their inherent properties. The reliability of the data sets constructed with these techniques is further tested by building prediction models using stepwise regression. The experimental results are reported and the findings are discussed.
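As a small illustration of what an imputation strategy does to a data set, the sketch below contrasts Listwise Deletion with mean imputation. Mean imputation is used here only as a simple stand-in: the abstract names Listwise Deletion and Maximum Likelihood but does not list the other two strategies, and the project rows are hypothetical.

```python
def listwise_deletion(rows):
    """Keep only the rows that have no missing (None) values."""
    return [row for row in rows if all(v is not None for v in row)]

def mean_imputation(rows):
    """Replace each missing value with the mean of its column's observed values.

    Assumes every column has at least one observed value.
    """
    means = []
    for column in zip(*rows):
        observed = [v for v in column if v is not None]
        means.append(sum(observed) / len(observed))
    return [[means[i] if v is None else v for i, v in enumerate(row)]
            for row in rows]

# Hypothetical project rows: (size, effort); None marks a missing value.
projects = [
    [10.0, 200.0],
    [None, 300.0],
    [30.0, None],
    [20.0, 100.0],
]

complete_only = listwise_deletion(projects)
filled = mean_imputation(projects)

print(len(complete_only), "of", len(projects), "rows survive listwise deletion")
print(filled)
```

The trade-off is visible even at this scale: deletion halves the sample, while imputation keeps every row at the cost of inserting estimated values.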


Author(s):  
FATIMA AZZAHRA AMAZAL ◽  
ALI IDRI ◽  
ALAIN ABRAN

Software effort estimation is one of the most important tasks in software project management. Of several techniques suggested for estimating software development effort, the analogy-based reasoning, or Case-Based Reasoning (CBR), approaches stand out as promising techniques. In this paper, the benefits of using linguistic rather than numerical values in the analogy process for software effort estimation are investigated. The performance, in terms of accuracy and tolerance of imprecision, of two analogy-based software effort estimation models (Classical Analogy and Fuzzy Analogy, which use numerical and linguistic values respectively to describe software projects) is compared. Three research questions related to the performance of these two models are discussed and answered. This study uses the International Software Benchmarking Standards Group (ISBSG) dataset and confirms the usefulness of using linguistic instead of numerical values in analogy-based software effort estimation models.
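The core of analogy-based estimation with numerical values can be sketched as a nearest-neighbour lookup: find the most similar past projects and average their effort. This is a generic illustration, not the paper's Classical Analogy or Fuzzy Analogy model; the feature tuples, the Euclidean distance, and the value of k are assumptions, and real data would normally be normalized first.

```python
import math

def estimate_by_analogy(history, target, k=2):
    """Estimate effort as the mean effort of the k most similar past projects.

    history: list of (features, effort) pairs, where features is a numeric tuple
    target: numeric feature tuple of the project to estimate
    """
    ranked = sorted(history, key=lambda case: math.dist(case[0], target))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)

# Hypothetical past projects: ((size, team size), effort in person-hours).
history = [
    ((10.0, 2.0), 100.0),
    ((12.0, 3.0), 140.0),
    ((50.0, 8.0), 900.0),
]

estimate = estimate_by_analogy(history, target=(11.0, 2.0), k=2)
print("Estimated effort:", estimate)
```

A linguistic (fuzzy) variant would replace the numeric distance with a similarity defined over fuzzy sets such as "small" or "large"; that step is not shown here.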

