COMPARATIVE ANALYSIS OF SOFTWARE EFFORT ESTIMATION USING DATA MINING TECHNIQUE AND FEATURE SELECTION

2021 ◽  
Vol 6 (2) ◽  
pp. 167-174
Author(s):  
Abdul Latif ◽  
Lady Agustin Fitriana ◽  
Muhammad Rifqi Firdaus

Software development involves several interrelated factors that influence development effort and productivity. Improving the estimation techniques available to project managers facilitates more effective time and budget control in software development. Software effort estimation (also called software cost/effort estimation) can help a software development company overcome the difficulties of estimating development effort. This study compares the machine learning methods Linear Regression (LR), Multilayer Perceptron (MLP), Radial Basis Function (RBF), and Decision Tree Random Forest (DTRF) for estimating software cost/effort. These approaches are tested on 10 datasets of software development projects, in order to determine which machine learning and non-machine learning methods estimate software effort most accurately. The study also examines whether attribute selection with Particle Swarm Optimization (PSO) improves the accuracy of software effort estimation compared with no attribute selection. The most accurate data mining algorithm was Linear Regression, with an average RMSE of 1603.024 across the 10 datasets tested. Applying PSO feature selection reduced the average RMSE to 1552.999. Compared with the original linear regression model, PSO feature selection therefore reduced the error of software effort estimation by 3.12%.
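The reported improvement can be checked directly from the two average RMSE values quoted in the abstract. The short sketch below is only an arithmetic check; the function name `relative_reduction` is a hypothetical helper, not something taken from the paper.

```python
# Average RMSE values as quoted in the abstract above.
rmse_lr = 1603.024       # Linear Regression without feature selection
rmse_lr_pso = 1552.999   # Linear Regression with PSO feature selection


def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


reduction_pct = relative_reduction(rmse_lr, rmse_lr_pso)
print(f"RMSE reduction: {reduction_pct:.2f}%")  # prints "RMSE reduction: 3.12%"
```

Because RMSE is an error measure, the 3.12% figure is a reduction in error, which is what the abstract reports as an increase in accuracy.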


Effort estimation is a crucial step that leads to duration estimation and cost estimation in software development. Estimates made in the initial stage of a project are based on requirements, and they may lead to the success or failure of the project: accurate estimates lead to success and inaccurate estimates lead to failure. No single method can produce accurate estimates in every case. In this work, we propose the machine learning techniques linear regression and K-nearest neighbors to predict software effort using the COCOMO81, COCOMONasa, and COCOMONasa2 datasets. The results obtained from these two methods have been compared. In each dataset, 80% of the data is used for training and the remainder is used as the test set. The correlation coefficient, mean squared error (MSE), and mean magnitude of relative error (MMRE) are used as performance metrics. The experimental results show that these models forecast software effort accurately.
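The evaluation setup described in this abstract (an 80/20 train/test split scored with MSE and MMRE) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the helper names and the effort values are hypothetical, and MMRE is taken in its usual form, the mean of |actual - predicted| / actual.

```python
import random


def train_test_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of `rows` and split it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mmre(actual, predicted):
    """Mean magnitude of relative error; assumes every actual value is non-zero."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical project identifiers, used only to show the 80/20 split.
train, test = train_test_split(list(range(10)))

# Hypothetical actual and predicted effort values (for example, person-months).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(len(train), len(test))
print(mse(actual, predicted), mmre(actual, predicted))
```

MMRE normalizes each error by the actual effort, so a 10-unit miss on a 100-unit project counts the same as a 20-unit miss on a 200-unit project, whereas MSE weights the larger absolute miss more heavily.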


2011 ◽  
Vol 7 (3) ◽  
pp. 41-53 ◽  
Author(s):  
Jeremiah D. Deng ◽  
Martin Purvis ◽  
Maryam Purvis

Software development effort estimation is important for quality management in the software development industry, yet its automation still remains a challenging issue. Applying machine learning algorithms alone often cannot achieve satisfactory results. This paper presents an integrated data mining framework that incorporates domain knowledge into a series of data analysis and modeling processes, including visualization, feature selection, and model validation. An empirical study on the software effort estimation problem using a benchmark dataset shows the necessity and effectiveness of the proposed approach.


2021 ◽  
Vol 29 (2) ◽  
Author(s):  
Pantjawati Sudarmaningtyas ◽  
Rozlina Mohamed

Agile methods are now commonly used in software development projects, and their success rate is higher than that of waterfall projects. Effort estimation in Agile remains a challenge because most existing techniques were developed for conventional methods. This study therefore aimed to identify the software effort estimation methods applied in Agile, the approaches used to implement them, and the attributes that affect effort estimation. The results showed that the top three estimation methods applied in Agile are machine learning (37%), expert judgement (26%), and algorithmic methods (21%). All machine learning methods were implemented using a hybrid approach, either a combination of machine learning and expert judgement or a mix of two or more machine learning techniques. Overall, however, a hybrid approach to effort estimation was used in only 47% of the relevant articles. In addition, effort estimation in Agile involved twenty-four attributes, of which Complexity, Experience, Size, and Time are the most commonly used.


2016 ◽  
Vol 1 (1) ◽  
pp. 28 ◽  
Author(s):  
Dinda Novitasari ◽  
Imam Cholissodin ◽  
Wayan Firdaus Mahmudy

Abstract. The software industry must meet tremendous demand. Accurate effort estimation is therefore needed, because estimates that rely on the personal analysis of experts tend to be less objective. Support Vector Regression (SVR) is one machine learning method that can be used for this task. Two problems arise when applying it: selecting features and finding optimal parameter values. This paper proposes local best PSO-SVR to solve both problems. The experimental results show that the proposed model outperforms PSO-SVR and T-SVR in accuracy. Keywords: Optimization, SVR, Optimal Parameter, Feature Selection, Local Best PSO, Software Effort Estimation


2010 ◽  
Vol 52 (11) ◽  
pp. 1155-1166 ◽  
Author(s):  
Adriano L.I. Oliveira ◽  
Petronio L. Braga ◽  
Ricardo M.F. Lima ◽  
Márcio L. Cornélio

Electronics ◽  
2021 ◽  
Vol 10 (10) ◽  
pp. 1195
Author(s):  
Priya Varshini A G ◽  
Anitha Kumari K ◽  
Vijayakumar Varadarajan

Software Project Estimation is a challenging and important activity in developing software projects. Software Project Estimation includes Software Time Estimation, Software Resource Estimation, Software Cost Estimation, and Software Effort Estimation. Software Effort Estimation focuses on predicting the number of hours of work (effort in terms of person-hours or person-months) required to develop or maintain a software application. It is difficult to forecast effort during the initial stages of software development. Various machine learning and deep learning models have been developed to predict the effort estimation. In this paper, single model approaches and ensemble approaches were considered for estimation. Ensemble techniques are the combination of several single models. Ensemble techniques considered for estimation were averaging, weighted averaging, bagging, boosting, and stacking. Various stacking models considered and evaluated were stacking using a generalized linear model, stacking using decision tree, stacking using a support vector machine, and stacking using random forest. Datasets considered for estimation were Albrecht, China, Desharnais, Kemerer, Kitchenham, Maxwell, and Cocomo81. Evaluation measures used were mean absolute error, root mean squared error, and R-squared. The results proved that the proposed stacking using random forest provides the best results compared with single model approaches using the machine or deep learning algorithms and other ensemble techniques.
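Two of the simpler ensemble techniques named in this abstract, averaging and weighted averaging, can be sketched in plain Python together with the three evaluation measures it lists. The sketch does not cover bagging, boosting, or the stacking variants, and it is not the authors' implementation: the function names, model outputs, and weights are all hypothetical.

```python
import math


def average(predictions):
    """Simple averaging: per-project mean of the base models' predictions."""
    return [sum(col) / len(col) for col in zip(*predictions)]


def weighted_average(predictions, weights):
    """Weighted averaging: per-project weighted mean of the predictions."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, col)) / total
            for col in zip(*predictions)]


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical effort values and hypothetical outputs of two base models.
actual = [10.0, 20.0, 30.0]
model_a = [12.0, 18.0, 33.0]
model_b = [8.0, 24.0, 29.0]

avg = average([model_a, model_b])                       # [10.0, 21.0, 31.0]
weighted = weighted_average([model_a, model_b], [3, 1])  # [11.0, 19.5, 32.0]

for name, pred in [("average", avg), ("weighted", weighted)]:
    print(name, mae(actual, pred), rmse(actual, pred), r_squared(actual, pred))
```

In this toy example the plain average scores better than either base model because their errors partly cancel; weighting model_a more heavily pulls the ensemble back toward that model's errors.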


2017 ◽  
Vol 27 (1) ◽  
pp. 169-180 ◽  
Author(s):  
Marton Szemenyei ◽  
Ferenc Vajda

Abstract. Dimension reduction and feature selection are fundamental tools for machine learning and data mining. Most existing methods, however, assume that objects are represented by a single vectorial descriptor. In reality, some description methods assign unordered sets or graphs of vectors to a single object, where each vector is assumed to have the same number of dimensions, but is drawn from a different probability distribution. Moreover, some applications (such as pose estimation) may require the recognition of individual vectors (nodes) of an object. In such cases it is essential that the nodes within a single object remain distinguishable after dimension reduction. In this paper we propose new discriminant analysis methods that are able to satisfy two criteria at the same time: separating between classes and between the nodes of an object instance. We analyze and evaluate our methods on several different synthetic and real-world datasets.

