Applying Variant Variable Regularized Logistic Regression for Modeling Software Defect Predictor

2016 ◽  
Vol 4 (2) ◽  
pp. 107-115 ◽  
Author(s):  
Gabriel Kofi Armah ◽  
Guanchun Luo ◽  
Ke Qin ◽  
Angolo Shem Mbandu
2021 ◽  
Vol 5 (1) ◽  
pp. 233
Author(s):  
Andre Hardoni ◽  
Dian Palupi Rini ◽  
Sukemi Sukemi

Software defects are one of the main contributors to information technology waste and lead to rework, consuming considerable time and money. Software defect prediction aims to prevent defects by classifying modules as defective or not defective. Many researchers have studied software defect prediction using the public NASA MDP datasets, but these datasets still have shortcomings such as class imbalance and noisy attributes. The class imbalance problem can be addressed with SMOTE (Synthetic Minority Over-sampling Technique), and the noisy attribute problem can be addressed by selecting features with Particle Swarm Optimization (PSO). In this research, SMOTE and PSO are therefore integrated and applied to two machine learning classification techniques, naïve Bayes and logistic regression. Experiments on 8 NASA MDP datasets, each divided into training and testing data, show that the SMOTE + PSO integration improves the classification performance of each technique, with the highest average AUC (Area Under Curve) value of 0.89 for logistic regression and 0.86 for naïve Bayes in training, both better than without combining the two.
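The oversampling step mentioned in this abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is a simplified variant and not the authors' implementation; the function name `smote`, the parameters `k` and `seed`, and the toy data are assumptions made for illustration.

```python
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Simplified SMOTE: interpolate between a randomly chosen minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself),
        # ranked by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical toy data: two metric values per defective module.
defective = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(defective, n_synthetic=6)
```

Because every synthetic point is an interpolation of two existing minority samples, it always falls inside the bounding box of the minority class, which is why SMOTE densifies the minority region rather than duplicating samples.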


2020 ◽  
Vol 17 (5) ◽  
pp. 721-730
Author(s):  
Kamal Bashir ◽  
Tianrui Li ◽  
Mahama Yahaya

The most frequently used machine learning feature ranking approaches fail to produce an optimal feature subset for accurate prediction of defective software modules in out-of-sample data. Machine learning Feature Selection (FS) algorithms such as Chi-Square (CS), Information Gain (IG), Gain Ratio (GR), RelieF (RF) and Symmetric Uncertainty (SU) perform relatively poorly at prediction, even after balancing the class distribution in the training data. In this study, we propose a novel FS method based on the Maximum Likelihood Logistic Regression (MLLR). We apply this method to six software defect datasets in their sampled and unsampled forms to select useful features for classification in the context of Software Defect Prediction (SDP). The Support Vector Machine (SVM) and Random Forest (RaF) classifiers are applied to the FS subsets derived from the sampled and unsampled datasets. The performance of the models, measured by the Area Under the Receiver Operating Characteristic Curve (AUC), is compared across all FS methods considered. The Analysis Of Variance (ANOVA) F-test results validate the superiority of the proposed method over all the other FS techniques, in both sampled and unsampled data. The results confirm that the MLLR can be useful in selecting an optimal feature subset for more accurate prediction of defective modules in the software development process.
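The AUC metric used to compare the models can be computed directly from its pairwise definition: the probability that a randomly chosen defective module receives a higher score than a randomly chosen non-defective one, with ties counted as half. The sketch below is illustrative only; the function name `auc` and the example labels and scores are assumptions, not data from the study.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs that the scores
    rank correctly; tied pairs count as half a correct ranking."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical example: 1 = defective module, 0 = non-defective module.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 0.75: three of the four pairs are ranked correctly
```

This pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; rank-based formulations give the same value more efficiently on large datasets.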


Author(s):  
Damien Wilburn

Hydrocephalus is a disorder where cerebrospinal fluid (CSF) is unable to drain efficiently from the brain. This paper presents a set of exploratory analyses comparing attributes of inpatients under one year old diagnosed with hydrocephalus, using data provided by the Agency for Healthcare Research and Quality (AHRQ) as part of the National Inpatient Sample (NIS). The general methods include calculation of summary statistics, kernel density estimation, logistic regression, linear regression, and the production of figures and charts using the statistical data modeling software SAS. It was determined that younger infants show higher mortality rates; additionally, males are more likely to present with hydrocephalus and cost slightly more on average than females, despite the distribution curves for length of stay appearing virtually identical between genders. Diagnoses and procedures expected for non-hydrocephalic infants showed a negative correlation in the logistic model. Overall, the study validates much of the literature and expands it with a cost analysis approach.


2008 ◽  
Vol 14 (2) ◽  
pp. 165-186 ◽  
Author(s):  
Rattikorn Hewett ◽  
Phongphun Kijsanayothin

2007 ◽  
Vol 23 (3) ◽  
pp. 157-165 ◽  
Author(s):  
Carmen Hagemeister

Abstract. When concentration tests are completed repeatedly, reaction time and error rate decrease considerably, but the underlying ability does not improve. In order to overcome this validity problem this study aimed to test if the practice effect between tests and within tests can be useful in determining whether persons have already completed this test. The power law of practice postulates that practice effects are greater in unpracticed than in practiced persons. Two experiments were carried out in which the participants completed the same tests at the beginning and at the end of two test sessions set about 3 days apart. In both experiments, the logistic regression could indeed classify persons according to previous practice through the practice effect between the tests at the beginning and at the end of the session, and, less well but still significantly, through the practice effect within the first test of the session. Further analyses showed that the practice effects correlated more highly with the initial performance than was to be expected for mathematical reasons; typically persons with long reaction times have larger practice effects. Thus, small practice effects alone do not allow one to conclude that a person has worked on the test before.
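The power law of practice referred to in this abstract is commonly written as T(N) = a · N^(-b), where T is the reaction time on the N-th trial. A small numeric sketch, with hypothetical parameter values `a` and `b` chosen only for illustration, shows why the same amount of additional practice yields a larger gain for an unpracticed person than for a practiced one:

```python
def reaction_time(trial, a=1000.0, b=0.4):
    """Power law of practice: T(N) = a * N**(-b).
    a (time on the first trial, in ms) and b (learning rate) are
    hypothetical values, not estimates from the study."""
    return a * trial ** (-b)

# Gain from one extra trial for an unpracticed vs. a practiced person.
early_gain = reaction_time(1) - reaction_time(2)    # unpracticed
late_gain = reaction_time(11) - reaction_time(12)   # practiced
print(early_gain > late_gain)  # True
```

Under this model the gain per additional trial shrinks steadily with practice, which is the pattern the study uses to distinguish persons who have completed the test before from those who have not.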


2012 ◽  
Vol 2 (2) ◽  
pp. 72-81
Author(s):  
Christina M. Rudin-Brown ◽  
Eve Mitsopoulos-Rubens ◽  
Michael G. Lenné

Random testing for alcohol and other drugs (AODs) in individuals who perform safety-sensitive activities as part of their aviation role was introduced in Australia in April 2009. One year later, an online survey (N = 2,226) was conducted to investigate attitudes, behaviors, and knowledge regarding random testing and to gauge perceptions regarding its effectiveness. Private, recreational, and student pilots were less likely than industry personnel to report being aware of the requirement (86.5% versus 97.1%), to have undergone testing (76.5% versus 96.1%), and to know of others who had undergone testing (39.9% versus 84.3%), and they had more positive attitudes toward random testing than industry personnel. However, logistic regression analyses indicated that random testing is more effective at deterring AOD use among industry personnel.


2001 ◽  
Vol 6 (1) ◽  
pp. 35-48 ◽  
Author(s):  
Michaela Kiernan ◽  
Helena C. Kraemer ◽  
Marilyn A. Winkleby ◽  
Abby C. King ◽  
C. Barr Taylor
