Comments on "Researcher bias: The use of machine learning in software defect prediction"

10.7287/peerj.preprints.1260v2 ◽

2016 ◽

Author(s):

Chakkrit Tantithamthavorn ◽

Shane McIntosh ◽

Ahmed E Hassan ◽

Kenichi Matsumoto

Keyword(s):

Prediction Model ◽

Research Group ◽

Strong Association ◽

Model Performance ◽

Strong Relationship ◽

Defect Prediction ◽

Explanatory Variables ◽

Software Defect ◽

The Impact ◽

The Relationship

Shepperd et al. find that the reported performance of a defect prediction model shares a strong relationship with the group of researchers who construct the models. In this paper, we perform an alternative investigation of Shepperd et al.’s data. We observe that (a) research group shares a strong association with other explanatory variables (i.e., the dataset and metric families that are used to build a model); (b) the strong association among these explanatory variables makes it difficult to discern the impact of the research group on model performance; and (c) after mitigating the impact of this strong association, we find that the research group has a smaller impact than the metric family. These observations lead us to conclude that the relationship between the research group and the performance of a defect prediction model is more likely due to the tendency of researchers to reuse experimental components (e.g., datasets and metrics). We recommend that researchers experiment with a broader selection of datasets and metrics to combat any potential bias in their results.
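The kind of association among categorical explanatory variables described in this abstract can be illustrated with a small sketch. The abstract does not say which association measure the authors used; Cramér's V is assumed here as one common choice, and the function name `cramers_v` and the toy records are hypothetical, not taken from Shepperd et al.'s data.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of categorical labels.

    Returns a value in [0, 1]: 0 means no association, 1 means that one
    variable fully determines the other.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # a constant variable carries no association
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# Hypothetical toy records: each position is one reported model.
groups = ["G1", "G1", "G1", "G2", "G2", "G2"]
datasets = ["NASA", "NASA", "NASA", "Eclipse", "Eclipse", "Eclipse"]
metrics = ["static", "process", "static", "process", "static", "process"]

print(cramers_v(groups, datasets))  # each group reuses a single dataset
print(cramers_v(groups, metrics))   # groups mix metric families
```

In this toy data each group uses exactly one dataset, so group and dataset are indistinguishable as predictors of performance, which is the difficulty point (b) of the abstract describes.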

Comments on "Researcher bias: The use of machine learning in software defect prediction"

10.7287/peerj.preprints.1260 ◽

2016 ◽

Author(s):

Chakkrit Tantithamthavorn ◽

Shane McIntosh ◽

Ahmed E Hassan ◽

Kenichi Matsumoto

Keyword(s):

Prediction Model ◽

Research Group ◽

Strong Association ◽

Model Performance ◽

Strong Relationship ◽

Defect Prediction ◽

Explanatory Variables ◽

Software Defect ◽

The Impact ◽

The Relationship

Shepperd et al. find that the reported performance of a defect prediction model shares a strong relationship with the group of researchers who construct the models. In this paper, we perform an alternative investigation of Shepperd et al.’s data. We observe that (a) research group shares a strong association with other explanatory variables (i.e., the dataset and metric families that are used to build a model); (b) the strong association among these explanatory variables makes it difficult to discern the impact of the research group on model performance; and (c) after mitigating the impact of this strong association, we find that the research group has a smaller impact than the metric family. These observations lead us to conclude that the relationship between the research group and the performance of a defect prediction model is more likely due to the tendency of researchers to reuse experimental components (e.g., datasets and metrics). We recommend that researchers experiment with a broader selection of datasets and metrics to combat any potential bias in their results.

Research of Software Defect Prediction Model Based on ACO-SVM

Chinese Journal of Computers ◽

10.3724/sp.j.1016.2011.01148 ◽

2011 ◽

Vol 34 (6) ◽

pp. 1148-1154 ◽

Cited By ~ 13

Author(s):

Hui-Yan JIANG ◽

Mao ZONG ◽

Xiang-Ying LIU

Keyword(s):

Prediction Model ◽

Defect Prediction ◽

Software Defect Prediction ◽

Model Based ◽

Software Defect

The Impact of Alkaloid-Producing Epichloë Endophyte on Forage Ryegrass Breeding: A New Zealand Perspective

Toxins ◽

10.3390/toxins13020158 ◽

2021 ◽

Vol 13 (2) ◽

pp. 158

Author(s):

Colin Eady

Keyword(s):

Seed Production ◽

Health Issues ◽

Abiotic Stressors ◽

Livestock Health ◽

The Right ◽

The Impact ◽

The Relationship ◽

And Storage ◽

Selection Of ◽

Epichloë Endophyte

For 30 years, forage ryegrass breeders have known that the germplasm may contain a maternally inherited symbiotic Epichloë endophyte. These endophytes produce a suite of secondary alkaloid compounds, dependent upon strain. Many produce ergot and other alkaloids, which are associated with both insect deterrence and livestock health issues. The levels of alkaloids and other endophyte characteristics are influenced by strain, host germplasm, and environmental conditions. Some strains in the right host germplasm can confer an advantage over biotic and abiotic stressors, thus acting as a maternally inherited desirable ‘trait’. Through seed production, these mutualistic endophytes do not transmit into 100% of the crop seed and are less vigorous than the grass seed itself. This causes stability and longevity issues for seed production and storage should the ‘trait’ be desired in the germplasm. This makes understanding the precise nature of the relationship vitally important to the plant breeder. These Epichloë endophytes cannot be ‘bred’ in the conventional sense, as they are asexual. Instead, the breeder may modulate endophyte characteristics through selection of host germplasm, a sort of breeding by proxy. This article explores, from a forage seed company perspective, the issues that endophyte characteristics and breeding them by proxy raise for ryegrass breeding, and outlines the methods used to assess the ‘trait’ and the application of these through the breeding, production, and deployment processes. Finally, this article investigates opportunities for enhancing the utilisation of alkaloid-producing endophytes within pastures, with a focus on balancing alkaloid levels to further enhance pest deterrence and improve livestock outcomes.

The impact of the distance metric and measure on SMOTE-based techniques in software defect prediction

Information and Software Technology ◽

10.1016/j.infsof.2021.106742 ◽

2021 ◽

pp. 106742

Author(s):

Shuo Feng ◽

Jacky Keung ◽

Peichang Zhang ◽

Yan Xiao ◽

Miao Zhang

Keyword(s):

Defect Prediction ◽

Software Defect Prediction ◽

Distance Metric ◽

Software Defect ◽

The Impact

Establishing a software defect prediction model via effective dimension reduction

Information Sciences ◽

10.1016/j.ins.2018.10.056 ◽

2019 ◽

Vol 477 ◽

pp. 399-409 ◽

Cited By ~ 7

Author(s):

Hua Wei ◽

Changzhen Hu ◽

Shiyou Chen ◽

Yuan Xue ◽

Quanxin Zhang

Keyword(s):

Prediction Model ◽

Dimension Reduction ◽

Defect Prediction ◽

Software Defect Prediction ◽

Effective Dimension ◽

Software Defect ◽

Effective Dimension Reduction

Revisiting the Impact of Dependency Network Metrics on Software Defect Prediction

IEEE Transactions on Software Engineering ◽

10.1109/tse.2021.3131950 ◽

2021 ◽

pp. 1-1

Author(s):

Lina Gong ◽

Gopi Krishnan Rajbahadur ◽

Ahmed E. Hassan ◽

S. Jiang

Keyword(s):

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Dependency Network ◽

Network Metrics ◽

The Impact

The Suitability of the Satellite Metrological Inputs Source on the Hydrological Model in a Small Urban Catchment

10.20944/preprints201608.0134.v1 ◽

2016 ◽

Author(s):

Song Song ◽

Youpeng Xu ◽

Jiali Wang ◽

Jinkang Du ◽

Jianxin Zhang ◽

...

Keyword(s):

Spatial Resolution ◽

Potential Evapotranspiration ◽

Model Performance ◽

Yangtze River Delta ◽

Precipitation Data ◽

Model Quality ◽

Small Catchment ◽

Explanatory Variables ◽

Distributed Models ◽

The Impact

Distributed/semi-distributed models are considered to be sensitive to the spatial resolution of the data input. In this paper, we take the Qinhuai catchment, a small catchment in the highly urbanized Yangtze River Delta, as the study area to analyze the impact of the spatial resolution of precipitation and potential evapotranspiration (PET) on the long-term runoff and flood runoff processes. The data sources include TRMM precipitation data, PET data downloaded from FEWS, and interpolated meteorological station data. GIS/RS techniques were used to collect and pre-process the geographical, precipitation, and PET series, which then served as the input of the CREST (Coupled Routing and Excess Storage) model to simulate the runoff process. The results showed that the CREST model is applicable to the Qinhuai catchment; that the spatial resolution of precipitation had a strong influence on the modelled runoff results, and the meteorological station precipitation data cannot be substituted by the TRMM data in a small catchment; and that the CREST model was not sensitive to the spatial resolution of the PET data, while the estimation formula of the PET data was correlated with the model quality. This paper focuses on a small urbanized catchment, suggests the influential explanatory variables for model performance, and provides a reliable reference for studies in similar areas.

Fundamental factors influencing returns of shares listed on the Johannesburg Stock Exchange in South Africa

Journal of Economic and Financial Sciences ◽

10.4102/jef.v9i2.50 ◽

2017 ◽

Vol 9 (2) ◽

pp. 426-435

Author(s):

Marise Vermeulen

Keyword(s):

Stock Exchange ◽

Dividend Payout ◽

Return On Equity ◽

Explanatory Variables ◽

Johannesburg Stock Exchange ◽

Earnings Per Share ◽

Asset Growth ◽

Book To Market Ratio ◽

The Impact ◽

The Relationship

This study investigated the relationship between share returns and nine variables that had been proven to influence returns in previous research, using a multiple regression analysis. These variables are size, leverage, book-to-market ratio, earnings yield, dividend payout, earnings growth, return on equity, earnings per share and asset growth. The impact of some of the variables on share returns proved to be insignificant, and some collinearity was identified between some of the variables. However, three significant variables were identified and the final regression model included the book-to-market ratio, dividend payout and leverage as the explanatory variables.

Research of Software Defect Prediction Model Based on Gray Theory

2009 International Conference on Management and Service Science ◽

10.1109/icmss.2009.5301677 ◽

2009 ◽

Cited By ~ 1

Author(s):

Zhuo-yuan Xiang ◽

Zhitao Tang

Keyword(s):

Prediction Model ◽

Defect Prediction ◽

Software Defect Prediction ◽

Model Based ◽

Software Defect ◽

Gray Theory
