RESAMPLING METHODS IN SOFTWARE QUALITY CLASSIFICATION

Author(s):  
WASIF AFZAL ◽  
RICHARD TORKAR ◽  
ROBERT FELDT

In the presence of a number of algorithms for classification and prediction in software engineering, there is a need for a systematic way of assessing their performance. Performance assessment is typically done by some form of partitioning or resampling of the original data to alleviate biased estimation. For predictive and classification studies in software engineering, there is a lack of definitive advice on the most appropriate resampling method to use. This is seen as one of the contributing factors for not being able to draw general conclusions on which modeling technique or set of predictor variables is most appropriate. Furthermore, the use of a variety of resampling methods makes it impossible to perform any formal meta-analysis of the primary study results. Therefore, it is desirable to examine the influence of various resampling methods and to quantify possible differences.

Objective and method: This study empirically compares five common resampling methods (hold-out validation, repeated random sub-sampling, 10-fold cross-validation, leave-one-out cross-validation and non-parametric bootstrapping) using eight publicly available data sets, with genetic programming (GP) and multiple linear regression (MLR) as software quality classification approaches. The location of (PF, PD) pairs in the ROC (receiver operating characteristic) space and the area under the ROC curve (AUC) are used as accuracy indicators.

Results: In terms of the location of (PF, PD) pairs in the ROC space, bootstrapping results lie in the preferred region for three of the eight data sets for GP and for four of the eight data sets for MLR. Based on the AUC measure, there are no significant differences between the resampling methods for either GP or MLR.

Conclusion: Certain data set properties may be responsible for the insignificant differences between the resampling methods based on AUC, including imbalanced data sets, insignificant predictor variables and high-dimensional data sets. With the current selection of data sets and classification techniques, bootstrapping is the preferred method based on the location of (PF, PD) pairs in the ROC space. Hold-out validation is not a good choice for comparatively smaller data sets, where leave-one-out cross-validation (LOOCV) performs better. For comparatively larger data sets, 10-fold cross-validation performs better than LOOCV.
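As an illustration of the five resampling schemes, the following is a minimal Python sketch that scores each of them by AUC on a synthetic data set. scikit-learn's logistic regression stands in for the study's GP and MLR classifiers (an assumption for illustration), so the sketch shows the validation mechanics rather than the study's models; PD is the true positive rate and PF the false positive rate of a (PF, PD) pair in ROC space.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (LeaveOneOut, ShuffleSplit,
                                     StratifiedKFold, train_test_split)
from sklearn.utils import resample

X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression(max_iter=1000)

def cv_auc(splitter):
    # Pool out-of-fold scores, then compute one AUC; pooling keeps the
    # AUC well defined even for LOOCV's single-sample test folds.
    scores, labels = [], []
    for train, test in splitter.split(X, y):
        clf.fit(X[train], y[train])
        scores.extend(clf.predict_proba(X[test])[:, 1])
        labels.extend(y[test])
    return roc_auc_score(labels, scores)

# Hold-out validation: a single 2/3 train, 1/3 test split.
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=1/3, random_state=0)
auc_holdout = roc_auc_score(yte, clf.fit(Xtr, ytr).predict_proba(Xte)[:, 1])

# Repeated random sub-sampling, 10-fold CV and LOOCV.
auc_subsample = cv_auc(ShuffleSplit(n_splits=10, test_size=1/3, random_state=0))
auc_10fold = cv_auc(StratifiedKFold(n_splits=10, shuffle=True, random_state=0))
auc_loocv = cv_auc(LeaveOneOut())

# Non-parametric bootstrap: train on a resample drawn with replacement,
# test on the out-of-bag samples, and average over replicates.
boot = []
for b in range(100):
    idx = resample(np.arange(len(y)), random_state=b)
    oob = np.setdiff1d(np.arange(len(y)), idx)
    clf.fit(X[idx], y[idx])
    boot.append(roc_auc_score(y[oob], clf.predict_proba(X[oob])[:, 1]))
auc_bootstrap = np.mean(boot)
```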

2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Hong-Jhang Chen ◽  
Yii-Jeng Lin ◽  
Pei-Chen Wu ◽  
Wei-Hsiang Hsu ◽  
Wan-Chung Hu ◽  
...  

Traditional Chinese medicine (TCM) formulates treatment according to body constitution (BC) differentiation. Different constitutions have specific metabolic characteristics and different susceptibility to certain diseases. This study aimed to assess the Yang-Xu constitution using a body constitution questionnaire (BCQ) and clinical blood variables. A BCQ was employed to assess the clinical manifestation of Yang-Xu. A logistic regression model was used to explore the relationship between BC scores and biomarkers. Leave-one-out cross-validation (LOOCV) and K-fold cross-validation were performed to evaluate the accuracy of the predictive model in practice. Decision trees (DTs) were constructed to determine the possible relationships between blood biomarkers and BC scores. According to the BCQ analysis, 49% of participants without any BC were classified as healthy subjects. Among them, 130 samples were selected for further analysis and divided into two groups: one group comprised healthy subjects without any BC (68%), while subjects of the other group, named the sub-healthy group, had three BCs (32%). Six biomarkers, CRE, TSH, HB, MONO, RBC, and LH, were found to have the greatest impact on BCQ outcomes in Yang-Xu subjects. This study indicated significant biochemical differences in Yang-Xu subjects, which may provide a connection between blood variables and the Yang-Xu BC.
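The validation scheme described here (a logistic regression on blood biomarkers, checked by LOOCV and K-fold cross-validation) can be sketched as follows; the file name, column names and label coding are hypothetical placeholders, not the study's actual data.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

df = pd.read_csv("bcq_blood.csv")            # hypothetical file name
X = df[["CRE", "TSH", "HB", "MONO", "RBC", "LH"]]  # the six biomarkers
y = df["yang_xu"]                            # hypothetical: 1 = Yang-Xu, 0 = healthy

model = LogisticRegression(max_iter=1000)
acc_loocv = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
acc_kfold = cross_val_score(model, X, y,
                            cv=KFold(n_splits=10, shuffle=True,
                                     random_state=0)).mean()
print(f"LOOCV accuracy: {acc_loocv:.3f}, 10-fold accuracy: {acc_kfold:.3f}")
```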


2019 ◽  
Vol 20 (S23) ◽  
Author(s):  
Cheng Yan ◽  
Guihua Duan ◽  
Fang-Xiang Wu ◽  
Jianxin Wang

Abstract
Background: Viral infectious diseases are a serious threat to human health, and receptor binding is the first step of viral infection of hosts. To treat human viral infectious diseases more effectively, hidden virus-receptor interactions must be discovered. However, current computational methods for predicting virus-receptor interactions are limited.
Results: In this study, we propose a new computational method (IILLS) to predict virus-receptor interactions based on Initial Interaction scores via the neighbors and the Laplacian regularized Least Squares algorithm. IILLS integrates the known virus-receptor interactions with the amino acid sequences of receptors. The similarity of viruses is calculated with the Gaussian Interaction Profile (GIP) kernel. On the receptor side, we compute both the receptor GIP similarity and the receptor sequence similarity, and the sequence similarity is used as the final receptor similarity according to the prediction results. Ten-fold cross-validation (10CV) and leave-one-out cross-validation (LOOCV) are used to assess the prediction performance of our method, and we compare IILLS with three competing methods (BRWH, LapRLS, CMF).
Conclusion: The experimental results show that IILLS achieves AUC values of 0.8675 and 0.9061 with 10-fold cross-validation and LOOCV, respectively, which illustrates that IILLS is superior to the competing methods. In addition, the case studies further indicate that IILLS is effective for virus-receptor interaction prediction.
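The GIP kernel used here has a standard closed form: the similarity of two viruses is a Gaussian of the distance between their rows of the binary interaction matrix, with the bandwidth normalized by the mean squared profile norm. A minimal NumPy sketch (the toy matrix is hypothetical; the IILLS scoring itself is not reproduced):

```python
import numpy as np

def gip_kernel(A, gamma_prime=1.0):
    """GIP kernel over the rows (interaction profiles) of a 0/1 matrix A."""
    sq_norms = (A ** 2).sum(axis=1)
    # Bandwidth normalized by the mean squared norm of the profiles.
    gamma = gamma_prime / sq_norms.mean()
    # Pairwise squared Euclidean distances between profiles.
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2 * A @ A.T
    return np.exp(-gamma * np.clip(d2, 0, None))

A = np.random.randint(0, 2, size=(5, 8))   # toy virus-receptor matrix
K_virus = gip_kernel(A)                    # virus-virus similarity
K_receptor = gip_kernel(A.T)               # receptor-receptor similarity
```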


2020 ◽  
Author(s):  
Zekuan Yu ◽  
Xiaohu Li ◽  
Haitao Sun ◽  
Jian Wang ◽  
Tongtong Zhao ◽  
...  

Abstract
Background: To implement real-time diagnosis of the severity of patients infected with the 2019 novel coronavirus (COVID-19) and to guide follow-up therapeutic treatment, we collected chest CT scans of 202 patients diagnosed with COVID-19 from three hospitals in Anhui Province, China.
Methods: A total of 729 2D axial plane slices, comprising 246 severe cases and 483 non-severe cases, were employed in this study. Four pre-trained deep models (Inception-V3, ResNet-50, ResNet-101, DenseNet-201) combined with multiple classifiers (linear discriminant, linear SVM, cubic SVM, KNN and AdaBoost decision tree) were applied to identify severe and non-severe COVID-19 cases. Three validation strategies (hold-out validation, 10-fold cross-validation and leave-one-out) were employed to validate the feasibility of the proposed pipelines.
Results and conclusion: The experimental results demonstrate that classification of features from pre-trained deep models shows promise for COVID-19 screening, with DenseNet-201 combined with a cubic SVM achieving the best performance. Specifically, it achieved the highest severity classification accuracies of 95.20% and 95.34% for 10-fold cross-validation and leave-one-out, respectively. The established pipeline enables rapid and accurate identification of the severity of COVID-19 and may assist physicians in making more efficient and reliable decisions.
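A minimal sketch of the best-performing pipeline, assuming PyTorch/torchvision for the backbone and scikit-learn for the classifier; `load_slices()` and `load_labels()` are hypothetical loaders for the preprocessed CT slices, not functions from the paper.

```python
import torch
import torchvision.models as models
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

# DenseNet-201 pre-trained on ImageNet, used as a fixed feature extractor.
backbone = models.densenet201(weights="IMAGENET1K_V1")
backbone.classifier = torch.nn.Identity()   # drop the 1000-way head
backbone.eval()

with torch.no_grad():
    # load_slices() (hypothetical) returns a (n, 3, H, W) tensor of slices.
    X = backbone(load_slices()).numpy()     # (n, 1920) deep features
y = load_labels()                           # hypothetical: 1 = severe, 0 = non-severe

svm = SVC(kernel="poly", degree=3)          # "cubic SVM"
acc_10fold = cross_val_score(svm, X, y, cv=10).mean()
acc_loo = cross_val_score(svm, X, y, cv=LeaveOneOut()).mean()
```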


2018 ◽  
Vol 7 (2.15) ◽  
pp. 136 ◽  
Author(s):  
Rosaida Rosly ◽  
Mokhairi Makhtar ◽  
Mohd Khalid Awang ◽  
Mohd Isa Awang ◽  
Mohd Nordin Abdul Rahman

This paper analyses the performance of classification models using single classifiers and ensemble combinations, with the Breast Cancer Wisconsin and Hepatitis data sets as training data. It presents a comparison of different classifiers based on 10-fold cross-validation using a data mining tool. In this experiment, various classifiers are implemented, including three popular ensemble methods for the combinations: boosting, bagging and stacking. The results show that for the Breast Cancer Wisconsin data set, the single Naïve Bayes (NB) classifier and the bagging+NB combination displayed the highest accuracy at the same percentage (97.51%) compared to other combinations of ensemble classifiers. For the Hepatitis data set, the combination of stacking+Multi-Layer Perceptron (MLP) achieved the highest accuracy at 86.25%. Using ensemble classifiers may thus improve results. In future work, a multi-classifier approach will be proposed by introducing fusion at the classification level between these classifiers to obtain higher classification accuracies.
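A minimal sketch of this comparison using scikit-learn rather than the (unnamed) data mining tool: a single Naïve Bayes classifier against bagging and stacking ensembles, scored by 10-fold cross-validation on the Breast Cancer Wisconsin data set bundled with scikit-learn. The ensemble configurations are illustrative assumptions, not the paper's exact setups.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, StackingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "NB": GaussianNB(),
    "bagging+NB": BaggingClassifier(GaussianNB(), n_estimators=10),
    "stacking+MLP": StackingClassifier(
        estimators=[("nb", GaussianNB()),
                    ("mlp", MLPClassifier(max_iter=2000))],
        final_estimator=GaussianNB()),
}
for name, model in models.items():
    # Mean accuracy over a 10-fold cross-validation.
    print(f"{name}: {cross_val_score(model, X, y, cv=10).mean():.4f}")
```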


2016 ◽  
Vol 28 (8) ◽  
pp. 1694-1722 ◽  
Author(s):  
Yu Wang ◽  
Jihong Li

In typical machine learning applications such as information retrieval, precision and recall are two commonly used measures for assessing an algorithm's performance. Symmetrical confidence intervals based on K-fold cross-validated t distributions are widely used for the inference of precision and recall measures. As we confirmed through simulated experiments, however, these confidence intervals often exhibit degrees of confidence below the nominal level, which may easily lead to liberal inference results. Thus, it is crucial to construct faithful confidence (credible) intervals for precision and recall with a high degree of confidence and a short interval length. In this study, we propose two posterior credible intervals for precision and recall based on K-fold cross-validated beta distributions. The first credible interval for precision (or recall) is constructed from the beta posterior distribution inferred from all K data sets corresponding to the K confusion matrices of a K-fold cross-validation. Second, considering that each data set corresponding to a confusion matrix from a K-fold cross-validation can be used to infer a beta posterior distribution of precision (or recall), the second proposed credible interval is constructed from the average of these K beta posterior distributions. Experimental results on simulated and real data sets demonstrate that the first credible interval proposed in this study almost always resulted in degrees of confidence greater than 95%. With an acceptable degree of confidence, both proposed credible intervals have shorter interval lengths than those based on a corrected K-fold cross-validated t distribution. Meanwhile, the average ranks of the two credible intervals are superior to that of the confidence interval based on a K-fold cross-validated t distribution for degree of confidence, and superior to that of the confidence interval based on a corrected K-fold cross-validated t distribution for interval length, in all 27 cases of simulated and real data experiments. The confidence intervals based on the K-fold and corrected K-fold cross-validated t distributions lie at the two extremes. Thus, when the reliability of the inference for precision and recall matters, the proposed methods are preferable, especially the first credible interval.
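A minimal sketch of the two constructions for precision, assuming a uniform Beta(1, 1) prior and toy per-fold (TP, FP) counts; the paper's exact prior and pooling scheme may differ in detail. With that prior, precision given TP true positives and FP false positives has posterior Beta(TP + 1, FP + 1).

```python
import numpy as np
from scipy import stats

# (TP, FP) from each of K = 5 folds (toy numbers).
folds = [(41, 6), (38, 9), (44, 4), (40, 7), (39, 8)]

# Construction 1: pool all K folds into a single beta posterior.
tp = sum(t for t, _ in folds)
fp = sum(f for _, f in folds)
lo, hi = stats.beta.interval(0.95, tp + 1, fp + 1)
print(f"pooled 95% credible interval for precision: ({lo:.3f}, {hi:.3f})")

# Construction 2: average the K per-fold beta posteriors (a mixture)
# and read the interval off the mixture's quantiles numerically.
grid = np.linspace(0, 1, 10001)
mix_cdf = np.mean([stats.beta.cdf(grid, t + 1, f + 1) for t, f in folds],
                  axis=0)
lo2 = grid[np.searchsorted(mix_cdf, 0.025)]
hi2 = grid[np.searchsorted(mix_cdf, 0.975)]
print(f"averaged-posterior 95% interval: ({lo2:.3f}, {hi2:.3f})")
```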


2015 ◽  
Vol 80 (4) ◽  
pp. 499-508 ◽  
Author(s):  
Long Jiao ◽  
Xiaofei Wang ◽  
Shan Bing ◽  
Zhiwei Xue ◽  
Hua Li

The quantitative structure-property relationship (QSPR) for the supercooled liquid vapour pressures (PL) of PBDEs was investigated. The molecular distance-edge vector (MDEV) index was used as the structural descriptor. The quantitative relationship between the MDEV index and lg PL was modeled using multivariate linear regression (MLR) and an artificial neural network (ANN), respectively. Leave-one-out cross-validation and k-fold cross-validation were carried out to assess the prediction ability of the developed models. For the MLR method, the prediction root mean square relative error (RMSRE) of leave-one-out cross-validation and k-fold cross-validation is 9.95 and 9.05, respectively; for the ANN method, it is 8.75 and 8.31, respectively. It is demonstrated that the established models are practicable for predicting the lg PL of PBDEs: the MDEV index is quantitatively related to the lg PL of PBDEs, and MLR and L-ANN are practicable for modeling this relationship. Compared with MLR, ANN shows slightly higher prediction accuracy. Subsequently, an MLR model, whose regression equation is lg PL = 0.2868 M11 − 0.8449 M12 − 0.0605, and an ANN model, which is a two-input linear network, were developed. The two models can be used to predict the lg PL of each PBDE.
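The reported MLR equation and the RMSRE criterion are simple enough to state directly in code. In the sketch below, M11 and M12 are the two MDEV descriptor components from the equation above; the descriptor and observed lg PL values are hypothetical toy numbers, not data from the paper.

```python
import numpy as np

def lg_pl_mlr(m11, m12):
    # Reported regression equation: lg PL = 0.2868*M11 - 0.8449*M12 - 0.0605
    return 0.2868 * m11 - 0.8449 * m12 - 0.0605

def rmsre(y_true, y_pred):
    # Root mean square relative error, in percent.
    return 100 * np.sqrt(np.mean(((y_pred - y_true) / y_true) ** 2))

m11 = np.array([3.2, 4.1, 5.0])          # hypothetical descriptor values
m12 = np.array([1.1, 1.8, 2.4])
lg_pl_obs = np.array([-0.08, -0.42, -0.66])  # hypothetical observed lg PL
print(f"RMSRE: {rmsre(lg_pl_obs, lg_pl_mlr(m11, m12)):.2f}%")
```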


2012 ◽  
Vol 433-440 ◽  
pp. 3959-3963 ◽  
Author(s):  
Bayram Akdemir ◽  
Nurettin Çetinkaya

In distribution systems, load forecasting is one of the major management problems, bearing on energy flow, system protection and economic management. In order to manage the system, the next step of the load characteristic must be inferred from historical data sets. For forecasting, not only historical parameters are used: external parameters such as weather conditions, seasons and population are also of much importance for forecasting the next behavior of the load characteristic. Holidays and weekdays have different effects on energy consumption in any country. In this study, the target is to forecast the peak energy level for the next hour and to compare the effects of weekdays and holidays on peak energy needs. Energy consumption data sets have nonlinear characteristics, and it is not easy to fit any curve to them due to this nonlinearity and the large number of parameters. In order to forecast the peak energy level, an adaptive neuro-fuzzy inference system (ANFIS) is used, and the hourly effect of holidays and weekdays on the peak energy level is examined. The outputs of the artificial intelligence model are evaluated with two-fold cross-validation and mean absolute percentage error (MAPE). The obtained two-fold cross-validation error as MAPE is 3.51, and the data set including holidays yields higher accuracy than the data set without holidays; total success increased by 2.4%.
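The evaluation scheme (two-fold cross-validation scored by MAPE) can be sketched as follows. A gradient boosting regressor stands in for the ANFIS model, which has no standard scikit-learn implementation, and the features and target are synthetic placeholders.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.random((500, 5))       # hypothetical features: hour, holiday flag, etc.
y = 100 + 50 * X[:, 0] + 10 * rng.random(500)   # toy peak-load target

def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent.
    return 100 * np.mean(np.abs((y_true - y_pred) / y_true))

errors = []
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    model = GradientBoostingRegressor().fit(X[train], y[train])
    errors.append(mape(y[test], model.predict(X[test])))
print(f"two-fold CV MAPE: {np.mean(errors):.2f}%")
```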


2005 ◽  
Vol 14 (01n02) ◽  
pp. 261-280 ◽  
Author(s):  
JIANG LI ◽  
MICHAEL T. MANRY ◽  
CHANGHUA YU ◽  
D. RANDALL WILSON

Algorithms reducing the storage requirement of the nearest neighbor classifier (NNC) can be divided into three main categories: fast searching algorithms, instance-based learning algorithms and prototype-based algorithms. We propose an algorithm, LVQPRU, for pruning NNC prototype vectors, obtaining a compact classifier with good performance. A basic condensing algorithm is applied to the initial prototypes to speed up the learning process. The learning vector quantization (LVQ) algorithm is utilized to fine-tune the remaining prototypes during each pruning iteration. We evaluate LVQPRU on several data sets along with 12 other algorithms using ten-fold cross-validation. Simulation results show that the proposed algorithm has high generalization accuracy and good storage reduction ratios.
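The LVQ fine-tuning step mentioned here follows the classic LVQ1 rule: the prototype nearest to a training sample is pulled toward it when their labels agree and pushed away otherwise. A minimal sketch of one training epoch (the LVQPRU pruning loop itself is not reproduced):

```python
import numpy as np

def lvq1_epoch(prototypes, proto_labels, X, y, lr=0.05):
    """One LVQ1 pass: adjust the nearest prototype for each sample."""
    for x, label in zip(X, y):
        j = np.argmin(np.linalg.norm(prototypes - x, axis=1))  # nearest prototype
        sign = 1.0 if proto_labels[j] == label else -1.0       # attract or repel
        prototypes[j] += sign * lr * (x - prototypes[j])
    return prototypes

# Usage: prototypes would be initialized from a condensed subset of the
# training data, then fine-tuned for a few epochs with a decaying lr.
```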

