BenchMetrics: a systematic benchmarking method for binary classification performance metrics

Selecting the proper performance metric constitutes a key issue for most classification problems in the field of machine learning. Although the specialized literature has addressed several topics regarding these metrics, their symmetries have yet to be systematically studied. This research focuses on ten metrics based on a binary confusion matrix and their symmetric behaviour is formally defined under all types of transformations. Through simulated experiments, which cover the full range of datasets and classification results, the symmetric behaviour of these metrics is explored by exposing them to hundreds of simple or combined symmetric transformations. Cross-symmetries among the metrics and statistical symmetries are also explored. The results obtained show that, in all cases, three and only three types of symmetries arise: labelling inversion (between positive and negative classes); scoring inversion (concerning good and bad classifiers); and the combination of these two inversions. Additionally, certain metrics have been shown to be independent of the imbalance in the dataset and two cross-symmetries have been identified. The results regarding their symmetries reveal a deeper insight into the behaviour of various performance metrics and offer an indicator to properly interpret their values and a guide for their selection for certain specific applications.
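The labelling inversion mentioned in this abstract can be illustrated with a short arithmetic check. The sketch below is not taken from the paper: the function names, the choice of five metrics, and the example counts are all assumptions made for illustration. It swaps the positive and negative classes in a binary confusion matrix and compares a few common metrics before and after.

```python
from math import sqrt


def metrics(tp, fn, fp, tn):
    """Compute a few metrics from binary confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


def invert_labels(tp, fn, fp, tn):
    """Swap the positive and negative classes: TP<->TN and FN<->FP."""
    return tn, fp, fn, tp


# Hypothetical counts, chosen only for illustration.
counts = (40, 10, 5, 45)  # tp, fn, fp, tn
original = metrics(*counts)
inverted = metrics(*invert_labels(*counts))

for name in original:
    print(f"{name:12s} {original[name]:.4f} -> {inverted[name]:.4f}")
```

For these counts, accuracy and the Matthews correlation coefficient are unchanged by the swap, sensitivity and specificity exchange values, and precision changes. This is only a numerical spot check on one matrix, not the systematic simulation the abstract describes.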

Phybrata Sensors and Machine Learning for Enhanced Neurophysiological Diagnosis and Treatment

Sensors ◽

10.3390/s21217417 ◽

2021 ◽

Vol 21 (21) ◽

pp. 7417

Author(s):

Alex J. Hope ◽

Utkarsh Vashisth ◽

Matthew J. Parker ◽

Andreas B. Ralston ◽

Joshua M. Roper ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Random Forest ◽

Binary Classification ◽

Classification Performance ◽

Support Vector ◽

Use Case ◽

Signal Features ◽

Test Population

Concussion injuries remain a significant public health challenge. A significant unmet clinical need remains for tools that allow related physiological impairments and longer-term health risks to be identified earlier, better quantified, and more easily monitored over time. We address this challenge by combining a head-mounted wearable inertial motion unit (IMU)-based physiological vibration acceleration (“phybrata”) sensor and several candidate machine learning (ML) models. The performance of this solution is assessed for both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments. Results are compared with previously reported approaches to ML-based concussion diagnostics. Using phybrata data from a previously reported concussion study population, four different machine learning models (Support Vector Machine, Random Forest Classifier, Extreme Gradient Boost, and Convolutional Neural Network) are first investigated for binary classification of the test population as healthy vs. concussion (Use Case 1). Results are compared for two different data preprocessing pipelines, Time-Series Averaging (TSA) and Non-Time-Series Feature Extraction (NTS). Next, the three best-performing NTS models are compared in terms of their multiclass prediction performance for specific concussion-related impairments: vestibular, neurological, both (Use Case 2). For Use Case 1, the NTS model approach outperformed the TSA approach, with the two best algorithms achieving an F1 score of 0.94. For Use Case 2, the NTS Random Forest model achieved the best performance in the testing set, with an F1 score of 0.90, and identified a wider range of relevant phybrata signal features that contributed to impairment classification compared with manual feature inspection and statistical data analysis. The overall classification performance achieved in the present work exceeds previously reported approaches to ML-based concussion diagnostics using other data sources and ML models. This study also demonstrates the first combination of a wearable IMU-based sensor and ML model that enables both binary classification of concussion patients and multiclass predictions of specific concussion-related neurophysiological impairments.

On the parameter optimization of Support Vector Machines for binary classification

Journal of Integrative Bioinformatics ◽

10.1515/jib-2012-201 ◽

2012 ◽

Vol 9 (3) ◽

pp. 33-43 ◽

Cited By ~ 30

Author(s):

Paulo Gaspar ◽

Jaime Carbonell ◽

José Luís Oliveira

Keyword(s):

Support Vector Machines ◽

Binary Classification ◽

Classification Performance ◽

Biological Data ◽

Parameters Optimization ◽

Support Vector ◽

Minimal Risk ◽

Class Separation ◽

Vector Machines ◽

Analyse Data

Summary: Classifying biological data is a common task in the biomedical context. Predicting the class of new, unknown information allows researchers to gain insight and make decisions based on the available data. Using classification methods also often implies choosing the best parameters to obtain optimal class separation, and the number of parameters might be large in biological datasets. Support Vector Machines (SVMs) provide a well-established and powerful classification method to analyse data and find the minimal-risk separation between different classes. Finding that separation strongly depends on the available feature set and the tuning of hyper-parameters. Techniques for feature selection and SVM parameter optimization are known to improve classification accuracy, and the literature on them is extensive. In this paper we review the strategies that are used to improve the classification performance of SVMs and perform our own experiments to study the influence of features and hyper-parameters in the optimization process, using several known kernels.

Automated detection of pneumonia cases using deep transfer learning with paediatric chest X-ray images

British Journal of Radiology ◽

10.1259/bjr.20201263 ◽

2021 ◽

pp. 20201263

Author(s):

Mohammad Salehi ◽

Reza Mohammadi ◽

Hamed Ghaffari ◽

Nahid Sadighi ◽

Reza Reiazi

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Area Under Curve ◽

Binary Classification ◽

Classification Performance ◽

Learning Approach ◽

Insufficient Data ◽

X Ray ◽

Chest X Ray

Objective: Pneumonia is a lung infection that causes inflammation of the small air sacs (alveoli) in one or both lungs. Proper and rapid diagnosis of pneumonia at an early stage is imperative for optimal patient care. Currently, chest X-ray is considered the best imaging modality for diagnosing pneumonia. However, the interpretation of chest X-ray images is challenging. To this end, we aimed to use an automated convolutional neural network-based transfer-learning approach to detect pneumonia in paediatric chest radiographs. Methods: An automated convolutional neural network-based transfer-learning approach using four different pre-trained models (i.e. VGG19, DenseNet121, Xception, and ResNet50) was applied to detect pneumonia in chest X-ray images of children (1–5 years). The performance of the proposed models on the testing data set was evaluated using five performance metrics: accuracy, sensitivity/recall, precision, area under curve, and F1 score. Results: All proposed models provide accuracy greater than 83.0% for binary classification. The pre-trained DenseNet121 model provides the highest classification performance, with 86.8% accuracy, followed by the Xception model with an accuracy of 86.0%. The sensitivity of the proposed models was greater than 91.0%. The Xception and DenseNet121 models achieve the highest classification performance, with F1 scores greater than 89.0%. The areas under the receiver operating characteristic curve of the VGG19, Xception, ResNet50, and DenseNet121 models are 0.78, 0.81, 0.81, and 0.86, respectively. Conclusion: Our data showed that the proposed models achieve a high accuracy for binary classification. Transfer learning was used to accelerate training of the proposed models and resolve the problem associated with insufficient data. We hope that these proposed models can help radiologists make a quick diagnosis of pneumonia in radiology departments. Moreover, our proposed models may be useful to detect other chest-related diseases such as novel Coronavirus 2019. Advances in knowledge: We used transfer learning as a machine learning approach to accelerate training of the proposed models and resolve the problem associated with insufficient data. Our proposed models achieved accuracy greater than 83.0% for binary classification.

Matrix Product State–Based Quantum Classifier

Neural Computation ◽

10.1162/neco_a_01202 ◽

2019 ◽

Vol 31 (7) ◽

pp. 1499-1517 ◽

Cited By ~ 2

Author(s):

Amandeep Singh Bhatia ◽

Mandeep Kaur Saggi ◽

Ajay Kumar ◽

Sushma Jain

Keyword(s):

Performance Metrics ◽

Quantum Computer ◽

Binary Classification ◽

Learning Ability ◽

Matrix Product ◽

Product State ◽

Data Set ◽

Matrix Product State ◽

Tensor Network ◽

Network States

Interest in quantum computing has increased significantly. Tensor network theory has become increasingly popular and is widely used to simulate strongly entangled correlated systems. Matrix product state (MPS) is a well-designed class of tensor network states that plays an important role in processing quantum information. In this letter, we show that MPS, as a one-dimensional array of tensors, can be used to classify classical and quantum data. We have performed binary classification of the classical machine learning data set Iris encoded in a quantum state. We have also investigated its performance by considering different parameters on the ibmqx4 quantum computer and proved that MPS circuits can be used to attain better accuracy. Furthermore, the learning ability of an MPS quantum classifier is tested to classify evapotranspiration (ET) for the Patiala meteorological station located in northern Punjab (India), using three years of a historical data set (Agri). We have used different performance metrics of classification to measure its capability. Finally, the results are plotted and the degree of correspondence among values of each sample is shown.

The impact of class imbalance in classification performance metrics based on the binary confusion matrix

Pattern Recognition ◽

10.1016/j.patcog.2019.02.023 ◽

2019 ◽

Vol 91 ◽

pp. 216-231 ◽

Cited By ~ 52

Author(s):

Amalia Luque ◽

Alejandro Carrasco ◽

Alejandro Martín ◽

Ana de las Heras

Keyword(s):

Performance Metrics ◽

Confusion Matrix ◽

Class Imbalance ◽

Classification Performance ◽

The Impact

Utility of MemTrax and Machine Learning Modeling in Classification of Mild Cognitive Impairment

Journal of Alzheimer’s Disease ◽

10.3233/jad-191340 ◽

2020 ◽

Vol 77 (4) ◽

pp. 1545-1558

Author(s):

Michael F. Bergeron ◽

Sara Landset ◽

Xianbo Zhou ◽

Tao Ding ◽

Taghi M. Khoshgoftaar ◽

...

Keyword(s):

Machine Learning ◽

Cognitive Impairment ◽

Mild Cognitive Impairment ◽

Performance Metrics ◽

Early Stage ◽

Characteristic Curve ◽

Classification Performance ◽

Cognitive Screening ◽

Cross Sectional ◽

Machine Learning Classification

Background: The widespread incidence and prevalence of Alzheimer’s disease and mild cognitive impairment (MCI) has prompted an urgent call for research to validate early detection cognitive screening and assessment. Objective: Our primary research aim was to determine if selected MemTrax performance metrics and relevant demographics and health profile characteristics can be effectively utilized in predictive models developed with machine learning to classify cognitive health (normal versus MCI), as would be indicated by the Montreal Cognitive Assessment (MoCA). Methods: We conducted a cross-sectional study on 259 neurology, memory clinic, and internal medicine adult patients recruited from two hospitals in China. Each patient was given the Chinese-language MoCA and self-administered the continuous recognition MemTrax online episodic memory test on the same day. Predictive classification models were built using machine learning with 10-fold cross validation, and model performance was measured using Area Under the Receiver Operating Characteristic Curve (AUC). Models were built using two MemTrax performance metrics (percent correct, response time), along with the eight common demographic and personal history features. Results: Comparing the learners across selected combinations of MoCA scores and thresholds, Naïve Bayes was generally the top-performing learner with an overall classification performance of 0.9093. Further, among the top three learners, MemTrax-based classification performance overall was superior using just the top-ranked four features (0.9119) compared to using all 10 common features (0.8999). Conclusion: MemTrax performance can be effectively utilized in a machine learning classification predictive model screening application for detecting early stage cognitive impairment.
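The evaluation protocol described in this abstract (10-fold cross validation scored by AUC) can be sketched in plain Python. This is an illustrative sketch, not the study's code: `kfold_indices` and `auc` are hypothetical helper names, the folds are contiguous and neither shuffled nor stratified, and the labels and scores are made up. Only the patient count of 259 and the fold count of 10 come from the abstract.

```python
def kfold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n items.

    Simplification: no shuffling or stratification is applied.
    """
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def auc(labels, scores):
    """AUC as the probability that a random positive example is scored
    higher than a random negative one; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# 259 patients split into 10 folds, as in the abstract.
folds = list(kfold_indices(259, 10))
print([len(test) for _, test in folds])

# Made-up labels (1 = MCI, 0 = normal) and classifier scores.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
print(round(auc(example_labels, example_scores), 4))
```

In a full pipeline, a model would be fitted on each `train` index set and `auc` would be computed on the scores it assigns to the matching `test` set, with the ten values then averaged.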

Surrogate regret bounds for generalized classification performance metrics

Machine Learning ◽

10.1007/s10994-016-5591-7 ◽

2016 ◽

Vol 106 (4) ◽

pp. 549-572 ◽

Cited By ~ 4

Author(s):

Wojciech Kotłowski ◽

Krzysztof Dembczyński

Keyword(s):

Performance Metrics ◽

Classification Performance ◽

Regret Bounds

Automatic recognition of self-acknowledged limitations in clinical research literature

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy038 ◽

2018 ◽

Vol 25 (7) ◽

pp. 855-861 ◽

Cited By ~ 4

Author(s):

Halil Kilicoglu ◽

Graciela Rosemblat ◽

Mario Malički ◽

Gerben ter Riet

Keyword(s):

Machine Learning ◽

Clinical Research ◽

Binary Classification ◽

Classification Performance ◽

Research Literature ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Support Vector ◽

Rule Based ◽

Research Transparency

Objective: To automatically recognize self-acknowledged limitations in clinical research publications to support efforts in improving research transparency. Methods: To develop our recognition methods, we used a set of 8431 sentences from 1197 PubMed Central articles. A subset of these sentences was manually annotated for training/testing, and inter-annotator agreement was calculated. We cast the recognition problem as a binary classification task, in which we determine whether a given sentence from a publication discusses self-acknowledged limitations or not. We experimented with three methods: a rule-based approach based on document structure, supervised machine learning, and a semi-supervised method that uses self-training to expand the training set in order to improve classification performance. The machine learning algorithms used were logistic regression (LR) and support vector machines (SVM). Results: Annotators had good agreement in labeling limitation sentences (Krippendorff’s α = 0.781). Of the three methods used, the rule-based method yielded the best performance with 91.5% accuracy (95% CI [90.1-92.9]), while self-training with SVM led to a small improvement over fully supervised learning (89.9%, 95% CI [88.4-91.4] vs 89.6%, 95% CI [88.1-91.1]). Conclusions: The approach presented can be incorporated into the workflows of stakeholders focusing on research transparency to improve reporting of limitations in clinical studies.

A Deep Learning Framework for Coronavirus Disease (COVID-19) Detection in X-Ray Images

10.21203/rs.3.rs-26500/v1 ◽

2020 ◽

Cited By ~ 1

Author(s):

Tayyip Ozcan

Keyword(s):

Transfer Learning ◽

Performance Metrics ◽

Experimental Studies ◽

Large Family ◽

Classification Performance ◽

The Novel ◽

X Ray ◽

Learning Framework ◽

Classification Feature ◽

Novel Coronavirus

Abstract: Coronaviruses, a large family of viruses, cause illness in both humans and animals. The novel coronavirus (COVID-19) emerged in Wuhan in December 2019. The deadly COVID-19 pandemic has spread very rapidly and is currently present in several countries worldwide. The timely detection of patients who have COVID-19 is vitally important. To this end, scientists are working on different detection methods. In this paper, a convolutional neural network (CNN) model aided by grid search (GS) and pre-trained models is proposed to detect COVID-19 in X-ray images. In the proposed method, the GS method is employed to optimize the hyperparameters of the CNN, which directly affect classification performance. Three pre-trained CNN models (GoogleNet, ResNet18 and ResNet50), which can be used for classification, feature extraction and transfer learning purposes, were used for transfer learning in this study. The proposed method was trained using the training and validation subdatasets of the collected dataset, and detailed evaluations are presented according to different performance metrics. According to the experimental studies, the best results were obtained with the GS and ResNet50 aided model.
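The grid search named in this abstract can be sketched generically as an exhaustive loop over every combination of candidate values. This is a minimal sketch under stated assumptions: the abstract does not say which hyperparameters were searched, so `learning_rate` and `batch_size` are hypothetical, and `toy_validation_score` is a made-up stand-in for training the CNN and measuring validation performance.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate score_fn on every combination in param_grid and
    return the best-scoring parameter dict with its score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(params):
    """Hypothetical stand-in for 'train the model, return validation accuracy'.

    The formula is arbitrary and peaks at learning_rate=0.01, batch_size=32.
    """
    return (
        1.0
        - abs(params["learning_rate"] - 0.01) * 10
        - abs(params["batch_size"] - 32) / 1000
    )


# Hypothetical search space; the paper's actual grid is not given in the abstract.
grid = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [16, 32, 64]}
best_params, best_score = grid_search(grid, toy_validation_score)
print(best_params, best_score)
```

The cost grows with the product of the list lengths (nine evaluations here), which matters when each evaluation means training a network.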
