Multi-Level Comparison of Machine Learning Classifiers and Their Performance Metrics

Machine learning classification algorithms are widely used to predict and classify molecular properties such as toxicity or biological activity. Predicting whether a molecule is toxic is important because testing on living animals has both ethical and cost drawbacks. The quality of classification models can be assessed with several performance parameters, which often give conflicting results. In this study, we performed a multi-level comparison using different performance metrics and machine learning classification methods. Well-established and standardized protocols for the machine learning tasks were used in each case. The comparison was applied to three datasets (acute and aquatic toxicities), and the robust, yet sensitive, sum of ranking differences (SRD) and analysis of variance (ANOVA) were applied for evaluation. The effect of dataset composition (balanced vs. imbalanced) and of 2-class vs. multiclass classification scenarios was also studied. Most of the performance metrics are sensitive to dataset composition, especially in 2-class classification problems. The optimal machine learning algorithm also depends significantly on the composition of the dataset.
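The sensitivity of performance metrics to dataset composition can be illustrated with a minimal sketch. The three metrics below (accuracy, balanced accuracy, and the Matthews correlation coefficient) are chosen here for illustration and are not necessarily the ones compared in the study; the labels and the trivial "always non-toxic" classifier are hypothetical.

```python
from math import sqrt

def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return (sens + spec) / 2

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical data: 1 = toxic, 0 = non-toxic.
imbalanced_true = [1] * 10 + [0] * 90
balanced_true = [1] * 50 + [0] * 50
always_zero = [0] * 100  # trivial classifier that never predicts "toxic"

for name, y in (("imbalanced", imbalanced_true), ("balanced", balanced_true)):
    print(name, accuracy(y, always_zero),
          balanced_accuracy(y, always_zero), mcc(y, always_zero))
```

On the imbalanced set the trivial classifier scores high on accuracy but no better than chance on the other two metrics, which is one way that metrics can give conflicting results.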

Multi-label classification approach for quranic verses labeling

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v24.i1.pp484-490 ◽

2021 ◽

Vol 24 (1) ◽

pp. 484-490

Author(s):

Abdullahi Adeleke ◽

Noor Azah Samsudin ◽

Mohd Hisyam Abdul Rahim ◽

Shamsul Kamal Ahmad Khalid ◽

Riswan Efendi

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Research Work ◽

Support Vector ◽

Classification Problems ◽

K Nearest Neighbors ◽

Training Systems ◽

Learning Tasks ◽

Binary Relevance ◽

Problem Transformation

Machine learning involves training systems to make decisions without being explicitly programmed. An important machine learning task is classification, in which machines are trained to predict labels from a predefined set. Classification is broadly categorized into three distinct groups: single-label (SL), multi-class, and multi-label (ML) classification. This research work presents an application of a multi-label classification (MLC) technique to automate the labeling of Quranic verses. MLC has been gaining attention in recent years because of the increasing number of works on real-world classification problems involving multi-label data. In traditional classification problems, patterns are associated with a single label from a set of disjoint labels. In MLC, however, an instance of data is associated with a set of labels. In this paper, three standard MLC methods, namely the binary relevance (BR), classifier chain (CC), and label powerset (LP) algorithms, are implemented with four baseline classifiers: support vector machine (SVM), naïve Bayes (NB), k-nearest neighbors (k-NN), and J48. The research methodology adopts the multi-label problem transformation (PT) approach. The results are validated using six conventional performance metrics: Hamming loss, accuracy, one-error, micro-F1, macro-F1, and average precision. The classifiers achieved accuracy above 70%. Overall, SVM achieved the best results with the CC and LP algorithms.
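As a rough illustration of the problem transformation approach, the sketch below shows how binary relevance and label powerset turn a multi-label target into single-label targets, and how Hamming loss is computed. It is a minimal sketch, not the paper's implementation: the function names, the three topic labels, and the example label sets are all hypothetical, and the classifier chain method is omitted.

```python
def binary_relevance_targets(Y, labels):
    """BR: build one independent binary target vector per label."""
    return {lab: [1 if lab in y else 0 for y in Y] for lab in labels}

def label_powerset_targets(Y):
    """LP: map each distinct label set to one class of a multi-class problem."""
    classes = {}
    targets = []
    for y in Y:
        key = frozenset(y)
        if key not in classes:
            classes[key] = len(classes)  # assign the next unused class id
        targets.append(classes[key])
    return targets, classes

def hamming_loss(Y_true, Y_pred, labels):
    """Fraction of (instance, label) pairs that are predicted wrongly."""
    wrong = sum(len(set(t) ^ set(p)) for t, p in zip(Y_true, Y_pred))
    return wrong / (len(Y_true) * len(labels))

# Hypothetical label space and label sets for four instances.
labels = ["faith", "prayer", "charity"]
Y = [{"faith"}, {"faith", "prayer"}, {"charity"}, {"faith", "prayer"}]
Y_pred = [{"faith"}, {"faith"}, {"charity"}, {"faith", "prayer", "charity"}]

br = binary_relevance_targets(Y, labels)
lp_targets, lp_classes = label_powerset_targets(Y)
print(br)
print(lp_targets)
print(hamming_loss(Y, Y_pred, labels))
```

Each BR target vector, or the single LP target vector, can then be handed to any ordinary single-label classifier such as the baseline classifiers named in the abstract.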

Feasibility study of a method for identification and classification of magnesium and aluminum with ME-XRT

Journal of Instrumentation ◽

10.1088/1748-0221/16/11/p11041 ◽

2021 ◽

Vol 16 (11) ◽

pp. P11041

Author(s):

Y. Yeyu ◽

J. Wenbao ◽

H. Daqian ◽

S. Aiyun ◽

C. Can ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Photon Counting ◽

Scrap Metal ◽

X Ray ◽

Machine Learning Classification ◽

Fine Grained ◽

Photon Counting Detector ◽

Metal Recycling

The identification of magnesium and aluminum in scrap metal recycling has long been difficult. In this paper, a material identification method based on multi-energy X-ray transmission (ME-XRT), a photon counting detector (PCD), and a machine learning algorithm is proposed and used to identify and classify magnesium and aluminum. The method includes three main steps: using the PCD to obtain X-ray attenuation images in five energy bins, feature extraction, and machine learning classification. The performance of several machine learning models was compared on the fine-grained classification task. The prediction results show that the best recognition rates achieved for aluminum and magnesium are 96.43% and 98.81%, respectively.
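One plausible form of the feature extraction step can be sketched as follows. The abstract does not state which features were used, so this is an assumption: it applies the Beer-Lambert law to photon counts in five energy bins and takes attenuation ratios between bins, which do not depend on thickness for a homogeneous sample. The attenuation coefficients and function names are invented for illustration and are not measured values for magnesium or aluminum.

```python
from math import exp, log

def log_attenuation(counts, open_beam_counts):
    """Per-bin attenuation -ln(I / I0), assuming Beer-Lambert transmission."""
    return [-log(i / i0) for i, i0 in zip(counts, open_beam_counts)]

def ratio_features(attenuation):
    """Ratio of each bin to the last bin; thickness cancels out of the ratio."""
    ref = attenuation[-1]
    return [a / ref for a in attenuation[:-1]]

def simulate_counts(mu_per_bin, thickness, i0=10000.0):
    """Noise-free transmitted counts I = I0 * exp(-mu * t) for each energy bin."""
    return [i0 * exp(-mu * thickness) for mu in mu_per_bin]

# Hypothetical attenuation coefficients (1/cm) for five energy bins.
mu_material = [0.90, 0.60, 0.45, 0.38, 0.33]
open_beam = [10000.0] * 5

thin = ratio_features(log_attenuation(simulate_counts(mu_material, 0.5), open_beam))
thick = ratio_features(log_attenuation(simulate_counts(mu_material, 2.0), open_beam))
print(thin)
print(thick)
```

In this noise-free setting the two feature vectors agree up to rounding, so a classifier trained on them would respond to the material rather than to sample thickness. Real detector data would add counting noise and spectral effects that this sketch ignores.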

Machine Learning Classification of Spinal Lesions: Compared Accuracy of Texture Parameters Extracted by Different Software

10.1055/s-0039-1692578 ◽

2019 ◽

Author(s):

V. Chianca ◽

D. Albano ◽

R. Cuocolo ◽

C. Messina ◽

S. Gitto ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Classification ◽

Spinal Lesions ◽

Texture Parameters

Machine Learning Classification of Low-grade and High-grade Chondrosarcomas Based on MRI-based Texture Analysis

10.1055/s-0039-1692575 ◽

2019 ◽

Author(s):

S. Gitto ◽

D. Albano ◽

V. Chianca ◽

R. Cuocolo ◽

L. Ugga ◽

...

Keyword(s):

Machine Learning ◽

Texture Analysis ◽

Low Grade ◽

High Grade ◽

Machine Learning Classification

Multiclass machine learning classification of functional brain images for Parkinson's disease stage prediction

Statistical Analysis and Data Mining: The ASA Data Science Journal ◽

10.1002/sam.11480 ◽

2020 ◽

Vol 13 (5) ◽

pp. 508-523 ◽

Cited By ~ 1

Author(s):

Guan‐Hua Huang ◽

Chih‐Hsuan Lin ◽

Yu‐Ren Cai ◽

Tai‐Been Chen ◽

Shih‐Yen Hsu ◽

...

Keyword(s):

Machine Learning ◽

Parkinson's Disease ◽

Disease Stage ◽

Brain Images ◽

Functional Brain ◽

Machine Learning Classification

Machine learning algorithm improved automated droplet classification of ddPCR for detection of BRAF V600E in paraffin-embedded samples

Scientific Reports ◽

10.1038/s41598-021-92014-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Gabriel A. Colozza-Gama ◽

Fabiano Callegari ◽

Nikola Bešič ◽

Ana C. de J. Paviza ◽

Janete M. Cerutti

Keyword(s):

Machine Learning ◽

Sanger Sequencing ◽

Learning Algorithm ◽

Absolute Quantification ◽

BRAF V600E Mutation ◽

BRAF V600E ◽

Driver Genes ◽

Quantitative Classification ◽

Cancer Driver

Somatic mutations in cancer driver genes can inform diagnosis, prognosis and treatment decisions. Formalin-fixed paraffin-embedded (FFPE) specimens are the main source of DNA for somatic mutation detection. To overcome the constraints of DNA isolated from FFPE, we compared pyrosequencing and ddPCR analysis for absolute quantification of the BRAF V600E mutation in DNA extracted from FFPE specimens and compared the results to the qualitative detection information obtained by Sanger sequencing. Sanger sequencing was able to detect the BRAF V600E mutation only when it was present in more than 15% of total alleles. Although the sensitivity of ddPCR is higher than that observed for Sanger sequencing, it was less consistent than pyrosequencing, likely due to droplet classification bias of FFPE-derived DNA. To address the droplet allocation bias in ddPCR analysis, we compared different algorithms for automated droplet classification and then correlated these findings with those obtained from pyrosequencing. By including non-classifiable droplets (rain) in the ddPCR analysis, it was possible to obtain better qualitative and quantitative classification of droplets than when rain droplets were excluded, taking the pyrosequencing results as reference. Notably, only the machine learning k-NN algorithm was able to automatically classify the samples, surpassing manual classification based on no-template controls, which shows promise for clinical practice.
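The idea of k-NN droplet classification can be sketched in a few lines. This is a minimal, generic k-NN under stated assumptions, not the software used in the study: droplets are represented by hypothetical two-channel fluorescence amplitudes, the cluster names and values are invented, and `knn_predict` is an illustrative helper.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train_points)),
                     key=lambda i: dist(train_points[i], query))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical reference droplets: (channel 1 amplitude, channel 2 amplitude).
train_points = [
    (1000, 1000), (1100, 900), (900, 1050),    # negative droplets
    (6000, 1000), (6200, 1100), (5900, 950),   # mutant-positive droplets
    (1000, 5000), (1100, 5200), (950, 4900),   # wild-type-positive droplets
]
train_labels = ["negative"] * 3 + ["mutant"] * 3 + ["wild-type"] * 3

# A hypothetical "rain" droplet with intermediate amplitude in channel 1.
rain_droplet = (4500, 1100)
print(knn_predict(train_points, train_labels, rain_droplet))
```

Because every droplet is assigned to the class of its nearest neighbors, intermediate-amplitude rain droplets receive a label instead of being discarded by a fixed threshold.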

Handcrafted MRI radiomics and machine learning: Classification of indeterminate solid adrenal lesions

Magnetic Resonance Imaging ◽

10.1016/j.mri.2021.03.009 ◽

2021 ◽

Vol 79 ◽

pp. 52-58

Author(s):

Arnaldo Stanzione ◽

Renato Cuocolo ◽

Francesco Verde ◽

Roberta Galatola ◽

Valeria Romeo ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Classification

Multi-Class Assessment Based on Random Forests

Education Sciences ◽

10.3390/educsci11030092 ◽

2021 ◽

Vol 11 (3) ◽

pp. 92

Author(s):

Mehdi Berriri ◽

Sofiane Djema ◽

Gaëtan Rey ◽

Christel Dartigues-Pallez

Keyword(s):

Higher Education ◽

Machine Learning ◽

Random Forests ◽

Learning Algorithm ◽

Teaching Staff ◽

Machine Learning Algorithm ◽

Process Data ◽

Training Courses ◽

Education Courses

Today, many students move towards higher education courses that do not suit them and end up failing. The purpose of this study is to give counselors better knowledge so that they can offer future students courses that match their profile. The second objective is to allow the teaching staff to propose training courses adapted to students by anticipating their possible difficulties. This is possible thanks to a machine learning algorithm called Random Forest, which classifies the students according to their results. We had to process data, generate models using our algorithm, and combine the results obtained to reach a better final prediction. We tested our method on different use cases, from two classes to five classes. These sets of classes represent different intervals of an average that ranges from 0 to 20. An accuracy of 75% was achieved with a set of five classes and up to 85% for sets of two and three classes.
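The mapping from an average on the 0 to 20 scale to a class can be sketched as below. The abstract does not give the interval boundaries, so equal-width intervals are assumed here purely for illustration, and `grade_to_class` is a hypothetical helper rather than part of the authors' method.

```python
def grade_to_class(average, n_classes, max_grade=20.0):
    """Map an average in [0, max_grade] to a class index in 0..n_classes-1.

    Assumes equal-width intervals; the top grade falls in the last class.
    """
    if not 0 <= average <= max_grade:
        raise ValueError("average must lie between 0 and max_grade")
    width = max_grade / n_classes
    return min(int(average // width), n_classes - 1)

# The same hypothetical averages under the two-class and five-class settings.
averages = [3.0, 7.5, 10.0, 14.0, 20.0]
print([grade_to_class(a, 2) for a in averages])
print([grade_to_class(a, 5) for a in averages])
```

The class indices produced this way would serve as targets for a classifier such as a random forest; more classes give finer-grained targets, which is consistent with the lower accuracy reported for five classes.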
