Block Classification of a Web Page by Using a Combination of Multiple Classifiers

Motivated by applying Text Categorization to classification of Web search results, this paper describes an extensive experimental study of the impact of bag-of-words document representations on the performance of five major classifiers: Naïve Bayes, SVM, Voted Perceptron, kNN and C4.5. The texts, representing short Web-page descriptions sorted into a large hierarchy of topics, are taken from the dmoz Open Directory Web-page ontology, and classifiers are trained to automatically determine the topics which may be relevant to a previously unseen Web-page. Different transformations of input data (stemming, normalization, logtf and idf), together with dimensionality reduction, are found to have a statistically significant improving or degrading effect on classification performance measured by classical metrics: accuracy, precision, recall, F1 and F2. The emphasis of the study is not on determining the best document representation which corresponds to each classifier, but rather on describing the effects of every individual transformation on classification, together with their mutual relationships.
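
The transformations named in this abstract can be illustrated with a minimal sketch. It assumes common textbook definitions, logtf as log(1 + tf) and idf as log(N / df), with normalization to unit Euclidean length; the paper may define them differently. The function name `bow_vectors` and its parameters are hypothetical, and stemming and dimensionality reduction are left out.

```python
import math
from collections import Counter


def bow_vectors(docs, use_logtf=True, use_idf=True, normalize=True):
    """Build bag-of-words vectors with optional logtf, idf and normalization.

    Assumed definitions (not taken from the paper):
      logtf: log(1 + term frequency)
      idf:   log(N / document frequency)
    """
    # Naive whitespace tokenization; a real pipeline would also stem here.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = []
        for term in vocab:
            weight = float(tf[term])
            if use_logtf:
                weight = math.log(1.0 + weight)
            if use_idf:
                weight *= math.log(n_docs / df[term])
            vec.append(weight)
        if normalize:
            norm = math.sqrt(sum(x * x for x in vec))
            if norm > 0:
                vec = [x / norm for x in vec]
        vectors.append(vec)
    return vocab, vectors


vocab, vectors = bow_vectors(["web page web", "page topic"])
```

Switching each flag on and off independently mirrors the study's focus on the effect of individual transformations rather than on one best representation.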

Classification of CT images in COVID-19 and Non-COVID-19 using CNN to extract features and multiple classifiers

10.5753/ercemapi.2020.11493 ◽

2020 ◽

Author(s):

Edelson Carvalho ◽

Edson Carvalho

Keyword(s):

CT Images ◽

Multiple Classifiers

COVID-19 is a respiratory disease that has already infected more than 12.3 million people worldwide and is responsible for more than 556,300 deaths. Early diagnosis of COVID-19 is essential for the cure and control of the disease. Computed tomography (CT) has shown effective results in evaluating patients with suspected COVID-19 infection. CT analysis requires the effort of a specialist, which can lead to diagnostic errors. Computer-aided diagnosis systems can minimize the problems arising from specialists' analysis of CT scans. This article presents a methodology for diagnosing COVID-19 using a CNN for feature extraction and multiple classifiers on CT images. The methodology achieved an accuracy of 99.79%, recall of 99.79%, precision of 99.80%, F-score of 0.997, AUC of 0.997 and a kappa index of 0.995. The results show that the proposed methodology can be used as a diagnostic aid system.
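
For reference, most of the reported metrics can be computed from a binary confusion matrix. The sketch below uses the standard definitions of accuracy, precision, recall, F-beta and Cohen's kappa; the function name `binary_metrics` and the example counts are hypothetical and are not the paper's data. AUC is omitted because it needs ranked scores rather than counts.

```python
def binary_metrics(tp, fp, fn, tn, beta=1.0):
    """Standard metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F-beta weights recall beta times as heavily as precision.
    b2 = beta * beta
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0

    # Cohen's kappa: agreement corrected for chance agreement p_e.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_beta": f_beta,
        "kappa": kappa,
    }


# Illustrative counts only.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```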

Ensemble Deep Learning for Multilabel Binary Classification of User-Generated Content

Algorithms ◽

10.3390/a13040083 ◽

2020 ◽

Vol 13 (4) ◽

pp. 83 ◽

Cited By ~ 5

Author(s):

Giannis Haralabopoulos ◽

Ioannis Anagnostopoulos ◽

Derek McAuley

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Ensemble Learning ◽

Affective Computing ◽

Binary Classification ◽

User Generated Content ◽

Emotional Information ◽

Multiple Classifiers ◽

Different Levels

Sentiment analysis usually refers to the analysis of human-generated content via a polarity filter. Affective computing deals with the exact emotions conveyed through information. Emotional information most frequently cannot be accurately described by a single emotion class. Multilabel classifiers can categorize human-generated content into multiple emotional classes. Ensemble learning can improve the statistical, computational and representation aspects of such classifiers. We present a baseline stacked ensemble and propose a weighted ensemble. Our proposed weighted ensemble can use multiple classifiers to improve classification results without hyperparameter tuning or data overfitting. We evaluate our ensemble models with two datasets. The first dataset is from SemEval-2018 Task 1 and contains almost 7000 Tweets, labeled with 11 sentiment classes. The second dataset is the Toxic Comment Dataset with more than 150,000 comments, labeled with six different levels of abuse or harassment. Our results suggest that ensemble learning improves classification results by 1.5% to 5.4%.
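
The abstract does not specify how the weighted ensemble combines its members. As one plausible illustration, not the authors' method, the sketch below takes a weighted average of per-label probabilities from several multilabel classifiers and thresholds the result. The function name `weighted_ensemble`, the weights and the 0.5 threshold are all assumptions.

```python
def weighted_ensemble(prob_sets, weights, threshold=0.5):
    """Combine multilabel probability outputs by weighted averaging.

    prob_sets: one entry per classifier, each a list of samples,
               each sample a list of per-label probabilities.
    weights:   one non-negative weight per classifier (assumed here to
               come from, e.g., validation performance).
    Returns a binary label matrix (samples x labels).
    """
    total_weight = sum(weights)
    n_samples = len(prob_sets[0])
    n_labels = len(prob_sets[0][0])

    predictions = []
    for i in range(n_samples):
        row = []
        for j in range(n_labels):
            score = sum(w * probs[i][j] for w, probs in zip(weights, prob_sets))
            score /= total_weight
            # Each label is decided independently (multilabel, binary per label).
            row.append(1 if score >= threshold else 0)
        predictions.append(row)
    return predictions


# Two hypothetical classifiers, one sample, two labels.
clf_a = [[0.9, 0.2]]
clf_b = [[0.4, 0.7]]
labels = weighted_ensemble([clf_a, clf_b], weights=[3, 1])
```

Because every label is thresholded on its own, a sample can receive several labels at once, which matches the multilabel binary setting the abstract describes.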
