An Enhanced Corpus for Arabic Newspapers Comments

In this paper, we propose an enhanced approach to creating a dedicated corpus of comments from Algerian Arabic newspapers. The approach enhances an existing one by enriching the available corpus and adding an annotation step that follows the Model, Annotate, Train, Test, Evaluate, Revise (MATTER) methodology. The corpus was created by collecting comments from the websites of three well-known Algerian newspapers. Three classifiers (support vector machines, naïve Bayes, and k-nearest neighbors) were used to classify comments into positive and negative classes. To identify the influence of stemming on the results, classification was tested with and without stemming. The results show that stemming does not considerably improve classification, owing to the nature of Algerian comments, which are tied to the Algerian Arabic dialect. These promising results motivate us to improve our approach, particularly in handling non-Arabic sentences, especially dialectal and French ones.

Selecting Features Subsets Based on Support Vector Machine-Recursive Features Elimination and One Dimensional-Naïve Bayes Classifier using Support Vector Machines for Classification of Prostate and Breast Cancer

Procedia Computer Science ◽

10.1016/j.procs.2019.08.238 ◽

2019 ◽

Vol 157 ◽

pp. 450-458 ◽

Cited By ~ 1

Author(s):

Alhadi Bustamam ◽

Anas Bachtiar ◽

Devvi Sarwinda

Keyword(s):

Breast Cancer ◽

Support Vector Machine ◽

Support Vector Machines ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Bayes Classifier ◽

One Dimensional ◽

Vector Machines

A Clinical Decision Support Tool to Detect Invasive Ductal Carcinoma in Histopathological Images Using Support Vector Machines, Naïve-Bayes, and K-Nearest Neighbor Classifiers

Machine Learning and Artificial Intelligence - Frontiers in Artificial Intelligence and Applications ◽

10.3233/faia200765 ◽

2020 ◽

Author(s):

Kyra Mikaela M. Lopez ◽

Ma. Sheila A. Magboo

Keyword(s):

Support Vector Machines ◽

Invasive Ductal Carcinoma ◽

Naive Bayes ◽

Ductal Carcinoma ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbors ◽

Support Tool ◽

Vector Machines

This study describes a model that applies image processing and traditional machine learning techniques, specifically Support Vector Machines, Naïve Bayes, and k-Nearest Neighbors, to identify whether a given breast histopathological image shows Invasive Ductal Carcinoma (IDC). The dataset consisted of 54,811 breast cancer image patches of size 50 × 50 px, of which 39,148 were IDC negative and 15,663 were IDC positive. Features were extracted using Oriented FAST and Rotated BRIEF (ORB) descriptors. Feature scaling was performed using min-max normalization, while k-means clustering on the ORB descriptors was used to generate the visual codebook. Automatic hyperparameter tuning using grid search cross-validation was implemented, although the tool also accepts user-supplied hyperparameter values for the SVM, Naïve Bayes, and k-NN models should the user want to experiment. In addition to accuracy, the AUPRC and MCC metrics were computed to address the dataset imbalance. The results showed that SVM had the best overall performance, with accuracy = 0.7490, AUPRC = 0.5536, and MCC = 0.2924.
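To illustrate why accuracy alone can mislead on a dataset with this class balance, the following minimal sketch computes accuracy and the Matthews correlation coefficient (MCC) from a 2 × 2 confusion matrix. The class counts are taken from the abstract; the all-negative "classifier" is a hypothetical baseline introduced here for illustration and was not evaluated in the study, and returning 0 when the MCC denominator vanishes is an assumed convention.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (assumed convention),
    which happens when a whole row or column of the matrix is empty.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Class counts reported in the abstract.
NEGATIVES = 39_148
POSITIVES = 15_663

# Hypothetical baseline: label every patch as IDC negative.
baseline = dict(tp=0, tn=NEGATIVES, fp=0, fn=POSITIVES)
baseline_accuracy = accuracy(**baseline)
baseline_mcc = mcc(**baseline)

print(f"baseline accuracy = {baseline_accuracy:.4f}")
print(f"baseline MCC      = {baseline_mcc:.4f}")
```

Under these assumptions the majority-class baseline already reaches an accuracy of roughly 0.714 while its MCC is 0, which is one way to read the reported accuracy of 0.7490 alongside the much lower MCC of 0.2924.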

Evaluating features for rhetorical structure classification in scientific abstracts (Avaliando atributos para a classificação de estrutura retórica em resumos científicos)

Linguamática ◽

10.21814/lm.11.1.273 ◽

2019 ◽

Vol 11 (1) ◽

pp. 41-53

Author(s):

Alessandra Harumi Iriguti ◽

Valéria Delisandra Feltrim

Keyword(s):

Support Vector Machines ◽

Decision Trees ◽

Random Fields ◽

Conditional Random Fields ◽

Naive Bayes ◽

Nearest Neighbors ◽

Support Vector ◽

Word Embeddings ◽

K Nearest Neighbors ◽

Vector Machines

Rhetorical structure classification is an NLP task that seeks to identify the rhetorical components of a discourse and their relationships. In this work, the goal was to automatically identify sentence-level categories that make up the rhetorical structure of scientific abstracts. Specifically, the objective was to evaluate the impact of different feature sets on the implementation of rhetorical classifiers for scientific abstracts written in Portuguese. To this end, the following were used: surface features (extracted as TF-IDF values and selected with the chi-squared test), morphosyntactic features (implemented by the AZPort classifier), and features extracted from word embedding models (Word2Vec, Wang2Vec, and GloVe, all pre-trained). These feature sets, as well as their combinations, were used to train classifiers with the following supervised learning algorithms: Support Vector Machines, Naive Bayes, K-Nearest Neighbors, Decision Trees, and Conditional Random Fields (CRF). The classifiers were evaluated by cross-validation on three corpora composed of abstracts of theses and dissertations. The best result, an F1 of 94%, was obtained by the CRF classifier with the following feature combinations: (i) Wang2Vec Skip-gram of dimension 100 with the AZPort features; (ii) Wang2Vec Skip-gram and GloVe of dimension 300 with the AZPort features; (iii) TF-IDF, AZPort, and embeddings extracted with the Wang2Vec Skip-gram models of dimensions 100 and 300 and GloVe of dimension 300. From the results obtained, it is concluded that the features from the AZPort classifier were fundamental to the good performance of the CRF classifier, while the combination with word embeddings proved useful for improving the results.
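As a rough illustration of chi-squared feature selection for text, the sketch below scores each term against a target class using a 2 × 2 contingency table of document counts. It is a simplification under stated assumptions: it uses binary term presence rather than TF-IDF values, and the toy sentences, the labels, and the function names `chi_squared` and `rank_terms` are hypothetical and do not come from the paper.

```python
def chi_squared(a, b, c, d):
    """Chi-squared statistic for a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # term or class is constant, so it carries no signal
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(docs, labels, target):
    """Rank vocabulary terms by chi-squared score for the target class."""
    vocab = sorted({term for doc in docs for term in doc})
    pairs = list(zip(docs, labels))
    scores = {}
    for term in vocab:
        a = sum(1 for doc, y in pairs if term in doc and y == target)
        b = sum(1 for doc, y in pairs if term in doc and y != target)
        c = sum(1 for doc, y in pairs if term not in doc and y == target)
        d = sum(1 for doc, y in pairs if term not in doc and y != target)
        scores[term] = chi_squared(a, b, c, d)
    # Highest score first; ties keep alphabetical order (stable sort).
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy corpus: each sentence is a set of tokens.
docs = [
    {"objetivo", "avaliar"},
    {"objetivo", "trabalho"},
    {"resultado", "trabalho"},
    {"resultado", "obtido"},
]
labels = ["purpose", "purpose", "result", "result"]

ranking = rank_terms(docs, labels, target="purpose")
for term, score in ranking:
    print(f"{term:10s} {score:.3f}")
```

In this toy setup, terms that occur only in one class score highest, while a term spread evenly across both classes scores zero and would be the first to be dropped when keeping the top-k features.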

Comparison of Multinomial Naïve Bayes with K-Nearest Neighbors, Support Vector Machine and Random Forest for Classification of “Network Attacks” Document

2019 Fourth International Conference on Informatics and Computing (ICIC) ◽

10.1109/icic47613.2019.8985919 ◽

2019 ◽

Author(s):

Bambang Harjito ◽

Ardhi Wijayanto ◽

Kuni Nur Aini ◽

Budi Murtiyas

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Naive Bayes ◽

Nearest Neighbors ◽

Naïve Bayes ◽

Support Vector ◽

K Nearest Neighbors ◽

Network Attacks

Persian Handwritten Number Recognition Using Adapted Framing Feature and Support Vector Machines

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026816500048 ◽

2016 ◽

Vol 15 (01) ◽

pp. 1650004 ◽

Cited By ~ 3

Author(s):

Hedieh Sajedi ◽

Mehran Bahador

Keyword(s):

Support Vector Machines ◽

Recognition Rate ◽

Nearest Neighbors ◽

Polynomial Kernel ◽

Support Vector ◽

K Nearest Neighbors ◽

New Approach ◽

Number Recognition ◽

Vector Machines

In this paper, a new approach for the segmentation and recognition of Persian handwritten numbers is presented. The method combines the framing feature technique with the outer profile feature; we call the combination the adapted framing feature. In the proposed approach, segmentation of numbers into digits is carried out automatically. In the classification stage, Support Vector Machines (SVM) and k-Nearest Neighbors (k-NN) are used. Experiments were conducted on the IFHCDB database, consisting of 17,740 numeral images, and the HODA database, consisting of 102,352 numeral images. At the isolated-digit level, a recognition rate of 99.27% was achieved on IFHCDB using an SVM with a polynomial kernel, and a recognition rate of 99.07% was achieved on HODA, also using an SVM with a polynomial kernel. The experiments show that the proposed method yields higher accuracy than previous work.
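For readers unfamiliar with the polynomial kernel, the sketch below shows its usual form, K(x, y) = (gamma · x·y + coef0)^degree, and checks it against an explicit degree-2 feature map for two-dimensional inputs. The abstract does not report the kernel degree or other parameters, so the values used here, as well as the function names, are assumptions for illustration only.

```python
import math


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (gamma * dot + coef0) ** degree


def explicit_degree2_map(x):
    """Explicit feature map for 2-D input matching the kernel above
    with degree=2, gamma=1, coef0=1."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)


# Hypothetical 2-D feature vectors.
x = (1.0, 2.0)
y = (3.0, 4.0)

k_value = polynomial_kernel(x, y)
explicit_value = sum(
    a * b for a, b in zip(explicit_degree2_map(x), explicit_degree2_map(y))
)

print(k_value, explicit_value)
```

The two values agree up to floating-point rounding, which is the point of the kernel trick: the SVM works as if the inputs had been mapped to the higher-dimensional space, but only the inexpensive kernel evaluation is ever computed.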

Comparison of Linear Discriminant Analysis, Support Vector Machines and Naive Bayes Methods in the Classification of Neonatal Hyperspectral Signatures

2021 29th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu53274.2021.9477861 ◽

2021 ◽

Author(s):

Mucahit Cihan ◽

Murat Ceylan

Keyword(s):

Support Vector Machines ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Naive Bayes ◽

Support Vector ◽

Linear Discriminant ◽

Bayes Methods ◽

Vector Machines ◽

Hyperspectral Signatures

Epileptic Seizure Detection from EEG Signals Using Best Feature Subsets Based on Estimation of Mutual Information for Support Vector Machines and Naïve Bayes Classifiers

Advances in Systems, Control and Automation - Lecture Notes in Electrical Engineering ◽

10.1007/978-981-10-4762-6_56 ◽

2017 ◽

pp. 585-593

Author(s):

A. Sharmila ◽

P. Geethanjali

Keyword(s):

Support Vector Machines ◽

Mutual Information ◽

Epileptic Seizure ◽

Naive Bayes ◽

Seizure Detection ◽

Naïve Bayes ◽

Support Vector ◽

Eeg Signals ◽

Epileptic Seizure Detection ◽

Vector Machines

Machine learning techniques used for text mining (Técnicas de aprendizaje de máquina utilizadas para la minería de texto)

Investigación Bibliotecológica Archivonomía Bibliotecología e Información ◽

10.22201/iibi.0187358xp.2017.71.57812 ◽

2017 ◽

Vol 31 (71) ◽

pp. 103

Author(s):

Ángel Freddy Godoy Viera

Keyword(s):

Support Vector Machine ◽

Naive Bayes ◽

Nearest Neighbors ◽

Naïve Bayes ◽

Support Vector ◽

K Nearest Neighbors ◽

Self Organizing Maps ◽

Self Organizing

Machine learning techniques continue to be widely used for text mining. For this article, a literature review was carried out of scientific journals published in 2010 and 2011, with the aim of identifying the main forms of machine learning employed for text mining. Descriptive statistics were used to organize, summarize, and analyze the data found, and a summarized description of the main techniques found is presented. In the articles analyzed, 13 techniques applied to text mining were found, and 83% of the articles mentioned 1 to 3 machine learning techniques. The main techniques used by the authors of the articles studied were support vector machine (SVM), k-means (k-m), k-nearest neighbors (k-NN), naive Bayes (NB), and self-organizing maps (SOM). The pairs that appear most frequently are SVM/NB, SVM/k-NN, and SVM/decision tree.
