Implementation of Classification Algorithms in Neo4j using IPL data

2021 ◽  
Vol 10 (11) ◽  
pp. 25431-25441
Author(s):  
Surajit Medhi ◽  
Hemanta K. Baruah

The main objective of this paper is to implement classification algorithms in the Neo4j graph database using the Cypher query language. To do so, we use the Indian Premier League (IPL) dataset to predict the winner of each match from a small set of features. The IPL is the most popular T20 cricket league in the world. The prediction models are based on the city where the match was played, the winner of the toss, and the toss decision. We implement the Naïve Bayes and K-Nearest Neighbors (KNN) classification algorithms in Cypher. Classifiers have been used to predict the outcomes of games such as football, volleyball, and cricket using Python and R; in this paper we use Cypher instead. We also compare and analyze the results given by the Naïve Bayes and KNN algorithms in predicting match winners.
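
The abstract above predicts a match winner from three categorical features (city, toss winner, toss decision). As a rough illustration of how Naïve Bayes handles such features, here is a minimal Python sketch with add-one (Laplace) smoothing. It is not the authors' Cypher implementation: the rows in `MATCHES`, the team abbreviations, and the function names `train` and `predict` are all hypothetical, and the smoothing choice is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (city, toss_winner, toss_decision) -> match winner.
# These rows are illustrative only and are not taken from the IPL dataset.
MATCHES = [
    (("Mumbai", "MI", "field"), "MI"),
    (("Mumbai", "CSK", "bat"), "MI"),
    (("Chennai", "CSK", "bat"), "CSK"),
    (("Chennai", "MI", "field"), "CSK"),
    (("Mumbai", "MI", "bat"), "MI"),
    (("Chennai", "CSK", "field"), "CSK"),
]


def train(rows):
    """Count class frequencies and per-feature value frequencies."""
    class_counts = Counter(label for _, label in rows)
    # value_counts[(feature_index, label)][value] -> count
    value_counts = defaultdict(Counter)
    # Distinct values seen for each feature, used for Laplace smoothing.
    vocab = defaultdict(set)
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, features):
    """Return the label with the highest smoothed posterior score."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(label)
        for i, value in enumerate(features):
            # Add-one smoothing avoids zero probabilities for unseen values.
            score *= (value_counts[(i, label)][value] + 1) / (n + len(vocab[i]))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(MATCHES)
print(predict(model, ("Mumbai", "MI", "field")))
```

The classifier multiplies the class prior by one smoothed conditional probability per feature, which is the independence assumption that gives Naïve Bayes its name. On a real dataset, summing log-probabilities would be the safer choice to avoid floating-point underflow.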

2021 ◽  
Author(s):  
Leonardo Dias Martins ◽  
Fabíola Pantoja Oliveira Araújo

Every day, a large amount of data circulates on the Internet, producing a great deal of information in the form of images, videos, and texts. It is therefore necessary to analyze and extract this information automatically. This work presents a case study that applies text mining to extract emotional and sentiment profiles from user comments on the game Last Day of June, and reports the results of the sentiment analysis. Three classification algorithms, Naive Bayes, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), were used to predict the class of each element according to the emotions or sentiments identified in the comments. SVM with a radial kernel achieved the best accuracy at 79%, followed by KNN with 3 nearest neighbors at 75% and Naive Bayes at 62%.
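
To make the "KNN with 3 nearest neighbors" setting concrete, the following is a minimal Python sketch of a k-nearest-neighbors text classifier. It assumes a plain bag-of-words representation and cosine similarity, neither of which is stated in the abstract, and the comments in `TRAIN` are invented for illustration rather than drawn from the study's data.

```python
import math
from collections import Counter

# Hypothetical labelled comments; invented for illustration only.
TRAIN = [
    ("beautiful touching story", "positive"),
    ("loved the art and music", "positive"),
    ("touching and beautiful music", "positive"),
    ("boring and slow gameplay", "negative"),
    ("slow boring story", "negative"),
    ("clunky controls and boring", "negative"),
]


def vectorize(text):
    """Turn a comment into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_predict(text, train=TRAIN, k=3):
    """Majority vote among the k training comments most similar to text."""
    query = vectorize(text)
    ranked = sorted(
        train, key=lambda item: cosine(query, vectorize(item[0])), reverse=True
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


print(knn_predict("beautiful music and story"))
```

An odd `k` such as 3 avoids tied votes when there are two classes, which is one common reason for choosing it.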


Author(s):  
Ángel Freddy Godoy Viera

Machine learning techniques continue to be widely used for text mining. For this article, a literature review was conducted of scientific journals published in 2010 and 2011, with the aim of identifying the main forms of machine learning employed for text mining. Descriptive statistics were used to organize, summarize, and analyze the data found, and a brief description of the main techniques found is presented. The articles analyzed contained 13 techniques applied to text mining, and 83% of the articles mentioned 1 to 3 machine learning techniques. The main techniques used by the authors of the articles studied were support vector machine (SVM), k-means (k-m), k-nearest neighbors (k-NN), naive Bayes (NB), and self-organizing maps (SOM). The most frequent pairs were SVM/NB, SVM/k-NN, and SVM/decision tree.


Kilat ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 169-178
Author(s):  
Wulan Wulandari

Competition for new student admissions among public and private tertiary institutions is growing rapidly every year, and some institutions spend a lot of money on promotional activities. To assist institutions, this study uses data mining classification algorithms to produce recommendations on the feasibility of promotion locations based on several measurement criteria. Naïve Bayes and Decision Tree C4.5 are compared in measuring the feasibility of promotion locations in the city and district of Bekasi, using four parameters, including the number of students in one sub-district, the distance of the location, and the number of interested applicants last year, across 35 regions/sub-districts in Bekasi city and district. Measured with RapidMiner, the accuracy of the Naïve Bayes algorithm is 91.43% and that of Decision Tree C4.5 is 94.29%.


2021 ◽  
Vol 1 (1) ◽  
pp. 14-20
Author(s):  
Tommy Tommy ◽  
Amir Mahmud Husein

A higher education institution is the unit that delivers higher education as the continuation of secondary education in the formal education track. Learning achievement is one aspect by which the success of a higher education institution in the learning process is assessed. This paper presents an analysis of the relationship between teaching and student achievement, carried out in stages using a data science approach. Based on the data analysis, there are three important indicators in assessing learning achievement: pedagogy, professionalism, and personality. These three features are used as dependent variables to predict learning achievement. The DecisionTree algorithm yields better accuracy than the k-nearest neighbors (KNN), Logistic Regression, Support Vector Machine, and Naive Bayes models, with an accuracy of 68%, followed by KNN at 66% and the others at 55% each.


2019 ◽  
Vol 886 ◽  
pp. 221-226
Author(s):  
Kesinee Boonchuay

Sentiment classification is gaining a lot of attention. For a university, the knowledge obtained from classifying student sentiment about learning in courses is highly valuable and can be used to help teachers improve their teaching skills. In this research, sentiment classification based on text embedding is applied to improve the performance of sentiment classification for Thai teaching evaluation. Text embedding techniques consider both the syntactic and the semantic elements of sentences, which can be used to improve classification performance. This research uses two approaches to apply text embedding to classification. The first approach uses fastText classification. According to the results, fastText provides the best overall performance; its highest F-measure was 0.8212. The second approach constructs text vectors for classification with traditional classifiers. This approach performs better than TF-IDF for k-nearest neighbors and naïve Bayes. For naïve Bayes, the second approach yields the best geometric mean, 0.8961. For the decision tree, TF-IDF is better suited than the second approach. The benefit of this research is that it presents a workflow for using text embedding in Thai teaching evaluation to improve the performance of sentiment classification. By using embedding techniques, similarity and analogy tasks on texts are established along with the classification.
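
The abstract above reports an F-measure and a geometric mean without defining them. The Python sketch below assumes the usual binary-classification definitions: F-measure as the harmonic mean of precision and recall (F1), and geometric mean as the square root of sensitivity times specificity. The confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def geometric_mean(tp, fp, tn, fn):
    """Square root of sensitivity times specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical confusion-matrix counts, not taken from the paper.
tp, fp, tn, fn = 80, 10, 90, 20
print(round(f_measure(tp, fp, fn), 4))
print(round(geometric_mean(tp, fp, tn, fn), 4))
```

The geometric mean is often reported alongside the F-measure for imbalanced sentiment data, because it falls toward zero when either class is classified poorly.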

