Predicting Student’s Performance Using Machine Learning Algorithm

Author(s):  
Sheela Rani P ◽  
Dhivya S ◽  
Dharshini Priya M ◽  
Dharmila Chowdary A

Machine learning is a new research discipline that uses data to improve learning, optimizing the teaching process and developing the environment in which learning happens. There are two types of machine learning approaches, supervised and unsupervised, which are used to extract knowledge that helps decision-makers take the correct intervention in the future. This paper introduces a prediction model for students' academic performance and the factors that influence it, using supervised machine learning algorithms: support vector machine (SVM), k-nearest neighbors (KNN), Naïve Bayes, and logistic regression. The results of the various algorithms are compared, and the support vector machine and Naïve Bayes are shown to perform well, achieving higher accuracy than the other algorithms. The final prediction model in this paper may have fairly high prediction accuracy. The objective is not just to predict students' future performance but also to provide the best technique for finding the most impactful features that influence students while studying.

Author(s):  
Harshal Surve ◽  
Aditya Mestry

Sarcasm is the use of words to indirectly mock or annoy someone, or for humorous purposes. It is one of the most difficult modes of communication for machines to identify. People often use sarcasm in daily communication to indirectly annoy others, which makes it important to identify the intended meaning of a sentence. Various machine learning algorithms can be used for sarcasm detection, such as Naïve Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), and Decision Trees (DT). The main goal of this paper is to present these machine learning algorithms for sarcasm detection.


2020 ◽  
Vol 4 (1) ◽  
pp. 96
Author(s):  
Haidar Abdulrahman Abbas ◽  
Kayhan Zrar Ghafoor

In this paper, fingerprint referencing methods based on Wi-Fi (wireless fidelity) received signal strength (RSS) are used for indoor positioning. More precisely, Naïve Bayes, decision tree (DT), and support vector machine (SVM) one-to-one multi-class and error-correcting-output-codes classifiers are used to enable accurate indoor positioning. Normalization is then used to reduce positioning error by reducing the fluctuation and diverse distribution of the RSS values. Different devices are used in this experiment, and the training dataset of the test device is not included in the main dataset. Nonetheless, the model learned by the SVM algorithm is not affected by the elimination of the test device's training data. The efficiency of DT is lower than that of the other machine learning algorithms because it operates on Boolean functions, and it gives lower prediction accuracy on the dataset than the other algorithms. The Naïve Bayes technique, based on Bayes' theorem, is better than DT and close to SVM for positioning. The results confirm that the proposed approach can achieve 1–1.5 m positioning accuracy in indoor environments, an excellent result compared with the traditional protocol.
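The abstract does not say which normalization the authors applied to the RSS values. As a minimal sketch, assuming a per-fingerprint z-score (an assumption, not the paper's stated method), the snippet below shows how normalization can cancel a constant offset between two devices. The function name `zscore_normalize` and the readings are hypothetical.

```python
from statistics import mean, pstdev

def zscore_normalize(rss_dbm):
    """Return a z-score-normalized copy of one RSS fingerprint (values in dBm)."""
    mu = mean(rss_dbm)
    sigma = pstdev(rss_dbm)
    if sigma == 0:
        # Every access point reports the same strength: nothing to scale.
        return [0.0 for _ in rss_dbm]
    return [(v - mu) / sigma for v in rss_dbm]

# Hypothetical readings at the same spot from two devices; the second
# device reports every access point 10 dB weaker than the first.
device_a = [-40.0, -55.0, -70.0]
device_b = [-50.0, -65.0, -80.0]

norm_a = zscore_normalize(device_a)
norm_b = zscore_normalize(device_b)
```

After normalization the two fingerprints coincide, so a classifier trained on one device sees the same feature vector from the other.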


Author(s):  
Branislava Cvijetic ◽  
Zaharije Radivojevic

Institutions that provide official statistics tend to use external data sources, such as administrative data sources, besides regular statistical surveys. In addition to these data sources, Big Data has become recognized as a new data source for providers of official statistics. Classification of textual data is one of the elementary tasks for a provider of official statistics, regardless of the data source. This paper presents the application of traditional machine learning algorithms, Multinomial Naive Bayes and Support Vector Machine, to the classification of advertised jobs according to ISCO-08. The paper presents the methods of collecting data on advertised jobs from four websites and the procedures for creating a multilingual dataset. Several types of text preprocessing are considered: converting uppercase letters to lowercase, stopword removal, punctuation mark removal, lemmatization, correction of commonly misspelled words, and reduction of replicated characters. We hypothesized that applying different combinations of preprocessing methods influences the text classification results. Two experiments were conducted to test the hypothesis. The results of both experiments showed that the Support Vector Machine algorithm gives better results on the created dataset than Multinomial Naive Bayes. The experiments showed that the proposed algorithms performed well, with an overall accuracy of up to 90%, but with different accuracy for individual classes due to the imbalanced dataset.
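The preprocessing steps listed above can be sketched as switchable stages, which is one way to try different combinations. This is an illustrative sketch, not the authors' pipeline: the function name `preprocess`, the tiny stopword list, and the "three or more repeats" threshold are all assumptions, and lemmatization and spelling correction are omitted because they need language-specific resources.

```python
import re
import string

# Tiny illustrative stopword list (assumption); a real pipeline would use
# a full list per language of the multilingual dataset.
STOPWORDS = {"a", "an", "the", "for", "and", "of", "in", "to", "is", "we", "are"}

# Translation table that maps every ASCII punctuation mark to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def preprocess(text, lowercase=True, strip_punct=True,
               squeeze_repeats=True, drop_stopwords=True):
    """Apply the enabled preprocessing steps and return a list of tokens."""
    if lowercase:
        text = text.lower()
    if strip_punct:
        text = text.translate(PUNCT_TO_SPACE)
    if squeeze_repeats:
        # Collapse any run of three or more identical characters to one.
        text = re.sub(r"(.)\1{2,}", r"\1", text)
    tokens = text.split()
    if drop_stopwords:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS]
    return tokens

tokens = preprocess("We are HIRING a Senior Java Developer!!!")
```

Each keyword argument turns one step on or off, so a given combination of steps is just a particular set of flags passed to the same function.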


Author(s):  
Muskan Patidar

Abstract: Social networking platforms have given us more opportunities than ever before, and their benefits are undeniable. Despite these benefits, people may be humiliated, insulted, bullied, and harassed by anonymous users, strangers, or peers. Cyberbullying refers to the use of technology to humiliate and slander other people. It takes the form of hate messages sent through social media and email. With the exponential increase in social media users, cyberbullying has emerged as a form of bullying through electronic messages. As a possible solution to this problem, our project aims to detect cyberbullying in tweets using ML classification algorithms such as Naïve Bayes, KNN, Decision Tree, Random Forest, and Support Vector Machine, among others. We also apply NLTK (Natural Language Toolkit) n-gram features (unigram, bigram, trigram, and higher-order n-grams) with Naïve Bayes to check its accuracy. Finally, we compare the results of the proposed and baseline features with other machine learning algorithms. The findings of the comparison indicate the significance of the proposed features in cyberbullying detection. Keywords: Cyber bullying, Machine Learning Algorithms, Twitter, Natural Language Toolkit
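To make the unigram, bigram, and trigram features concrete, here is a minimal pure-Python sketch of n-gram extraction. The helper names `ngrams` and `ngram_features` and the whitespace tokenization are assumptions for illustration; the authors use NLTK, which provides a comparable helper in `nltk.util.ngrams`.

```python
def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_features(text, max_n=3):
    """Return unigram up to max_n-gram features as space-joined strings."""
    tokens = text.lower().split()
    features = []
    for n in range(1, max_n + 1):
        features.extend(" ".join(gram) for gram in ngrams(tokens, n))
    return features

bigrams = ngrams(["you", "are", "so", "dumb"], 2)
features = ngram_features("a b c")
```

Counts of such features per tweet could then serve as the input vector for a Naïve Bayes classifier.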


Author(s):  
Ángel Freddy Godoy Viera

Machine learning techniques continue to be widely used for text mining. For this article, a literature review was conducted of scientific journals published in 2010 and 2011, with the aim of identifying the main forms of machine learning used for text mining. Descriptive statistics were used to organize, summarize, and analyze the data found, and a brief description of the main techniques found is presented. In the articles analyzed, 13 techniques applied to text mining were found, and 83% of the articles mentioned 1 to 3 machine learning techniques. The main techniques used by the authors of the articles studied were support vector machine (SVM), k-means (k-m), k-nearest neighbors (k-NN), naive Bayes (NB), and self-organizing maps (SOM). The most frequently occurring pairs are SVM/NB, SVM/k-NN, and SVM/decision tree.


2020 ◽  
Vol 19 ◽  
pp. 153303382090982
Author(s):  
Melek Akcay ◽  
Durmus Etiz ◽  
Ozer Celik ◽  
Alaattin Ozen

Background and Aim: Although the prognosis of nasopharyngeal cancer largely depends on a classification based on the tumor-lymph node metastasis staging system, patients at the same stage may have different clinical outcomes. This study aimed to evaluate the survival prognosis of nasopharyngeal cancer using machine learning. Settings and Design: Original, retrospective. Materials and Methods: A total of 72 patients with a diagnosis of nasopharyngeal cancer who received radiotherapy ± chemotherapy were included in the study. The contribution of patient, tumor, and treatment characteristics to the survival prognosis was evaluated by machine learning using the following techniques: logistic regression, artificial neural network, XGBoost, support-vector clustering, random forest, and Gaussian Naive Bayes. Results: In the analysis of the data set, correlation analysis and binary logistic regression analysis were applied. Of the 18 independent variables, 10 were found to be effective in predicting nasopharyngeal cancer-related mortality: age, weight loss, initial neutrophil/lymphocyte ratio, initial lactate dehydrogenase, initial hemoglobin, radiotherapy duration, tumor diameter, number of concurrent chemotherapy cycles, and T and N stages. Among the machine learning techniques, Gaussian Naive Bayes was determined to be the best algorithm for evaluating prognosis (accuracy rate: 88%, area under the curve score: 0.91, confidence interval: 0.68-1, sensitivity: 75%, specificity: 100%). Conclusion: Many factors affect prognosis in cancer, and machine learning algorithms can be used to determine which factors have a greater effect on survival prognosis, which then allows further research into these factors. In the current study, Gaussian Naive Bayes was identified as the best algorithm for the evaluation of prognosis of nasopharyngeal cancer.
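For readers less familiar with the reported metrics, the sketch below computes accuracy, sensitivity, and specificity from confusion-matrix counts. The counts are hypothetical, chosen only because they yield the same rounded percentages as the abstract; the abstract does not report the study's actual confusion matrix or test-set size. The function name `binary_metrics` is likewise an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # share of positive cases detected
        "specificity": tn / (tn + fp),    # share of negative cases correctly cleared
    }

# Hypothetical counts, NOT taken from the study.
metrics = binary_metrics(tp=3, fn=1, tn=4, fp=0)
```

With these counts, sensitivity is 3/4 = 75%, specificity is 4/4 = 100%, and accuracy is 7/8 = 87.5%, which rounds to 88%.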


JURTEKSI ◽  
2021 ◽  
Vol 8 (1) ◽  
pp. 11-18
Author(s):  
Chika Enggar Puspita ◽  
Oktariani Nurul Pratiwi ◽  
Edi Sutoyo

Abstract: Question classification is a computer science task that analyzes questions and labels each one according to existing categories. Questions can be collected from many different materials or topics. The researchers therefore set out to build a classification system for Data Warehouse and Business Intelligence quiz questions, grouping them into the topics Data Warehouse, Business Intelligence, Data Analytics, and Performance Measurement. One way to solve this problem is a machine learning approach. This study compares two machine learning algorithms, Naïve Bayes and Support Vector Machine, using SMOTE and cross-validation. Under cross-validation, Naïve Bayes achieved an accuracy of 82.02% before SMOTE and 94.79% after SMOTE, while Support Vector Machine achieved 81.39% before SMOTE and 96.52% after SMOTE, so SMOTE improved accuracy for both algorithms. Keywords: Cross-Validation; Machine Learning; Naive Bayes; Support Vector Machine; Question Classification
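SMOTE balances classes by synthesizing new minority-class points on the line segment between an existing minority point and one of its nearest minority neighbours. The following is a minimal sketch of that interpolation step only, on hypothetical two-dimensional points; the function name `smote_sample` is an assumption, and the study's own features and SMOTE settings are not given in the abstract.

```python
import random

def smote_sample(minority, k, rng):
    """Create one synthetic point between a minority point and a near neighbour."""
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    # Interpolate at a random position along the segment base -> neighbour.
    gap = rng.random()
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]

# Hypothetical minority-class feature vectors.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(5)]
```

When oversampling is combined with cross-validation, it is usually applied to the training folds only, so that synthetic points do not leak into the evaluation fold. The abstract does not say how the authors ordered the two steps.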


Diabetes is one of the most common diseases today. Predictions for this disease are made using machine learning techniques. Through this method, the risk factors of the disease are identified and their progression can be prevented. Early prediction allows the disease to be controlled and can save lives. For early prediction, we collected a data set with 8 attributes from 200 diabetic patients. The patient's sugar level is assessed from features including glucose content and age. The main machine learning algorithms are support vector machine (SVM), naive Bayes (NB), k-nearest neighbor (KNN), and decision tree (DT). In the existing system, Naive Bayes reaches an accuracy of 66% and the decision tree 70 to 71%, which is not in an acceptable range. With XGBoost classifiers, however, the accuracy is 74% for Naïve Bayes and 89 to 90% for the decision tree. In the proposed system, the accuracy ranges are reported properly, and this system is the one mainly used. A dataset of 729 patients is stored in MongoDB; 129 patients' reports are held out for prediction, and the remaining reports are used for training. The models trained on the training data are then used for prediction.
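The split described above leaves 729 - 129 = 600 reports for training. A minimal sketch of such a holdout split follows, assuming a seeded random shuffle; the abstract does not say how the 129 reports were selected, and the integer IDs stand in for the patient records. The function name `holdout_split` is hypothetical.

```python
import random

def holdout_split(records, n_test, seed=0):
    """Shuffle a copy of records and return (train, test) with n_test held out."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder IDs standing in for the 729 patient reports.
patient_ids = list(range(729))
train_ids, test_ids = holdout_split(patient_ids, n_test=129)
```

The two lists are disjoint and together cover all 729 records, so no report used for prediction is also seen during training.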


Author(s):  
Ahmed T. Shawky ◽  
Ismail M. Hagag

In today's world, data mining and classification are considered among the most important techniques, as the world is full of data generated by various sources. Extracting useful knowledge from this data is the real challenge, and this paper addresses it by using machine learning classifiers to draw meaningful results from data. The aim of this research paper is to design a model to detect diabetes in patients with high accuracy. To this end, the paper uses five machine learning classification algorithms, Decision Tree, Support Vector Machine (SVM), Random Forest, Naive Bayes, and K-Nearest Neighbor (K-NN), with the purpose of predicting diabetes at an early stage. Finally, we compared the performance of these algorithms and concluded that K-NN achieves the best accuracy (81.16%), followed by Naive Bayes (76.06%).

