A COMPARATIVE STUDY OF THE DIFFERENT CLASSIFICATION ALGORITHMS ON FOOTBALL ANALYTICS

2021 ◽  
Vol 9 (08) ◽  
pp. 392-407
Author(s):  
Karan Bhowmick ◽  
Vivek Sarvaiya

Sports analytics is on the rise, with many teams looking to use data science and machine learning algorithms to augment their research and boost team performance. This is especially true of football clubs. In this work, we take per-team match statistics from five major football leagues: the English Premier League, La Liga, Serie A, Bundesliga, and Ligue 1. We use this data for two kinds of classification to predict a team's win, loss, or draw. First, we implement multiclass classification using Naive Bayes, Decision Tree, and K-Nearest Neighbours classifiers, and evaluate the models using F1-score, recall, and precision. Next, we use binary classification to predict whether a team wins or does not win (i.e., loses or draws). For this we use Support Vector Machines, Logistic Regression, K-Nearest Neighbours, Decision Tree, and Naive Bayes classifiers, evaluated with the same metrics. Finally, we compare the accuracy and efficacy of these algorithms based on the evaluation metrics. This will help standardize the means of classification in sports and football analytics in the future.
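The evaluation described above can be illustrated with a minimal sketch, assuming hand-made labels rather than the paper's data. It computes per-class precision, recall, and F1 for the three-way task, then collapses the same labels into the binary win / not-win task. The function names `per_class_metrics` and `to_binary` and the toy label lists are illustrative assumptions, not part of the study.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    metrics = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def to_binary(labels):
    """Collapse draw and loss into a single 'not_win' class."""
    return ["win" if x == "win" else "not_win" for x in labels]


# Toy labels, invented for illustration only.
y_true = ["win", "draw", "loss", "win", "loss", "draw", "win", "loss"]
y_pred = ["win", "loss", "loss", "win", "draw", "draw", "loss", "loss"]

multiclass = per_class_metrics(y_true, y_pred)
binary = per_class_metrics(to_binary(y_true), to_binary(y_pred))

for name, result in (("multiclass", multiclass), ("binary", binary)):
    for label, (p, r, f1) in result.items():
        print(f"{name:10s} {label:8s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Collapsing draw and loss removes the confusion between those two classes, which is one reason binary scores tend to look better than multiclass scores on the same predictions.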

Information ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 383
Author(s):  
Francis Effirim Botchey ◽  
Zhen Qin ◽  
Kwesi Hughes-Lartey

The onset of COVID-19 has re-emphasized the importance of FinTech, especially in developing countries, as the major powers of the world are already enjoying the advantages that come with its adoption. Handling physical cash has been established as a means of transmitting the novel coronavirus. Research has also established that being unbanked raises the risk of sinking into abject poverty. Over the years, developing countries have piloted various forms of FinTech, but the one that has come to stay is Mobile Money Transactions (MMT). As mobile money transactions attempt to gain a foothold, they face several problems, the most important of which is mobile money fraud. This paper seeks to provide a solution to this problem by examining machine learning algorithms based on support vector machines (kernel-based), gradient boosted decision trees (tree-based) and Naïve Bayes (probabilistic), taking into consideration the imbalanced nature of the dataset. Our experiments showed that gradient boosted decision trees hold great potential for combating mobile money fraud, as they produced near-perfect results.
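The abstract does not say how the class imbalance was handled. As one common option, and purely as an assumption, the sketch below computes inverse-frequency class weights (the widely used `n_samples / (n_classes * class_count)` heuristic) and shows why plain accuracy is misleading on imbalanced fraud data. The 990/10 split is invented for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical dataset: 1% of transactions are fraudulent.
labels = ["legit"] * 990 + ["fraud"] * 10
weights = balanced_class_weights(labels)

# A useless baseline that never flags fraud still scores 99% accuracy.
baseline_pred = ["legit"] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, baseline_pred)) / len(labels)
fraud_total = sum(t == "fraud" for t in labels)
fraud_caught = sum(
    t == "fraud" and p == "fraud" for t, p in zip(labels, baseline_pred)
)
fraud_recall = fraud_caught / fraud_total

print(weights)
print(f"baseline accuracy={accuracy:.2f}, fraud recall={fraud_recall:.2f}")
```

With these weights a misclassified fraud case costs roughly 100 times as much as a misclassified legitimate one, which pushes a weighted learner away from the always-legitimate baseline.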


Diabetes is one of the most common diseases today. Machine learning techniques are proposed for predicting it: they identify the risk factors of the disease so that its progression can be prevented. Early prediction allows the disease to be controlled and can save lives. For early prediction we collect a data set with 8 attributes from 200 diabetic patients. A patient's sugar level is assessed from features such as the glucose content in the body and the patient's age. The main machine learning algorithms are Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbor (KNN) and Decision Tree (DT). In the existing system, Naive Bayes reaches an accuracy of 66% and the Decision Tree 70 to 71%, which is not an acceptable range. With XGBoost classifiers, the reported accuracy is 74% for Naïve Bayes and 89 to 90% for the Decision Tree. The proposed system reports accuracy in an acceptable range and is therefore the one used. A dataset of 729 patients can be stored in MongoDB; the reports of 129 patients are held out for prediction and the remainder are used for training. The model fitted on the training data is then used for prediction.
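The 729-patient split described above (129 reports for prediction, the rest for training) can be sketched as a simple hold-out split. The abstract does not say whether the split was random, so the seeded shuffle and the helper name `holdout_split` are assumptions, and integer IDs stand in for the patient records.

```python
import random


def holdout_split(records, n_test, seed=0):
    """Shuffle a copy of the records and hold out the first n_test of them."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder IDs standing in for the 729 stored patient records.
patient_ids = list(range(729))
train, test = holdout_split(patient_ids, n_test=129)

print(len(train), len(test))
```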


Machine learning is one of the fastest-growing fields today. Machine learning (ML) and artificial neural networks (ANN) are helpful in the detection and diagnosis of various heart diseases. Naïve Bayes classification is an important classification approach in machine learning. Heart disease comprises a range of disorders affecting the heart, including blood vessel problems, irregular heartbeat, weak heart muscles, congenital heart defects, cardiovascular disease and coronary artery disease. Coronary heart disease is a common type of heart disease. It reduces blood flow to the heart, which can lead to a heart attack. In this paper, a UCI Machine Learning Repository data set of patients suffering from heart disease is analyzed using Naïve Bayes classification and support vector machines, and the classification accuracy of each method is reported. The implementation is done in R.
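To make the Naïve Bayes step concrete, here is a minimal Gaussian Naïve Bayes sketch. The paper's implementation is in R and its feature handling is not described, so the Gaussian likelihood, the Python code, and the two toy features (age and resting blood pressure) are assumptions chosen for illustration, not the authors' method.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Pick the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy rows: [age, resting blood pressure]; 1 = heart disease.
X = [[45, 120], [50, 125], [48, 118], [62, 150], [65, 160], [70, 155]]
y = [0, 0, 0, 1, 1, 1]

model = fit_gaussian_nb(X, y)
print(predict(model, [47, 122]), predict(model, [66, 152]))
```

The "naïve" part is the inner loop: each feature contributes its log-likelihood independently, given the class.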


The scope of this research work is to identify an efficient machine learning algorithm for predicting student behavior from a student performance dataset. We applied Support Vector Machine, K-Nearest Neighbor, Decision Tree and Naïve Bayes algorithms to predict a student's grade and compared their predictions on various performance metrics. Students who visited many resources for reference, took part in academic discussions and interactions in the classroom, were absent for few days, and were cared for by their parents showed great improvement in the final grade. Among the machine learning techniques we used, SVM showed the highest accuracy in terms of four important attributes. The accuracy of SVM after tuning is 0.80. KNN and Decision Tree achieve accuracies of 0.64 and 0.65 respectively, whereas Naïve Bayes achieves 0.77.
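As an illustration of one of the compared methods, the following is a minimal K-Nearest Neighbor sketch using Euclidean distance and majority vote. The distance metric, `k=3`, the function name `knn_predict`, and the two toy features (resources visited, days absent) are assumptions for illustration; the abstract does not give the study's settings.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify query by majority vote among its k nearest training rows."""
    neighbours = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    votes = Counter(label for _, label in neighbours[:k])
    # On a tied vote, the label of the nearest tied neighbour wins.
    return votes.most_common(1)[0][0]


# Invented toy rows: [resources visited, days absent].
train_X = [[90, 2], [85, 3], [80, 1], [20, 10], [25, 12], [30, 9]]
train_y = ["high", "high", "high", "low", "low", "low"]

print(knn_predict(train_X, train_y, [82, 2]))
print(knn_predict(train_X, train_y, [22, 11]))
```

Because the features here are on different scales, a real application would normally standardize them first so that no single attribute dominates the distance.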


2021 ◽  
Vol 226 (16) ◽  
pp. 133-140
Author(s):  
Trần Thị Xuân ◽  
Nguyễn Văn Núi

Data mining is a popular technique used to extract useful information from existing data, thereby supporting decisions that are beneficial for the future. In this paper, the authors focus on the problem of customer classification, which supports identifying groups of potential customers, using the Decision Tree J48, Naïve Bayes Classification and Random Forest methods. The results show that the model based on the decision tree algorithm achieves the highest accuracy and is highly feasible for classifying and predicting customer behavior. This result is expected to be a useful suggestion of an approach for customer analysts seeking to identify potential customer groups in the banking sector.
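J48 is Weka's implementation of C4.5, which chooses splits by gain ratio. A minimal sketch of that criterion for a categorical attribute follows; the attribute names and the four toy customers are invented for illustration and do not come from the paper's banking data.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in items."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on values, divided by split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalizes attributes with many values
    return info_gain / split_info if split_info else 0.0


# Invented toy data: does the customer subscribe to a product?
subscribed = ["no", "no", "yes", "yes"]
has_loan = ["yes", "yes", "no", "no"]  # perfectly separates the classes
region = ["a", "b", "a", "b"]          # carries no information

print(gain_ratio(has_loan, subscribed), gain_ratio(region, subscribed))
```

A tree builder would evaluate this score for every candidate attribute at a node and split on the highest one.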

