scholarly journals Prediction of secondary testosterone deficiency using machine learning: A comparative analysis of ensemble and base classifiers, probability calibration, and sampling strategies in a slightly imbalanced dataset

2021 ◽  
Vol 23 ◽  
pp. 100538
Author(s):  
Monique Tonani Novaes ◽  
Osmar Luiz Ferreira de Carvalho ◽  
Pedro Henrique Guimarães Ferreira ◽  
Taciana Leonel Nunes Tiraboschi ◽  
Caroline Santos Silva ◽  
...  
Author(s):  
Ritu Khandelwal ◽  
Hemlata Goyal ◽  
Rajveer Singh Shekhawat

Introduction: Machine learning is an intelligent technology that works as a bridge between businesses and data science. With the involvement of data science, the business goal focuses on findings to get valuable insights on available data. The large part of Indian Cinema is Bollywood which is a multi-million dollar industry. This paper attempts to predict whether the upcoming Bollywood Movie would be Blockbuster, Superhit, Hit, Average or Flop. For this Machine Learning techniques (classification and prediction) will be applied. To make classifier or prediction model first step is the learning stage in which we need to give the training data set to train the model by applying some technique or algorithm and after that different rules are generated which helps to make a model and predict future trends in different types of organizations. Methods: All the techniques related to classification and Prediction such as Support Vector Machine(SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, Adaboost, and KNN will be applied and try to find out efficient and effective results. All these functionalities can be applied with GUI Based workflows available with various categories such as data, Visualize, Model, and Evaluate. Result: To make classifier or prediction model first step is learning stage in which we need to give the training data set to train the model by applying some technique or algorithm and after that different rules are generated which helps to make a model and predict future trends in different types of organizations Conclusion: This paper focuses on Comparative Analysis that would be performed based on different parameters such as Accuracy, Confusion Matrix to identify the best possible model for predicting the movie Success. By using Advertisement Propaganda, they can plan for the best time to release the movie according to the predicted success rate to gain higher benefits. Discussion: Data Mining is the process of discovering different patterns from large data sets and from that various relationships are also discovered to solve various problems that come in business and helps to predict the forthcoming trends. This Prediction can help Production Houses for Advertisement Propaganda and also they can plan their costs and by assuring these factors they can make the movie more profitable.


2021 ◽  
Author(s):  
Vu-Linh Nguyen ◽  
Mohammad Hossein Shaker ◽  
Eyke Hüllermeier

AbstractVarious strategies for active learning have been proposed in the machine learning literature. In uncertainty sampling, which is among the most popular approaches, the active learner sequentially queries the label of those instances for which its current prediction is maximally uncertain. The predictions as well as the measures used to quantify the degree of uncertainty, such as entropy, are traditionally of a probabilistic nature. Yet, alternative approaches to capturing uncertainty in machine learning, alongside with corresponding uncertainty measures, have been proposed in recent years. In particular, some of these measures seek to distinguish different sources and to separate different types of uncertainty, such as the reducible (epistemic) and the irreducible (aleatoric) part of the total uncertainty in a prediction. The goal of this paper is to elaborate on the usefulness of such measures for uncertainty sampling, and to compare their performance in active learning. To this end, we instantiate uncertainty sampling with different measures, analyze the properties of the sampling strategies thus obtained, and compare them in an experimental study.


Author(s):  
Mangena Venu Madhavan ◽  
Sagar Pande ◽  
Pooja Umekar ◽  
Tushar Mahore ◽  
Dhiraj Kalyankar

Author(s):  
Mehmet Şahin ◽  
Murat Uçar

In this study, a comparative analysis for predicting sports attendance demand is presented based on econometric, artificial intelligence, and machine learning methodologies. Data from more than 20,000 games from three major leagues, namely the National Basketball Association (NBA), National Football League (NFL), and Major League Baseball (MLB), were used for training and testing the approaches. The relevant literature was examined to determine the most useful variables as potential regressors in forecasting. To reveal the most effective approach, three scenarios containing seven cases were constructed. In the first scenario, each league was evaluated separately. In the second scenario, the three possible combinations of league pairings were evaluated, while in the third scenario, all three leagues were evaluated together. The performance evaluations of the results suggest that one of the machine learning methods, Gradient Boosting, outperformed the other methods used. However, the Artificial Neural Network, deep Convolutional Neural Network, and Decision Trees also provided productive and competitive predictions for sports games. Based on the results, the predictions for the NBA and NFL leagues are more satisfactory than the predictions of the MLB, which may be caused by the structure of the MLB. The results of the sensitivity analysis indicate that the performance of the home team is the most influential factor for all three leagues.


Sign in / Sign up

Export Citation Format

Share Document