Online Active Learning of Reject Option Classifiers

2020 ◽  
Vol 34 (04) ◽  
pp. 5652-5659
Author(s):  
Kulin Shah ◽  
Naresh Manwani

Active learning is an important technique for reducing the number of labeled examples needed in supervised learning. Active learning for binary classification has been well addressed in machine learning. However, active learning of reject option classifiers remains unaddressed. In this paper, we propose novel algorithms for active learning of reject option classifiers. We develop an active learning algorithm using the double ramp loss function and provide mistake bounds for it. We also propose a new loss function for the reject option, called the double sigmoid loss function, together with a corresponding active learning algorithm, for which we offer a convergence guarantee. We provide extensive experimental results to show the effectiveness of the proposed algorithms, which efficiently reduce the number of labeled examples required.
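To make the setting concrete, a reject option classifier is commonly described by a score f(x) and a rejection width ρ: it predicts the sign of f(x) when the score falls outside the band [-ρ, ρ] and abstains otherwise, paying a cost d (typically between 0 and 0.5) for each rejection. The sketch below assumes that standard formulation and the resulting 0-d-1 loss. The names `predict_with_reject`, `reject_option_loss`, `rho` and `d` are illustrative, and the code does not reproduce the paper's double ramp or double sigmoid losses.

```python
REJECT = 0  # sentinel value for "abstain" (an illustrative choice)


def predict_with_reject(score: float, rho: float) -> int:
    """Return +1 or -1 when the score lies outside [-rho, rho], else REJECT."""
    if score > rho:
        return 1
    if score < -rho:
        return -1
    return REJECT


def reject_option_loss(score: float, label: int, rho: float, d: float) -> float:
    """0-d-1 loss: 0 for a correct prediction, d for a rejection, 1 for an error."""
    prediction = predict_with_reject(score, rho)
    if prediction == REJECT:
        return d
    return 0.0 if prediction == label else 1.0
```

With `rho = 0.5` and `d = 0.2`, a score of 0.1 is rejected at cost 0.2 regardless of the true label, while a score of -0.9 on a positive example counts as a full mistake.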

2020 ◽  
Vol 34 (04) ◽  
pp. 3537-3544
Author(s):  
Xu Chen ◽  
Brett Wujek

Automated machine learning (AutoML) strives to establish an appropriate machine learning model for any dataset automatically with minimal human intervention. Although extensive research has been conducted on AutoML, most of it has focused on supervised learning. Research on automated semi-supervised learning and active learning algorithms is still limited. Implementation becomes more challenging when the algorithm is designed for a distributed computing environment. With this as motivation, we propose a novel automated learning system for distributed active learning (AutoDAL) to address these challenges. First, automated graph-based semi-supervised learning is conducted by aggregating the proposed cost functions from different compute nodes in a distributed manner. Subsequently, automated active learning is addressed by jointly optimizing hyperparameters in both the classification and query selection stages, leveraging graph loss minimization and entropy regularization. Moreover, we propose an efficient distributed active learning algorithm that scales to big data by first partitioning the unlabeled data and replicating the labeled data to different worker nodes in the classification stage, and then aggregating the data in the controller in the query selection stage. The proposed AutoDAL algorithm is applied to multiple benchmark datasets and a real-world electrocardiogram (ECG) dataset for classification. We demonstrate that the proposed AutoDAL algorithm achieves significantly better performance than several state-of-the-art AutoML approaches and active learning algorithms.
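The query selection stage can be illustrated with a generic entropy-based criterion. The sketch below is plain maximum-entropy uncertainty sampling, a common active learning heuristic; it is an assumption made for illustration and not AutoDAL's actual objective, which the abstract describes only as combining graph loss minimization with entropy regularization. The function names and the input format (one list of class probabilities per unlabeled sample) are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(prob_table, k):
    """Return the indices of the k unlabeled samples with the highest entropy."""
    ranked = sorted(
        range(len(prob_table)),
        key=lambda i: entropy(prob_table[i]),
        reverse=True,
    )
    return ranked[:k]
```

For predicted distributions `[[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]`, the sample at index 1 is the most uncertain and would be sent for labeling first.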


2022 ◽  
Vol 11 (1) ◽  
pp. 325-337
Author(s):  
Natalia Gil ◽  
Marcelo Albuquerque ◽  
Gabriela de

The article aims to develop a machine learning algorithm that can predict students' graduation in the Industrial Engineering course at the Federal University of Amazonas based on their performance data. The methodology uses a data package of 364 students admitted between 2007 and 2019, considering characteristics that can directly or indirectly affect each student's graduation: type of high school, number of semesters taken, grade-point average, lockouts, dropouts and course terminations. The data treatment included the manual removal of several characteristics that did not add value to the output of the algorithm, resulting in a package composed of 2184 instances. The logistic regression, MLP and XGBoost models were developed and compared; each predicts a binary output of graduation or non-graduation for each student, using 70% of the dataset for training and 30% for testing. It was thus possible to identify a relationship between the six attributes explored and to achieve, with the best model, 94.15% accuracy.
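The 70/30 evaluation protocol can be sketched in a few lines. This is a minimal illustration using only the standard library; the shuffling, the fixed seed and the function names are assumptions, since the abstract does not say how the split was drawn or which tooling was used.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and hold out `test_fraction` of them for testing."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

Applied to 2184 instances, this split yields 1529 training rows and 655 test rows; the reported accuracy would then be computed on the held-out rows only.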


2021 ◽  
Author(s):  
Gábor Csizmadia ◽  
Krisztina Liszkai-Peres ◽  
Bence Ferdinandy ◽  
Ádám Miklósi ◽  
Veronika Konok

Human activity recognition (HAR) using machine learning (ML) methods is a relatively new approach to collecting and analyzing large amounts of human behavioral data with special wearable sensors. Our main goal was to find a reliable method that could automatically detect various playful and daily routine activities in children. We defined 40 activities for ML recognition and collected activity motion data by means of wearable smartwatches running special SensKid software. We analyzed the data of 34 children (19 girls, 15 boys; age range: 6.59 – 8.38; median age = 7.47). All children were typically developing first graders from three elementary schools. Activity recognition was treated as a binary classification task and evaluated with a Light Gradient Boosted Machine (LGBM) learning algorithm, a decision tree-based method, using 3-fold cross-validation. We used the sliding window technique during signal processing and aimed to find the best window size for the analysis of each behavior element. Seventeen of the 40 activities were successfully recognized, with AUC values above 0.8. The window size had no significant effect. The overall accuracy was 0.95, which is in the top range of previously published comparable HAR results. In summary, LGBM is a very promising solution for HAR. In line with previous findings, our results provide a firm basis for a more precise and effective recognition system that can make human behavioral analysis faster and more objective.
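The sliding window technique mentioned above splits a continuous sensor recording into fixed-length, possibly overlapping segments that are then classified one by one. The following is a minimal sketch under assumed parameters; the function name, the `step` argument and the policy of dropping a trailing partial window are illustrative choices, not details taken from the study.

```python
def sliding_windows(signal, window_size, step):
    """Split a sequence of samples into fixed-length windows.

    Consecutive windows start `step` samples apart, so they overlap
    whenever step < window_size. A trailing partial window is dropped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    last_start = len(signal) - window_size
    return [
        signal[start:start + window_size]
        for start in range(0, last_start + 1, step)
    ]
```

For a six-sample signal `[0, 1, 2, 3, 4, 5]` with `window_size=4` and `step=2`, the function returns the two windows `[0, 1, 2, 3]` and `[2, 3, 4, 5]`. Comparing window sizes, as the study did, amounts to repeating the segmentation and evaluation for several values of `window_size`.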


2019 ◽  
Vol 10 (35) ◽  
pp. 8154-8163
Author(s):  
Yao Zhang ◽  
Alpha A. Lee

We report a statistically principled method to quantify the uncertainty of machine learning models for molecular property prediction. We show that this uncertainty estimate can be used to judiciously design experiments.


Author(s):  
Mojtaba Montazery ◽  
Nic Wilson

Support Vector Machines (SVM) are among the most well-known machine learning methods, with broad use in different scientific areas. However, one necessary pre-processing phase for SVM is normalization (scaling) of features, since SVM is not invariant to the scales of the features’ spaces, i.e., different ways of scaling may lead to different results. We define a more robust decision-making approach for binary classification, in which one sample strongly belongs to a class if it belongs to that class for all possible rescalings of features. We derive a way of characterising the approach for binary SVM that allows determining when an instance strongly belongs to a class and when the classification is invariant to rescaling. The characterisation leads to a computation method to determine whether one sample is strongly positive, strongly negative or neither. Our experimental results back up the intuition that being strongly positive suggests stronger confidence that an instance really is positive.


Nowadays, machine learning is considered a key technique in technology fields such as the Internet of Things (IoT), cloud computing, big data and artificial intelligence. As technology advances, large amounts of incorrect and redundant data are collected in these fields. To make use of these data for a meaningful purpose, mining or classification techniques have to be applied in the real world. In this paper, we propose two novel approaches to data classification using a supervised learning algorithm.


Human voice recognition by computers has been a continually developing area since 1952. It is a challenging task for a computer to understand and act on a human voice rather than on commands or programs. The reason is that no two people have the same voice, style or pitch, and not everyone pronounces every word in the same way. Background noise and disturbances may confuse the system, and the voice or accent of the same person may change with mood, situation, time and so on. Despite all these challenges, voice recognition and speech-to-text conversion have reached a successful stage, although voice processing technology deserves still more research. As a small part of this research, we contribute our work in this area and propose a new method, VRSML (Voice Recognition System through Machine Learning). It focuses mainly on speech-to-text conversion, followed by machine learning analysis of the text extracted from speech in the form of tokens. After the derived text has been analyzed, reports are created in textual as well as graphical format to represent the vocabulary levels used in that speech. Because a supervised learning algorithm is employed to classify the tokens derived from the text, the reports will be more accurate and will be generated faster.

