A Cost-Sensitive Loss Function for Machine Learning

Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18147534 ◽

2021 ◽

Vol 18 (14) ◽

pp. 7534

Author(s):

Ke Wang ◽

Qingwen Xue ◽

Jian John Lu

Keyword(s):

Machine Learning ◽

High Risk ◽

Loss Function ◽

Class Imbalance ◽

Support Vector ◽

Trajectory Data ◽

Recognition Model ◽

Learning Framework ◽

Sampling Cost ◽

Automated Machine Learning

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Due to the class-imbalance nature of driving data, high-risk samples as the minority class are usually ill-treated by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling, cost-sensitive loss function, and probability calibration to handle class-imbalance problem in recognition of risky drivers. The hyperparameters that control sampling ratio and class weight, along with other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed automated learning framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of target vehicle’s longitudinal speed, lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the result of automated machine learning is consistent with manual searching but much more computation-efficient. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of predicted probability.

Download Full-text

Using machine-learning methods to analyze economic loss function of quality management processes

Journal of Physics Conference Series ◽

10.1088/1742-6596/1015/3/032031 ◽

2018 ◽

Vol 1015 ◽

pp. 032031

Author(s):

V A Dzedik ◽

P A Lontsikh

Keyword(s):

Machine Learning ◽

Quality Management ◽

Loss Function ◽

Economic Loss ◽

Learning Methods ◽

Machine Learning Methods ◽

Management Processes

Download Full-text

Building an NCAA men’s basketball predictive model and quantifying its success

Journal of Quantitative Analysis in Sports ◽

10.1515/jqas-2014-0058 ◽

2015 ◽

Vol 11 (1) ◽

Cited By ~ 4

Author(s):

Michael J. Lopez ◽

Gregory J. Matthews

Keyword(s):

Machine Learning ◽

Prediction Model ◽

Predictive Model ◽

Loss Function ◽

Las Vegas ◽

Statistical Tools ◽

Men's Basketball ◽

Predictive Algorithms ◽

Complex Models ◽

Logistic Regressions

AbstractComputing and machine learning advancements have led to the creation of many cutting-edge predictive algorithms, some of which have been demonstrated to provide more accurate forecasts than traditional statistical tools. In this manuscript, we provide evidence that the combination of modest statistical methods with informative data can meet or exceed the accuracy of more complex models when it comes to predicting the NCAA men’s basketball tournament. First, we describe a prediction model that merges the point spreads set by Las Vegas sportsbooks with possession based team efficiency metrics by using logistic regressions. The set of probabilities generated from this model most accurately predicted the 2014 tournament, relative to approximately 400 competing submissions, as judged by the log loss function. Next, we attempt to quantify the degree to which luck played a role in the success of this model by simulating tournament outcomes under different sets of true underlying game probabilities. We estimate that under the most optimistic of game probability scenarios, our entry had roughly a 12% chance of outscoring all competing submissions and just less than a 50% chance of finishing with one of the ten best scores.

Download Full-text

Quantifying the Impacts of Pre- and Post-Conception TSH Levels on Birth Outcomes: An Examination of Different Machine Learning Models

Frontiers in Endocrinology ◽

10.3389/fendo.2021.755364 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yuantong Sun ◽

Weiwei Zheng ◽

Ling Zhang ◽

Huijuan Zhao ◽

Xun Li ◽

...

Keyword(s):

Machine Learning ◽

Birth Outcomes ◽

Loss Function ◽

Apgar Score ◽

Thyroid Stimulating Hormone ◽

Adverse Birth Outcomes ◽

Learning Models ◽

Neural Network Models ◽

Tsh Levels ◽

Machine Learning Models

BackgroundWhile previous studies identified risk factors for diverse pregnancy outcomes, traditional statistical methods had limited ability to quantify their impacts on birth outcomes precisely. We aimed to use a novel approach that applied different machine learning models to not only predict birth outcomes but systematically quantify the impacts of pre- and post-conception serum thyroid-stimulating hormone (TSH) levels and other predictive characteristics on birth outcomes.MethodsWe used data from women who gave birth in Shanghai First Maternal and Infant Hospital from 2014 to 2015. We included 14,110 women with the measurement of preconception TSH in the first analysis and 3,428 out of 14,110 women with both pre- and post-conception TSH measurement in the second analysis. Synthetic Minority Over-sampling Technique (SMOTE) was applied to adjust the imbalance of outcomes. We randomly split (7:3) the data into a training set and a test set in both analyses. We compared Area Under Curve (AUC) for dichotomous outcomes and macro F1 score for categorical outcomes among four machine learning models, including logistic model, random forest model, XGBoost model, and multilayer neural network models to assess model performance. The model with the highest AUC or macro F1 score was used to quantify the importance of predictive features for adverse birth outcomes with the loss function algorithm.ResultsThe XGBoost model provided prominent advantages in terms of improved performance and prediction of polytomous variables. Predictive models with abnormal preconception TSH or not-well-controlled TSH, a novel indicator with pre- and post-conception TSH levels combined, provided the similar robust prediction for birth outcomes. The highest AUC of 98.7% happened in XGBoost model for predicting low Apgar score with not-well-controlled TSH adjusted. By loss function algorithm, we found that not-well-controlled TSH ranked 4th, 6th, and 7th among 14 features, respectively, in predicting birthweight, induction, and preterm birth, and 3rd among 19 features in predicting low Apgar score.ConclusionsOur four machine learning models offered valid predictions of birth outcomes in women during pre- and post-conception. The predictive features panel suggested the combined TSH indicator (not-well-controlled TSH) could be a potentially competitive biomarker to predict adverse birth outcomes.

Download Full-text

Machine Learning Approaches to Predict Peak Demand Days of Cardiovascular Admissions Considering Environmental Exposure

10.21203/rs.2.19636/v3 ◽

2020 ◽

Author(s):

Hang Qiu ◽

Lin Luo ◽

Ziqi Su ◽

Li Zhou ◽

Liya Wang ◽

...

Keyword(s):

Machine Learning ◽

Environmental Exposure ◽

Loss Function ◽

Ambient Air ◽

Quality Data ◽

Gradient Boosting ◽

Learning Approaches ◽

Learning Models ◽

Peak Demand ◽

Logarithmic Loss

Abstract Background: Accumulating evidence has linked environmental exposure, such as ambient air pollution and meteorological factors, to the development and severity of cardiovascular diseases (CVDs), resulting in increased healthcare demand. Effective prediction of demand for healthcare services, particularly those associated with peak events of CVDs, can be useful in optimizing the allocation of medical resources. However, few studies have attempted to adopt machine learning approaches with excellent predictive abilities to forecast the healthcare demand for CVDs. This study aims to develop and compare several machine learning models in predicting the peak demand days of CVDs admissions using the hospital admissions data, air quality data and meteorological data in Chengdu, China from 2015 to 2017.Methods: Six machine learning algorithms, including logistic regression (LR), support vector machine (SVM), artificial neural network (ANN), random forest (RF), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) were applied to build the predictive models with a unique feature set. The area under a receiver operating characteristic curve (AUC), logarithmic loss function, accuracy, sensitivity, specificity, precision, and F1 score were used to evaluate the predictive performances of the six models.Results: The LightGBM model exhibited the highest AUC (0.940, 95% CI: 0.900-0.980), which was significantly higher than that of LR (0.842, 95% CI: 0.783-0.901), SVM (0.834, 95% CI: 0.774-0.894) and ANN (0.890, 95% CI: 0.836-0.944), but did not differ significantly from that of RF (0.926, 95% CI: 0.879-0.974) and XGBoost (0.930, 95% CI: 0.878-0.982). In addition, the LightGBM has the optimal logarithmic loss function (0.218), accuracy (91.3%), specificity (94.1%), precision (0.695), and F1 score (0.725). Feature importance identification indicated that the contribution rate of meteorological conditions and air pollutants for the prediction was 32% and 43%, respectively.Conclusion: This study suggests that ensemble learning models, especially the LightGBM model, can be used to effectively predict the peak events of CVDs admissions, and therefore could be a very useful decision-making tool for medical resource management.

Download Full-text

UNIVERSAL ALGORITHMS FOR PROBABILITY FORECASTING

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213012400155 ◽

2012 ◽

Vol 21 (04) ◽

pp. 1240015

Author(s):

FEDOR ZHDANOV ◽

YURI KALNISHKAN

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Linear Model ◽

Loss Function ◽

Efficient Algorithms ◽

Classification Problems ◽

Computationally Efficient ◽

Multi Class Classification ◽

Computationally Efficient Algorithms

Multi-class classification is one of the most important tasks in machine learning. In this paper we consider two online multi-class classification problems: classification by a linear model and by a kernelized model. The quality of predictions is measured by the Brier loss function. We obtain two computationally efficient algorithms for these problems by applying the Aggregating Algorithms to certain pools of experts and prove theoretical guarantees on the losses of these algorithms. We kernelize one of the algorithms and prove theoretical guarantees on its loss. We perform experiments and compare our algorithms with logistic regression.

Download Full-text

Модель та метод навчання для класифікаційного аналізу рівня води в стічних трубах за даними відео інспекції

RADIOELECTRONIC AND COMPUTER SYSTEMS ◽

10.32620/reks.2021.2.01 ◽

2021 ◽

pp. 4-15

Author(s):

В’ячеслав Васильович Москаленко ◽

Микола Олександрович Зарецький ◽

Артем Геннадійович Коробов ◽

Ярослав Юрійович Ковальський ◽

Артур Фанісович Шаєхов ◽

...

Keyword(s):

Machine Learning ◽

Water Level ◽

Loss Function ◽

Binary Code ◽

Binary Representation ◽

Reference Vector ◽

Temporal Network ◽

Convolutional Network ◽

Classification Analysis ◽

And Training

Models and training methods for water-level classification analysis on the footage of sewage pipe inspections have been developed and investigated. The object of the research is the process of water-level recognition, considering the spatial and temporal context during the inspection of sewage pipes. The subject of the research is a model and machine learning method for water-level classification analysis on video sequences of pipe inspections under conditions of limited size and an unbalanced set of training data. A four-stage algorithm for training the classifier is proposed. At the first stage of training, training occurs with a softmax triplet loss function and a regularizing component to penalize the rounding error of the network output to a binary code. The next step is to define a binary code (reference vector) for each class according to the principles of error-correcting output codes, but considering the intraclass and interclass relations. The computed reference vector of each class is used as the target label of the sample for further training using the joint cross-entropy loss function. The last stage of machine learning involves optimizing the parameters of the decision rules based on the information criterion to account for the boundaries of deviation of the binary representation of the observations of each class from the corresponding reference vectors. As a classifier model, a combination of 2D convolutional feature extractor for each frame and temporal network to analyze inter-frame dependencies is considered. The different variants of the temporal network are compared. We consider a 1D regular convolutional network with dilated convolutions, 1D causal convolutional network with dilated convolutions, recurrent LSTM-network, recurrent GRU-network. The performance of the models is compared by the micro-averaged metric F1 computed on the test subset. The results obtained on the dataset from Ace Pipe Cleaning (Kansas City, USA) confirm the suitability of the model and training method for practical use, the obtained value of F1-metric is 0.88. The results of training by the proposed method were compared with the results obtained using the traditional method. It was shown that the proposed method provides a 9 % increase in the value of micro-averaged F1-measure.

Download Full-text

Online Active Learning of Reject Option Classifiers

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6019 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5652-5659

Author(s):

Kulin Shah ◽

Naresh Manwani

Keyword(s):

Machine Learning ◽

Active Learning ◽

Supervised Learning ◽

Loss Function ◽

Learning Algorithm ◽

Binary Classification ◽

Experimental Results ◽

Reject Option ◽

Novel Algorithms ◽

Ramp Loss Function

Active learning is an important technique to reduce the number of labeled examples in supervised learning. Active learning for binary classification has been well addressed in machine learning. However, active learning of the reject option classifier remains unaddressed. In this paper, we propose novel algorithms for active learning of reject option classifiers. We develop an active learning algorithm using double ramp loss function. We provide mistake bounds for this algorithm. We also propose a new loss function called double sigmoid loss function for reject option and corresponding active learning algorithm. We offer a convergence guarantee for this algorithm. We provide extensive experimental results to show the effectiveness of the proposed algorithms. The proposed algorithms efficiently reduce the number of label examples required.

Download Full-text

МОДЕЛЬ І МЕТОД НАВЧАННЯ КЛАСИФІКАТОРА КОНТЕКСТІВ СПОСТЕРЕЖЕННЯ НА ЗОБРАЖЕННЯХ ВІДЕОІНСПЕКЦІЇ СТІЧНИХ ТРУБ

RADIOELECTRONIC AND COMPUTER SYSTEMS ◽

10.32620/reks.2020.3.06 ◽

2020 ◽

pp. 59-66

Author(s):

В’ячеслав Васильович Москаленко ◽

Микола Олександрович Зарецький ◽

Ярослав Юрійович Ковальський ◽

Сергій Сергійович Мартиненко

Keyword(s):

Machine Learning ◽

Loss Function ◽

Kansas City ◽

Data Augmentation ◽

Decision Rules ◽

Main Idea ◽

Test Sample ◽

Sewer Pipe ◽

Output Layer ◽

And Training

Video inspection is often used to diagnose sewer pipe defects. To correctly encode founded defects according to existing standards, it is necessary to consider a lot of contextual information about the orientation and location of the camera from sewer pipe video inspection. A model for the classification of context on frames during observations in the video inspection of sewer pipes and a five-stage method of machine learning is proposed. The main idea of the proposed approach is to combine the methods of deep machine learning with the principles of information maximization and coding with self-correcting Hamming codes. The proposed model consists of a deep convolutional neural network with a sigmoid layer followed by the rounding output layer and information-extreme decision rules. The first stages of the method are data augmentation and training of the feature extractor in the Siamese model with softmax triplet loss function. The next steps involve calculating a binary code for each class of recognition that is used as a label in learning with a binary cross-entropy loss function to increase the compactness of the distribution of each class's observations in the Hamming binary space. At the last stage of the training method, it is supposed to optimize the parameters of radial-basis decision rules in the Hamming space for each class according to the existing information-extreme criterion. The information criterion, expressed as a logarithmic function of the accuracy characteristics of the decision rules, provides the maximum generalization and reliability of the model under the most difficult conditions in the statistical sense. The effectiveness of this approach was tested on data provided by Ace Pipe Cleaning (Kansas City, USA) and MPWiK (Wroclaw, Poland) by comparing learning results according to the proposed and traditional models and training schemes. The obtained model of the image frame classifier provides acceptable for practical use classification accuracy on the test sample, which is 96.8 % and exceeds the result of the traditional scheme of training with the softmax output layer by 6.8 %.

Download Full-text

Optimizing Fractional Compositions To Achieve Extraordinary Properties

10.26434/chemrxiv.14680578 ◽

2021 ◽

Author(s):

Andrew Falkowski ◽

Steven Kauwe ◽

Taylor Sparks

Keyword(s):

Machine Learning ◽

Loss Function ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Inverse Design ◽

Data Driven ◽

Model Parameters ◽

Chemical Systems ◽

Target Property ◽

Successful Candidate

Traditional, data-driven materials discovery involves screening chemical systems with machine learning algorithms and selecting candidates that excel in a target property. The number of screening candidates grows infinitely large as the fractional resolution of compositions the number of included elements increases. The computational infeasibility and probability of overlooking a successful candidate grow likewise. Our approach shifts the optimization focus from model parameters to the fractions of each element in a composition. Using a pretrained network, CrabNet, and writing a custom loss function to govern a vector of element fractions, compositions can be optimized such that a predicted property is maximized or minimized. Single and multi-property optimization examples are presented that highlight the capabilities and robustness of this approach to inverse design.

Download Full-text