Data-Driven Approach for Predicting and Explaining the Risk of Long-Term Unemployment

2020 ◽  
Vol 214 ◽  
pp. 01023
Author(s):  
Linan (Frank) Zhao

Long-term unemployment has a significant societal impact and is of particular concern to policymakers with regard to economic growth and public finances. This paper constructs advanced ensemble machine learning models to predict citizens’ risk of becoming long-term unemployed, using data collected from European public employment service authorities. The proposed model achieves 81.2% accuracy in identifying citizens at high risk of long-term unemployment. This paper also examines how to dissect black-box machine learning models by offering explanations at both the local and global levels using SHAP, a state-of-the-art model-agnostic approach, to explain the factors that contribute to long-term unemployment. Lastly, this paper addresses an under-explored question in applying machine learning in the public domain: the inherent bias in model predictions. The results show that popular models such as gradient boosted trees may produce unfair predictions against senior age groups and immigrants. Overall, this paper sheds light on the growing shift among governments toward adopting machine learning models to profile citizens and prioritize employment resources, with the aim of reducing the detrimental effects of long-term unemployment and improving public welfare.
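
As an illustration of the local explanations mentioned above, the following is a minimal sketch of exact Shapley values, the quantity that SHAP approximates efficiently. It does not use the SHAP library and is not the paper's model: the `toy_risk` function, its coefficients, the feature names, and the all-zero baseline are hypothetical choices made only for this example.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering in which features are revealed."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start from the reference input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # reveal one feature at a time
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(x):
    # Hypothetical risk score with one interaction term (illustration only).
    score = 0.1 + 0.02 * x["months_unemployed"] + 0.15 * x["no_degree"]
    score += 0.05 * x["months_unemployed"] * x["no_degree"] / 12
    return score


# Hypothetical reference input and individual; region_code is unused by the model.
baseline = {"months_unemployed": 0, "no_degree": 0, "region_code": 0}
person = {"months_unemployed": 12, "no_degree": 1, "region_code": 3}

phi = shapley_values(toy_risk, person, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for `person` and the prediction for `baseline`, and a feature the model ignores receives a contribution of zero. Enumerating all orderings is only feasible for a handful of features, which is why practical tools rely on approximations.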

2021 ◽  
Author(s):  
Yongmin Cho ◽  
Rachael A Jonas-Closs ◽  
Lev Y Yampolsky ◽  
Marc W Kirschner ◽  
Leonid Peshkin

We present a novel platform for testing the effect of interventions on the life- and health-span of a short-lived, semi-transparent freshwater organism that is sensitive to drugs and has complex behavior and physiology: the planktonic crustacean Daphnia magna. Within this platform, dozens of complex behavioral features of both routine motion and response to stimuli are continuously and accurately quantified for large homogeneous cohorts via an automated phenotyping pipeline. We build predictive machine learning models calibrated using chronological age and extrapolate to phenotypic age. We further apply the model to estimate phenotypic age under pharmacological perturbation. Our platform provides a scalable framework for drug screening and characterization in both life-long and instant assays, as illustrated by a long-term dose-response profile of metformin and short-term assays of well-studied substances such as caffeine and alcohol.


2020 ◽  
Vol 7 (2) ◽  
pp. 55
Author(s):  
Yasir Suhail ◽  
Madhur Upadhyay ◽  
Aditya Chhibber ◽  
Kshitiz

Extraction of teeth is an important treatment decision in orthodontic practice. An expert system that is able to arrive at suitable treatment decisions can be valuable to clinicians for verifying treatment plans, minimizing human error, training orthodontists, and improving reliability. In this work, we train a number of machine learning models for this prediction task using data for 287 patients, evaluated independently by five different orthodontists. We demonstrate why ensemble methods are particularly suited for this task. We evaluate the performance of the machine learning models and interpret the training behavior. We show that the results for our model are close to the level of agreement between different orthodontists.


10.29007/mbb7 ◽  
2020 ◽  
Author(s):  
Maher Selim ◽  
Ryan Zhou ◽  
Wenying Feng ◽  
Omar Alam

Many statistical and machine learning models for prediction take historical data as input and produce a single output value or a small number of output values. To forecast over many timesteps, the model must be run recursively, with each prediction fed back in as input. This leads to a compounding of errors, which has adverse effects on accuracy for long forecast periods. In this paper, we show that this can be mitigated by adding generating features that have an “anchoring” effect on recurrent forecasts, limiting the amount of compounded error in the long term. This is studied experimentally on a benchmark energy dataset using two machine learning models, LSTM and XGBoost. Prediction accuracy over differing forecast lengths is compared using the forecasting MAPE (mean absolute percentage error). For the LSTM model, short-term energy forecasting is more accurate when a past energy consumption value is used as a feature than when it is not; the opposite holds for long-term forecasting. For the XGBoost model, both short- and long-term energy forecasting are more accurate when past values are not used as a feature.
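
The compounding of errors in recursive forecasting, and the MAPE metric used to measure it, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's LSTM or XGBoost setup: the `biased_persistence` model, the flat ground-truth series, and the 3-step versus 12-step horizons are all hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.
    Assumes no actual value is zero."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def recursive_forecast(one_step_model, history, steps):
    """Forecast `steps` values by feeding each prediction back in
    as the newest element of the input window."""
    window = list(history)
    forecasts = []
    for _ in range(steps):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        window = window[1:] + [next_value]  # slide the window forward
    return forecasts


def biased_persistence(window):
    # Hypothetical one-step model: repeats the last value with a 1% overestimate.
    return window[-1] * 1.01


history = [100.0, 100.0, 100.0]
actual_future = [100.0] * 12  # toy ground truth: flat consumption
forecast = recursive_forecast(biased_persistence, history, steps=12)

short_term_mape = mape(actual_future[:3], forecast[:3])
long_term_mape = mape(actual_future, forecast)
print(f"MAPE over 3 steps:  {short_term_mape:.2f}%")
print(f"MAPE over 12 steps: {long_term_mape:.2f}%")
```

Because each step inherits the error of the previous one, the 12-step MAPE is larger than the 3-step MAPE even though the one-step error is always 1%.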


2022 ◽  
Vol 8 ◽  
pp. 612-618
Author(s):  
Pavel Matrenin ◽  
Murodbek Safaraliev ◽  
Stepan Dmitriev ◽  
Sergey Kokin ◽  
Anvari Ghulomzoda ◽  
...  

2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. 784-785
Author(s):  
Mamoun Mardini ◽  
Chen Bai ◽  
Amal Wanigatunga ◽  
Santiago Saldana ◽  
Ramon Casanova ◽  
...  

Regular and sufficient amounts of physical activity (PA) are important for increasing health benefits and mitigating health risks. Given the growing popularity of wrist-worn devices across all age groups, a rigorous evaluation of how well they recognize hallmark measures of physical activity and estimate energy expenditure is needed to compare their accuracy across the lifespan. The goal of the study was to build machine learning models to recognize the hallmark measures of PA and estimate energy expenditure (EE), and to test the hypothesis that model performance varies across age groups: young [20-50 years], middle (50-70 years], and old (70-89 years]. Participants (n = 253, 62% women, aged 20-89 years) performed a battery of 33 daily activities in a standardized laboratory setting while wearing a portable metabolic unit to measure EE, which was used to gauge metabolic intensity. Participants also wore a tri-axial accelerometer on the right wrist. The random forest algorithm was quite accurate at recognizing PA type; the F1-score range across age groups was: sedentary [0.955–0.973], locomotion [0.942–0.964], and lifestyle [0.913–0.949]. Recognizing PA intensity resulted in lower performance; the F1-score range across age groups was: sedentary [0.919–0.947], light [0.813–0.828], and moderate [0.846–0.875]. The root mean square error range was [0.835–1.009] for the estimation of EE. The F1-score range for recognizing individual PAs was [0.263–0.784]. In conclusion, machine learning models used to represent accelerometry data are robust to age differences, and a generalizable approach might be sufficient for accelerometer-based wearables.
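
For readers unfamiliar with the per-class F1-scores reported above, the following minimal sketch shows how such a score is computed from true and predicted labels. The six labels below are made-up toy data, not results from the study.

```python
def f1_per_class(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN), one-vs-rest for each label."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denominator = 2 * tp + fp + fn
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


# Toy labels for illustration only (not data from the study).
y_true = ["sedentary", "sedentary", "locomotion", "locomotion", "lifestyle", "lifestyle"]
y_pred = ["sedentary", "sedentary", "locomotion", "lifestyle", "lifestyle", "lifestyle"]

scores = f1_per_class(y_true, y_pred)
for label, score in scores.items():
    print(f"{label}: F1 = {score:.3f}")
```

Computing the score separately for each age group, as the study does, amounts to calling the same function on each group's subset of labels.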


2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Antonin Dauvin ◽  
Carolina Donado ◽  
Patrik Bachtiger ◽  
Ke-Chun Huang ◽  
Christopher Martin Sauer ◽  
...  

Patients admitted to the intensive care unit frequently have anemia and impaired renal function, but often lack historical blood results to contextualize the acuteness of these findings. Using data available within two hours of ICU admission, we developed machine learning models that accurately (AUC 0.86–0.89) classify an individual patient’s baseline hemoglobin and creatinine levels. Compared to assuming the baseline to be the same as the admission lab value, machine learning performed significantly better at classifying acute kidney injury regardless of initial creatinine value, and significantly better at predicting baseline hemoglobin value in patients with admission hemoglobin of <10 g/dl.
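
The AUC figures quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the labels and scores are invented toy values, not data from the study, and the quadratic loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking definition:
    the fraction of (positive, negative) pairs in which the positive
    scores higher, with ties counted as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels and model scores for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.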

