Leading indicators and maritime safety: predicting future risk with a machine learning approach

2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Lutz Kretschmann

Abstract The shipping industry has been quite successful in reducing the number of major accidents in the past. In order to continue this development in the future, innovative leading risk indicators can make a significant contribution. If designed properly, they enable a forward-looking identification and assessment of existing risks for ship and crew, which in turn allows the implementation of mitigating measures before adverse events occur. Right now, the opportunity for developing such leading risk indicators is positively influenced by the ongoing digital transformation in the maritime industry. With an increasing amount of data from ship operation becoming available, these data can be exploited in innovative risk management solutions. By combining the idea of leading risk indicators with data- and algorithm-based risk management methods, this paper firstly establishes a development framework for designing maritime risk models based on safety-related data collected onboard. Secondly, the development framework is applied in a proof of concept where an innovative machine learning-based approach is used to calculate a leading maritime risk indicator. Overall, findings confirm that a data- and algorithm-based approach can be used to determine a leading risk indicator per ship, even though the achieved model performance is not yet regarded as satisfactory and further research is planned.
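To make the idea concrete, the following is a minimal, hypothetical sketch of how such a per-ship leading risk indicator could be derived with a standard classifier: a model trained on safety-related onboard records predicts the probability of an adverse event in the next period, and the predicted probabilities are aggregated per ship. The file name, column names, and the choice of gradient boosting are assumptions for illustration, not the paper's actual model.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Assumed layout: one row per ship and reporting period, numeric safety-related
# features, plus a binary label flagging an adverse event in the next period.
records = pd.read_csv("onboard_safety_records.csv")
features = [c for c in records.columns
            if c not in ("ship_id", "adverse_event_next_period")]

model = GradientBoostingClassifier(random_state=0)
model.fit(records[features], records["adverse_event_next_period"])

# Leading risk indicator: predicted probability of an adverse event, per ship.
records["risk_score"] = model.predict_proba(records[features])[:, 1]
leading_indicator = records.groupby("ship_id")["risk_score"].mean()
print(leading_indicator.sort_values(ascending=False).head())
```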

2021 ◽  
Vol 2115 (1) ◽  
pp. 012042
Author(s):  
S Premanand ◽  
Sathiya Narayanan

Abstract The primary objective of this paper is to classify health-related data without the feature extraction step that can hinder performance and reliability in Machine Learning. The underlying question of our work is whether Tree-based Machine Learning algorithms can achieve better results for health-related data without extracting features, as is done in Deep Learning. This study performs classification of health-related medical data with a Tree-based Machine Learning approach. After pre-processing, and without feature extraction, i.e., working from the raw data signal, the Machine Learning algorithms achieve better results, even when compared to some advanced Deep Learning architecture models. The results demonstrate that the overall classification accuracy of the Tree-based Machine Learning algorithms Random Forest, XGBoost, LightGBM and CatBoost for the normal and abnormal conditions of the datasets was 97.88%, 98.23%, 98.03% and 95.57%, respectively.
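A hedged sketch of this kind of comparison is given below, using the four tree-based ensembles named in the abstract on raw (pre-processed but not feature-engineered) signal windows. The data here are synthetic placeholders; the real datasets and pre-processing are not reproduced.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 500))      # placeholder raw signal windows (no hand-crafted features)
y = rng.integers(0, 2, size=1000)     # placeholder normal/abnormal labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "XGBoost": XGBClassifier(n_estimators=300, eval_metric="logloss"),
    "LightGBM": LGBMClassifier(n_estimators=300),
    "CatBoost": CatBoostClassifier(iterations=300, verbose=0),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)             # raw sample values act directly as features
    acc = accuracy_score(y_te, model.predict(X_te))
    print(f"{name}: accuracy = {acc:.4f}")
```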


2017 ◽  
Author(s):  
Benjamin Goudey ◽  
Bowen J Fung ◽  
Christine Schieber ◽  
Noel G Faux ◽  
◽  
...  

Abstract It is increasingly recognized that Alzheimer's disease (AD) exists before dementia is present and that shifts in amyloid beta occur long before clinical symptoms can be detected. Early detection of these molecular changes is a key aspect for the success of interventions aimed at slowing down rates of cognitive decline. Recent evidence indicates that of the two established methods for measuring amyloid, a decrease in cerebrospinal fluid (CSF) amyloid β1-42 (Aβ1-42) may be an earlier indicator of Alzheimer's disease risk than measures of amyloid obtained from Positron Emission Tomography (PET). However, CSF collection is highly invasive and expensive. In contrast, blood collection is routinely performed, minimally invasive and cheap. In this work, we develop a blood-based signature that can provide a cheap and minimally invasive estimation of an individual's CSF amyloid status using a machine learning approach. We show that a Random Forest model derived from plasma analytes can accurately predict subjects as having abnormal (low) CSF Aβ1-42 levels indicative of AD risk (0.84 AUC, 0.78 sensitivity, and 0.73 specificity). Refinement of the modeling indicates that only APOEε4 carrier status and four analytes are required to achieve a high level of accuracy. Furthermore, we show across an independent validation cohort that individuals with predicted abnormal CSF Aβ1-42 levels transitioned to an AD diagnosis over 120 months significantly faster than those with predicted normal CSF Aβ1-42 levels, and that the resulting model also performs reasonably across PET Aβ1-42 status. This is the first study to show that a machine learning approach, using plasma protein levels, age and APOEε4 carrier status, is able to predict CSF Aβ1-42 status, the earliest risk indicator for AD, with high accuracy.
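As a rough illustration of the modeling and evaluation described above, the sketch below trains a Random Forest on plasma analyte levels, age and APOEε4 carrier status and reports AUC, sensitivity and specificity. The file name and column names are assumptions; the study's actual analytes and cohort data are not reproduced.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, confusion_matrix

# Assumed layout: one row per subject, plasma analyte columns prefixed "analyte_",
# age, APOEe4 carrier status (0/1), and a binary abnormal-CSF-Abeta label.
df = pd.read_csv("plasma_analytes.csv")
feature_cols = ["age", "apoe4_carrier"] + [c for c in df.columns if c.startswith("analyte_")]
X, y = df[feature_cols], df["abnormal_csf_abeta"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)

clf = RandomForestClassifier(n_estimators=500, random_state=42)
clf.fit(X_tr, y_tr)

prob = clf.predict_proba(X_te)[:, 1]
pred = (prob >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
print("AUC:", roc_auc_score(y_te, prob))
print("Sensitivity:", tp / (tp + fn))
print("Specificity:", tn / (tn + fp))
```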


2021 ◽  
Vol 50 (Supplement_1) ◽  
Author(s):  
Sitwat Ali

Abstract Background Administrative health datasets are widely used in public health research but often lack information about common confounders. We aimed to develop and validate machine learning (ML)-based models using medication data from Australia’s Pharmaceutical Benefits Scheme (PBS) database to predict obesity and smoking. Methods We used data from the D-Health Trial (N = 18,000) and the QSkin Study (N = 43,794). Smoking history, and height and weight were self-reported at study entry. Linkage to the PBS dataset captured 5 years of medication data after cohort entry. We used age, sex, and medication use, classified using Anatomical Therapeutic Classification codes, as potential predictors of smoking and obesity. We trained gradient-boosted machine learning models using data for the first 80% of participants enrolled; models were validated using the remaining 20%. We assessed model performance overall and by sex and age, and compared models generated using 3 and 5 years of PBS data. Results Based on the validation dataset using 3 years of PBS data, the area under the receiver operating characteristic curve (AUC) was 0.70 (95% confidence interval (CI) 0.68 – 0.71) for predicting obesity and 0.71 (95% CI 0.70 – 0.72) for predicting smoking. Models performed better in women than in men. Using 5 years of PBS data resulted in marginal improvement. Conclusions Medication data in combination with age and sex can be used to predict obesity and smoking. These models may be of value to researchers using data collected for administrative purposes.
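A minimal sketch of this setup, under assumed file and column names, is shown below: a gradient-boosted model predicts obesity from age, sex and ATC-coded medication-use indicators, trained on the first 80% of participants enrolled and validated on the remaining 20%. The same pattern would apply to the smoking outcome.

```python
import pandas as pd
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score

# Assumed layout: one row per participant in enrolment order, sex coded 0/1,
# 0/1 medication-use flags per ATC code, and a binary obesity label.
df = pd.read_csv("pbs_linked_cohort.csv")
atc_cols = [c for c in df.columns if c.startswith("atc_")]
features = ["age", "sex"] + atc_cols

split = int(0.8 * len(df))                     # first 80% enrolled -> training
train, valid = df.iloc[:split], df.iloc[split:]

model = HistGradientBoostingClassifier(random_state=0)
model.fit(train[features], train["obese"])

auc = roc_auc_score(valid["obese"], model.predict_proba(valid[features])[:, 1])
print(f"Validation AUC for obesity: {auc:.2f}")
```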


Forecasting ◽  
2021 ◽  
Vol 3 (3) ◽  
pp. 570-579
Author(s):  
Justin L. Wang ◽  
Hanqi Zhuang ◽  
Laurent Chérubin ◽  
Ali Muhamed Ali ◽  
Ali Ibrahim

A divide-and-conquer (DAC) machine learning approach was first proposed by Wang et al. to forecast the sea surface height (SSH) of the Loop Current System (LCS) in the Gulf of Mexico. In this DAC approach, the forecast domain was divided into non-overlapping partitions, each of which had its own prediction model. The full-domain SSH prediction was recovered by interpolating the SSH across partition boundaries. Although the original DAC model was able to predict the LCS evolution and eddy shedding more than two months and three months in advance, respectively, growing errors at the partition boundaries negatively affected the model's forecasting skill. In the study herein, a new partitioning method consisting of overlapping partitions is presented. The region of interest is divided into 50%-overlapping partitions. At each prediction step, the SSH value at each point is computed from the overlapping partitions, which significantly reduces the occurrence of unrealistic SSH features at partition boundaries. This new approach led to a significant improvement of the overall model performance, both in terms of feature prediction, such as the location of the LC eddy SSH contours, and in terms of event prediction, such as the LC ring separation. We observed an approximate 12% decrease in error over a 10-week prediction, and also show that this method can approximate the location and shedding of eddy Cameron better than the original DAC method.
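The core overlapping-partition idea can be sketched as follows: each 50%-overlapping tile produces its own SSH prediction, and the full-domain field is recovered by averaging every tile's prediction at each grid point, which smooths the partition boundaries. The per-tile predictor here is a placeholder standing in for the actual learned model, and the tile size is an arbitrary choice for illustration.

```python
import numpy as np

def blend_overlapping_tiles(predict_tile, field, tile=32):
    """Predict SSH tile-by-tile with 50% overlap and average the overlapping predictions."""
    step = tile // 2                                   # 50% overlap between adjacent tiles
    ny, nx = field.shape
    total = np.zeros_like(field, dtype=float)
    count = np.zeros_like(field, dtype=float)
    for i in range(0, ny - tile + 1, step):
        for j in range(0, nx - tile + 1, step):
            pred = predict_tile(field[i:i + tile, j:j + tile])  # per-partition model
            total[i:i + tile, j:j + tile] += pred
            count[i:i + tile, j:j + tile] += 1.0
    return total / np.maximum(count, 1.0)              # point-wise average over overlaps

# Example with an identity "model" on a synthetic SSH field:
ssh = np.random.default_rng(1).normal(size=(128, 128))
blended = blend_overlapping_tiles(lambda patch: patch, ssh)
```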


2019 ◽  
Vol 18 (3) ◽  
pp. 100-124
Author(s):  
J. Fahey-Gilmour ◽  
B. Dawson ◽  
P. Peeling ◽  
J. Heasman ◽  
B. Rogalski

Abstract In Australian football (AF), few studies have assessed combinations of pre-game factors and their relation to game outcomes (win/loss) in multivariable analyses. Further, previous research has mostly been confined to association-based linear approaches and post-game prediction, with limited assessment of predictive machine learning (ML) models in a pre-game setting. Therefore, our aim was to use ML techniques to predict game outcomes and produce a hierarchy of important (win/loss) variables. A total of 152 variables (79 absolute and 73 differentials) were used from the 2013–2018 Australian Football League (AFL) seasons. Various ML models were trained (cross-validation) on the 2013–2017 seasons, with the 2018 season used as an independent test set. Model performance varied (66.5–73.3% test set accuracy), although the best model (glmnet – 73.3%) rivalled bookmaker predictions in the same period (70.9%). The glmnet model revealed measures of team quality (a player-based rating and a team-based rating) in their relative form as the most important variables for prediction. Models that contained in-built feature selection or could model non-linear relationships generally performed better. These findings show that AFL game outcomes can be predicted using ML methods and provide a hierarchy of predictors that maximize the chance of winning.
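Since glmnet corresponds to elastic-net penalized regression, a minimal Python sketch of the same style of model (scikit-learn's elastic-net logistic regression rather than the R glmnet package) might look like the following; the feature file and column names are assumptions for illustration.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Assumed layout: one row per game with a season column, a binary win label,
# and numeric pre-game absolute/differential variables.
games = pd.read_csv("afl_pregame_features.csv")
train = games[games["season"] <= 2017]
test = games[games["season"] == 2018]          # independent test season
X_cols = [c for c in games.columns if c not in ("season", "win")]

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga",
                       l1_ratio=0.5, C=1.0, max_iter=5000),
)
model.fit(train[X_cols], train["win"])
print("Test-set accuracy:", accuracy_score(test["win"], model.predict(test[X_cols])))
```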


2012 ◽  
Vol 9 (3) ◽  
pp. 172-185
Author(s):  
Jacobus Young

The use of key risk indicators as a management tool is one of the requirements for the calculation of a bank's operational risk capital charge. This article provides insight into the use of key risk indicators as an operational risk management tool by South African banks and indicates their level of preparedness to comply with the criteria. The results of a questionnaire aimed at junior and middle management indicated that banks are not suitably prepared to implement a key risk indicator management process and show a general lack of understanding of the underlying theory and concepts behind the criteria for using key risk indicators. The advantages of using key risk indicators are not fully exploited and more benefits can be realised by raising awareness in this regard.


2021 ◽  
Vol 15 ◽  
Author(s):  
Matthew T. Prelich ◽  
Mona Matar ◽  
Suleyman A. Gokoglu ◽  
Christopher A. Gallo ◽  
Alexander Schepelmann ◽  
...  

This study presents a data-driven machine learning approach to predict individual Galactic Cosmic Radiation (GCR) ion exposure for 4He, 16O, 28Si, 48Ti, or 56Fe up to 150 mGy, based on Attentional Set-shifting (ATSET) experimental tests. The ATSET assay consists of a series of cognitive performance tasks on irradiated male Wistar rats. The GCR ion doses represent the expected cumulative radiation astronauts may receive during a Mars mission on an individual ion basis. The primary objective is to synthesize and assess predictive models on a per-subject level through Machine Learning (ML) classifiers. The raw cognitive performance data from individual rodent subjects are used as features to train the models and to explore the capabilities of three different ML techniques for elucidating a range of correlations between the radiation received by the rodents and their performance outcomes. The analysis employs scores of selected input features and different normalization approaches, which yield varying degrees of model performance. The current study shows that support vector machine, Gaussian naive Bayes, and random forest models are capable of predicting individual ion exposure using ATSET scores, where the corresponding Matthews correlation coefficients and F1 scores reflect model performance exceeding random chance. The study suggests a decremental effect on cognitive performance in rodents due to ≤150 mGy of single ion exposure, inasmuch as the models can discriminate between 0 mGy and any exposure level in the performance score feature space. A number of observations about the utility and limitations of specific normalization routines and evaluation scores are examined, along with best practices for ML with imbalanced datasets.
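The evaluation pattern described above can be sketched as follows: the three named classifiers predict exposure from per-subject performance scores, and results are reported with Matthews correlation coefficient and F1, both of which remain informative on imbalanced data. The feature matrix here is synthetic; the real ATSET scores, normalization routines and dose levels are not reproduced.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import matthews_corrcoef, f1_score

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 12))        # placeholder per-subject ATSET score features
y = rng.integers(0, 2, size=200)      # 0 = sham (0 mGy), 1 = irradiated (placeholder labels)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=7)

models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Gaussian NB": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=300, random_state=7),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    print(f"{name}: MCC={matthews_corrcoef(y_te, pred):.2f}, F1={f1_score(y_te, pred):.2f}")
```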

