Application of machine learning-based models to boost the predictive power of the SPAN index

Author(s):  
Chen-Chih Chung ◽  
Oluwaseun Adebayo Bamodu ◽  
Chien-Tai Hong ◽  
Lung Chan ◽  
Hung-Wen Chiu
2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Meisam Ghasedi ◽  
Maryam Sarfjoo ◽  
Iraj Bargegol

Abstract The purpose of this study is to investigate and determine the factors affecting vehicle and pedestrian accidents taking place in the busiest suburban highway of Guilan Province located in the north of Iran and provide the most accurate prediction model. Therefore, the effective principal variables and the probability of occurrence of each category of crashes are analyzed and computed utilizing the factor analysis, logit, and Machine Learning approaches simultaneously. This method not only could contribute to achieving the most comprehensive and efficient model to specify the major contributing factor, but also it can provide officials with suggestions to take effective measures with higher precision to lessen accident impacts and improve road safety. Both the factor analysis and logit model show the significant roles of exceeding lawful speed, rainy weather and driver age (30–50) variables in the severity of vehicle accidents. On the other hand, the rainy weather and lighting condition variables as the most contributing factors in pedestrian accidents severity, underline the dominant role of environmental factors in the severity of all vehicle-pedestrian accidents. Moreover, considering both utilized methods, the machine-learning model has higher predictive power in all cases, especially in pedestrian accidents, with 41.6% increase in the predictive power of fatal accidents and 12.4% in whole accidents. Thus, the Artificial Neural Network model is chosen as the superior approach in predicting the number and severity of crashes. Besides, the good performance and validation of the machine learning is proved through performance and sensitivity analysis.


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Hudson Fernandes Golino ◽  
Liliany Souza de Brito Amaral ◽  
Stenio Fernando Pimentel Duarte ◽  
Cristiano Mauro Assis Gomes ◽  
Telma de Jesus Soares ◽  
...  

The present study investigates the prediction of increased blood pressure by body mass index (BMI), waist (WC) and hip circumference (HC), and waist-hip ratio (WHR) using a machine learning technique named classification tree. Data were collected from 400 college students (56.3% women) from 16 to 63 years old. Fifteen trees were calculated in the training group for each sex, using different numbers and combinations of predictors. The result shows that for women BMI, WC, and WHR are the combination that produces the best prediction, since it has the lowest deviance (87.42), the lowest misclassification (.19), and the highest pseudo-R² (.43). This model presented a sensitivity of 80.86% and specificity of 81.22% in the training set and, respectively, 45.65% and 65.15% in the test sample. For men BMI, WC, HC, and WHR showed the best prediction with the lowest deviance (57.25), the lowest misclassification (.16), and the highest pseudo-R² (.46). This model had a sensitivity of 72% and specificity of 86.25% in the training set and, respectively, 58.38% and 69.70% in the test set. Finally, the result from the classification tree analysis was compared with traditional logistic regression, indicating that the former outperformed the latter in terms of predictive power.
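
The abstract above reports sensitivity and specificity for the training and test sets. As a minimal sketch of how these two figures are derived from a confusion matrix, the Python snippet below computes them for binary labels. The function name and the toy labels are hypothetical and invented for illustration; they are not the study's code or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: share of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of actual negatives that were predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels, invented for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Computing both figures separately on training and test data, as the abstract does, is what exposes the drop in performance on unseen cases.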


2017 ◽  
Vol 79 (02) ◽  
pp. 123-130 ◽  
Author(s):  
Whitney Muhlestein ◽  
Dallin Akagi ◽  
Justiss Kallos ◽  
Peter Morone ◽  
Kyle Weaver ◽  
...  

Objective Machine learning (ML) algorithms are powerful tools for predicting patient outcomes. This study pilots a novel approach to algorithm selection and model creation using prediction of discharge disposition following meningioma resection as a proof of concept. Materials and Methods A diversity of ML algorithms were trained on a single-institution database of meningioma patients to predict discharge disposition. Algorithms were ranked by predictive power and top performers were combined to create an ensemble model. The final ensemble was internally validated on never-before-seen data to demonstrate generalizability. The predictive power of the ensemble was compared with a logistic regression. Further analyses were performed to identify how important variables impact the ensemble. Results Our ensemble model predicted disposition significantly better than a logistic regression (area under the curve of 0.78 and 0.71, respectively, p = 0.01). Tumor size, presentation at the emergency department, body mass index, convexity location, and preoperative motor deficit most strongly influence the model, though the independent impact of individual variables is nuanced. Conclusion Using a novel ML technique, we built a guided ML ensemble model that predicts discharge destination following meningioma resection with greater predictive power than a logistic regression, and that provides greater clinical insight than a univariate analysis. These techniques can be extended to predict many other patient outcomes of interest.
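
The abstract above compares models by area under the ROC curve (AUC). As a rough sketch of what that metric measures, the Python snippet below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name, the two score lists, and the labels are hypothetical toy values, not the study's models or data, and the sketch does not reproduce the significance test reported in the abstract.

```python
def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # hypothetical "model A"
scores_b = [0.9, 0.3, 0.4, 0.5, 0.6, 0.2]  # hypothetical "model B"

auc_a = auc(y_true, scores_a)
auc_b = auc(y_true, scores_b)
print(f"AUC A={auc_a:.3f}, AUC B={auc_b:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.78 and 0.71 should be read.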


2020 ◽  
Author(s):  
Emanuele Colonnelli ◽  
Jorge Gallego ◽  
Mounu Prem

The ability to predict corruption is crucial to policy. Using rich micro-data from Brazil, we show that multiple machine learning models display high levels of performance in predicting municipality-level corruption in public spending. We then quantify which individual municipality features and groups of similar characteristics have the highest predictive power. We find that measures of private sector activity, financial development, and human capital are the strongest predictors of corruption, while public sector and political features play a secondary role. Our findings have implications for the design and cost-effectiveness of various anti-corruption policies.


Risks ◽  
2019 ◽  
Vol 7 (3) ◽  
pp. 95 ◽  
Author(s):  
Jacky H. L. Poon

In actuarial modelling of risk pricing and loss reserving in general insurance, also known as P&C or non-life insurance, there is business value in the predictive power and automation through machine learning. However, interpretability can be critical, especially in explaining to key stakeholders and regulators. We present a granular machine learning model framework to jointly predict loss development and segment risk pricing. Generalising the Payments per Claim Incurred (PPCI) loss reserving method with risk variables and residual neural networks, this combines interpretable linear and sophisticated neural network components so that the ‘unexplainable’ component can be identified and regularised with a separate penalty. The model is tested for a real-life insurance dataset, and generally outperformed PPCI on predicting ultimate loss for sufficient sample size.


Author(s):  
David Easley ◽  
Marcos López de Prado ◽  
Maureen O’Hara ◽  
Zhibai Zhang

Abstract Understanding modern market microstructure phenomena requires large amounts of data and advanced mathematical tools. We demonstrate how machine learning can be applied to microstructural research. We find that microstructure measures continue to provide insights into the price process in current complex markets. Some microstructure features with high explanatory power exhibit low predictive power, while others with less explanatory power have more predictive power. We find that some microstructure-based measures are useful for out-of-sample prediction of various market statistics, leading to questions about market efficiency. We also show how microstructure measures can have important cross-asset effects. Our results are derived using 87 liquid futures contracts across all asset classes.


PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0258535
Author(s):  
Orion Weller ◽  
Luke Sagers ◽  
Carl Hanson ◽  
Michael Barnes ◽  
Quinn Snell ◽  
...  

Introduction Addressing the problem of suicidal thoughts and behavior (STB) in adolescents requires understanding the associated risk factors. While previous research has identified individual risk and protective factors associated with many adolescent social morbidities, modern machine learning approaches can help identify risk and protective factors that interact (group) to provide predictive power for STB. This study aims to develop a prediction algorithm for STB among adolescents using the risk and protective factor framework and social determinants of health. Methods The sample population consisted of more than 179,000 high school students living in Utah and participating in the Communities That Care (CTC) Youth Survey from 2011-2017. The dataset includes responses to 300+ questions from the CTC and 8000+ demographic factors from the American Census Survey for a total of 1.2 billion values. Machine learning techniques were employed to extract the survey questions that were best able to predict answers indicative of STB, using recent work in interpretable machine learning. Results Analysis showed strong predictive power, with the ability to predict individuals with STB with 91% accuracy. After extracting the top ten questions that most affected model predictions, questions fell into four main categories: familial life, drug consumption, demographics, and peer acceptance at school. Conclusions Modern machine learning approaches provide new methods for understanding the interaction between root causes and outcomes, such as STB. The model developed in this study showed significant improvement in predictive accuracy compared to previous research. Results indicate that certain risk and protective factors, such as adolescents being threatened or harassed through digital media or bullied at school, and exposure or involvement in serious arguments and yelling at home are the leading predictors of STB and can help narrow and reaffirm priority prevention programming and areas of focused policymaking.


2018 ◽  
Author(s):  
Adam Hakim ◽  
Shira Klorfeld ◽  
Tal Sela ◽  
Doron Friedman ◽  
Maytal Shabat-Simon ◽  
...  

Abstract A basic aim of marketing research is to predict consumers’ preferences and the success of marketing campaigns in the general population. However, traditional behavioral measurements have various limitations, calling for novel measurements to improve predictive power. In this study, we use neural signals measured with electroencephalography (EEG) in order to overcome these limitations. We record the EEG signals of subjects as they watch commercials for six food products. We introduce a novel approach in which, instead of using one type of EEG measure, we combine several measures and use state-of-the-art machine learning algorithms to predict subjects’ individual future preferences over the products and the commercials’ population success, as measured by their YouTube metrics. As a benchmark, we acquired measurements of the commercials’ effectiveness using a standard questionnaire commonly used in marketing research. We reached 68.5% accuracy in predicting between the most and least preferred items and a lower than chance RMSE score for predicting the rank order preferences of all six products. We also predicted the commercials’ population success better than chance. Most importantly, we demonstrate for the first time that, for all of our predictions, the EEG measurements increased the prediction power of the questionnaires. Our analysis methods and results show great promise for utilizing EEG measures by managers, marketing practitioners, and researchers, as a valuable tool for predicting subjects’ preferences and marketing campaigns’ success.


PLoS ONE ◽  
2021 ◽  
Vol 16 (6) ◽  
pp. e0252392
Author(s):  
Jiaojiao Ji ◽  
Naipeng Chao ◽  
Shitong Wei ◽  
George A. Barnett

The considerable amount of misinformation on social media regarding genetically modified (GM) food will not only hinder public understanding but also mislead the public to make unreasoned decisions. This study discovered a new mechanism of misinformation diffusion in the case of GM food and applied a framework of supervised machine learning to identify effective credibility indicators for the misinformation prediction of GM food. Main indicators are proposed, including user identities involved in spreading information, linguistic styles, and propagation dynamics. Results show that linguistic styles, including sentiment and topics, have the dominant predictive power. In addition, among the user identities, engagement and extroversion are effective predictors, while reputation has almost no predictive power in this study. Finally, we provide strategies that readers should be aware of when assessing the credibility of online posts and suggest improvements that Weibo can use to avoid rumormongering and enhance the science communication of GM food.

