Automatic analysis of insurance reports through deep neural networks to identify severe claims

2021 ◽  
pp. 1-26
Author(s):  
Isaac Cohen Sabban ◽  
Olivier Lopez ◽  
Yann Mercuzot

Abstract In this paper, we develop a methodology to automatically classify claims using the information contained in text reports (written at the opening of the claim). From this automatic analysis, the aim is to predict whether a claim is expected to be particularly severe or not. The difficulty lies in the rarity of such extreme claims in the database, and hence the difficulty for classical prediction techniques, such as logistic regression, to accurately predict the outcome. Since the data are unbalanced (too few observations are associated with a positive label), we propose different rebalancing algorithms to deal with this issue. We discuss the use of different embedding methodologies for processing text data, and the role of the architectures of the networks.
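The abstract does not detail which rebalancing algorithms the authors propose; as a generic illustration, plain random oversampling of the minority (severe-claim) class is one of the simplest such schemes:

```python
import random

def oversample_minority(samples, labels, seed=0):
    """Plain random oversampling: duplicate minority-class examples
    (drawn with replacement) until both classes have equal counts."""
    rng = random.Random(seed)
    pos = [(s, 1) for s, y in zip(samples, labels) if y == 1]
    neg = [(s, 0) for s, y in zip(samples, labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    xs, ys = zip(*balanced)
    return list(xs), list(ys)
```

More refined alternatives (undersampling the majority class, or synthetic oversampling such as SMOTE) follow the same interface: they take an unbalanced set and return one where a positive label is no longer rare.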

2019 ◽  
Vol ISASE2019 (0) ◽  
pp. 1-5 ◽  
Author(s):  
Shoichi NISHIO ◽  
Belayat HOSSAIN ◽  
Manabu NII ◽  
Takafumi HIRANAKA ◽  
Syoji KOBASHI

2021 ◽  
pp. 27-38
Author(s):  
Rafaela Carvalho ◽  
João Pedrosa ◽  
Tudor Nedelcu

Abstract Skin cancer is one of the most common types of cancer and, with its increasing incidence, accurate early diagnosis is crucial to improving the prognosis of patients. During visual inspection, dermatologists follow specific dermoscopic algorithms and identify important features to provide a diagnosis. This process can be automated, as such characteristics can be extracted by computer vision techniques. Although deep neural networks can extract useful features from digital images for skin lesion classification, performance can be improved by providing additional information. The extracted pseudo-features can be used as input (multimodal) or output (multi-tasking) to train a robust deep learning model. This work investigates multimodal and multi-tasking techniques for more efficient training, given the simultaneous optimization of several related tasks in the latter, and for the generation of better diagnosis predictions. Additionally, the role of lesion segmentation is studied. Results show that multi-tasking improves the learning of beneficial features, which leads to better predictions, and that pseudo-features inspired by the ABCD rule provide readily available, helpful information about the skin lesion.
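The abstract does not specify how the ABCD-inspired pseudo-features are computed; a toy sketch, assuming only a binary segmentation mask of the lesion is available, might derive asymmetry, diameter, and area along these lines:

```python
import numpy as np

def abcd_pseudo_features(mask):
    """Toy pseudo-features inspired by the ABCD rule, computed from a
    binary lesion segmentation mask (H x W array of 0/1).

    Returns (asymmetry, diameter_px, area_px). Asymmetry here is the
    fraction of lesion pixels that do NOT overlap when the cropped mask
    is flipped horizontally; 0 means perfectly symmetric."""
    mask = np.asarray(mask, dtype=bool)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return 0.0, 0, 0
    # Crop to the lesion's bounding box so the flip axis passes through it.
    crop = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    flipped = crop[:, ::-1]
    area = int(crop.sum())
    asymmetry = float(np.logical_xor(crop, flipped).sum()) / (2 * area)
    diameter = max(crop.shape)  # crude diameter: longest bounding-box side
    return asymmetry, diameter, area
```

In a multimodal setup such features are concatenated to the image embedding as extra inputs; in a multi-tasking setup they become auxiliary regression targets alongside the diagnosis label.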


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0239007
Author(s):  
Aixia Guo ◽  
Sakima Smith ◽  
Yosef M. Khan ◽  
James R. Langabeer II ◽  
Randi E. Foraker

Background Cardiac dysrhythmias (CD) affect millions of Americans in the United States (US) and are associated with considerable morbidity and mortality. New strategies to combat this growing problem are urgently needed. Objectives Predicting CD using electronic health record (EHR) data would allow for earlier diagnosis and treatment of the condition, thus improving overall cardiovascular outcomes. The Guideline Advantage (TGA) is an American Heart Association ambulatory quality clinical data registry of EHR data representing 70 clinics distributed throughout the US; it has been used to monitor outpatient prevention and disease-management outcome measures across populations and for longitudinal research on the impact of preventive care. Methods For this study, we represented all time-series cardiovascular health (CVH) measures and the corresponding data-collection time points for each patient as numerical embedding vectors. We then employed a deep learning technique, a long short-term memory (LSTM) model, to predict CD from the vector of time-series CVH measures using 5-fold cross-validation, and compared the performance of this model to deep neural network, logistic regression, random forest, and Naïve Bayes models. Results We demonstrated that the LSTM model outperformed the other machine learning models and achieved the best prediction performance as measured by the average area under the receiver operating characteristic curve (AUROC): 0.76 for LSTM, 0.71 for deep neural networks, 0.66 for logistic regression, 0.67 for random forest, and 0.59 for Naïve Bayes. The most influential feature in the LSTM model was blood pressure. Conclusions These findings may be used to prevent CD in the outpatient setting by encouraging appropriate surveillance and management of CVH.
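All models above are compared by average AUROC. As a quick reference, AUROC for a binary classifier can be computed directly from its rank (Mann-Whitney) formulation, without building an explicit ROC curve; a minimal sketch:

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation:
    the probability that a randomly chosen positive example is scored
    above a randomly chosen negative one, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking, 1.0 to perfect separation, which puts the reported 0.59 (Naïve Bayes) to 0.76 (LSTM) range in context.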


2017 ◽  
Author(s):  
Eelke B. Lenselink ◽  
Niels ten Dijke ◽  
Brandon Bongers ◽  
George Papadatos ◽  
Herman W.T. van Vlijmen ◽  
...  

Abstract The increase in publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest-neighbor similarity-based methods, and Quantitative Structure-Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies and different metrics. In this study, different methods were compared using a single standardized dataset obtained from ChEMBL, which is made available to the public, using standardized metrics (BEDROC and the Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random-split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective execution. Deep Neural Networks were the top-performing classifiers, highlighting their added value over more conventional methods. Moreover, the best method ('DNN_PCM') performed significantly better, at almost one standard deviation above the mean performance. Furthermore, multi-task and PCM implementations were shown to improve performance over single-task Deep Neural Networks. Conversely, target prediction performed almost two standard deviations below the mean performance, while Random Forests, Support Vector Machines, and Logistic Regression performed around the mean.
Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with the unoptimized DNN_PCM). Here, a standardized set to test and evaluate different machine learning algorithms in the context of multi-task learning is offered by providing both the data and the protocols.
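One of the two standardized metrics named above, the Matthews Correlation Coefficient, has a closed form over the binary confusion matrix; a minimal reference implementation:

```python
import math

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels in {0, 1}.
    Ranges from -1 (total disagreement) to +1 (perfect prediction);
    0 corresponds to random guessing. Returns 0.0 when any marginal
    of the confusion matrix is empty (undefined denominator)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike plain accuracy, MCC stays informative on unbalanced bioactivity data, which is presumably why it was chosen as a standardized metric alongside the early-recognition metric BEDROC.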


2018 ◽  
Vol 2018 (15) ◽  
pp. 851-855 ◽  
Author(s):  
Marcello Mastroleo ◽  
Roberto Ugolotti ◽  
Luca Mussi ◽  
Emilio Vicari ◽  
Federico Sassi ◽  
...  

Entropy ◽  
2020 ◽  
Vol 22 (1) ◽  
pp. 101
Author(s):  
Rita Fioresi ◽  
Pratik Chaudhari ◽  
Stefano Soatto

This paper is a step towards developing a geometric understanding of a popular algorithm for training deep neural networks, stochastic gradient descent (SGD). We build upon a recent result which observed that the noise in SGD while training typical networks is highly non-isotropic. This motivated a deterministic model in which the trajectories of our dynamical systems are described via geodesics of a family of metrics arising from a certain diffusion matrix, namely the covariance of the stochastic gradients in SGD. Our model is analogous to models in general relativity: the role played by the electromagnetic field in the latter is played in the former by the gradient of the loss function of a deep network.
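The diffusion matrix named above is the covariance of the per-sample stochastic gradients. A small numpy sketch, using a hypothetical least-squares toy loss (all names here are illustrative, not from the paper), estimates it empirically:

```python
import numpy as np

def gradient_covariance(per_sample_grads):
    """Empirical covariance of per-sample gradients: an estimate of the
    diffusion matrix D that governs SGD noise (up to learning-rate and
    batch-size factors). per_sample_grads: (n_samples, n_params)."""
    g = np.asarray(per_sample_grads, dtype=float)
    return np.cov(g, rowvar=False, bias=True)

# Toy least-squares loss 0.5*(w.x - y)^2: per-sample gradient is (w.x - y)*x.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w = np.zeros(3)                       # current iterate
grads = (X @ w - y)[:, None] * X      # one gradient row per sample
D = gradient_covariance(grads)
# D is symmetric positive semi-definite but, in general, not a multiple
# of the identity: the SGD noise is non-isotropic, as the paper observes.
```

In the paper's geometric picture, it is this matrix D (varying with the iterate w) that induces the family of metrics whose geodesics describe the training trajectories.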


2005 ◽  
Vol 55 (4) ◽  
pp. 403-426 ◽  
Author(s):  
Miklós Virág ◽  
Tamás Kristóf

The article attempts to answer the question of whether the latest bankruptcy prediction techniques are more reliable than traditional mathematical-statistical ones in Hungary. Simulation experiments carried out on the database of the first Hungarian bankruptcy prediction model clearly show that bankruptcy models built using artificial neural networks have higher classification accuracy than models created in the 1990s based on discriminant analysis and logistic regression analysis. The article presents the main results, analyses the reasons for the differences, and offers constructive proposals for the further development of Hungarian bankruptcy prediction.

