Feature Importance Analysis for Customer Management of Insurance Products

Author(s):  
Misbah Sohail ◽  
Pedro Peres ◽  
Yuhua Li
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chowdhury Rafeed Rahman ◽  
Ruhul Amin ◽  
Swakkhar Shatabda ◽  
Md. Sadrul Islam Toaha

Abstract: DNA N6-methylation (6mA) of the adenine nucleotide is a post-replication modification responsible for many biological functions. Automated and accurate computational methods can help identify 6mA sites in long genomes, saving significant time and money. Our study develops a convolutional neural network (CNN) based tool, i6mA-CNN, capable of identifying 6mA sites in the rice genome. Our model combines multiple types of features, such as a customized feature vector inspired by PseAAC (Pseudo Amino Acid Composition), multiple one-hot representations, and dinucleotide physicochemical properties. It achieves an auROC (area under the Receiver Operating Characteristic curve) score of 0.98 with an overall accuracy of 93.97% using fivefold cross-validation on the benchmark dataset. Finally, we evaluate our model on three other plant genome 6mA site identification test datasets. Results suggest that our proposed tool generalizes its 6mA site identification ability across plant genomes irrespective of plant species. An algorithm for potential motif extraction and a feature importance analysis procedure are two by-products of this research. The web tool for this research can be found at: https://cutt.ly/dgp3QTR.
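As a rough illustration of the "one-hot representation" mentioned in the abstract above, the following minimal sketch encodes a DNA sequence as one row per nucleotide. The helper name `one_hot_encode`, the A/C/G/T column order, and the all-zero row for unknown bases are assumptions made for this example; the abstract does not describe the exact encoding used by i6mA-CNN.

```python
# Hypothetical helper (not from the paper): one-hot encode a DNA sequence.
# Column order A, C, G, T is an assumption chosen for this sketch.
NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence):
    """Return one 4-element row per base; unknown bases (e.g. 'N') become all zeros."""
    rows = []
    for base in sequence.upper():
        rows.append([1 if base == ref else 0 for ref in NUCLEOTIDES])
    return rows


# Toy input, for illustration only.
encoded = one_hot_encode("GATN")
for base, row in zip("GATN", encoded):
    print(base, row)
```

A matrix like this (sequence length by 4) is the kind of input a 1D convolutional layer can consume directly.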


2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 1620.1-1621
Author(s):  
J. Lee ◽  
H. Kim ◽  
S. Y. Kang ◽  
S. Lee ◽  
Y. H. Eun ◽  
...  

Background: Tumor necrosis factor (TNF) inhibitors are important drugs for treating patients with ankylosing spondylitis (AS). However, they are not used as a first-line treatment for AS. More than 40% of patients respond insufficiently to the first-line treatment, non-steroidal anti-inflammatory drugs (NSAIDs). If we can predict at an earlier phase who will need TNF inhibitors, adequate treatment can be provided at an appropriate time and potential damage can be avoided. There is no precise predictive model at present. Recently, various machine learning methods have shown strong performance in predictions using clinical data. Objectives: We aim to generate an artificial neural network (ANN) model to predict early TNF inhibitor users in patients with ankylosing spondylitis. Methods: The baseline demographic and laboratory data of patients who visited the Samsung Medical Center rheumatology clinic from Dec. 2003 to Sep. 2018 were analyzed. Patients were divided into two groups: early TNF inhibitor users, treated with TNF inhibitors within six months of their follow-up (early-TNF users), and the others (non-early-TNF users). Machine learning models were formulated to predict the early-TNF users using the baseline data. Additionally, feature importance analysis was performed to delineate significant baseline characteristics. Results: The numbers of early-TNF and non-early-TNF users were 90 and 509, respectively. The best-performing ANN model used 3 hidden layers with 50 hidden nodes each; its performance (area under curve (AUC) = 0.75) was superior to that of the logistic regression, support vector machine, and random forest models (AUC = 0.72, 0.65, and 0.71, respectively) in predicting early-TNF users. Feature importance analysis revealed erythrocyte sedimentation rate (ESR), C-reactive protein (CRP), and height as the top significant baseline characteristics for predicting early-TNF users. Among these characteristics, height was revealed by machine learning models but not by conventional statistical techniques. Conclusion: Our model displayed superior performance in predicting early-TNF users compared with logistic regression and other machine learning models. Machine learning can be a vital tool in predicting treatment response in various rheumatologic diseases. Disclosure of Interests: None declared
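The abstract above does not say which feature importance method was used. One common model-agnostic option is permutation importance: shuffle one feature at a time and measure how much a performance metric drops. The sketch below is a minimal illustration under that assumption; `toy_model`, the synthetic data, and the use of accuracy as the metric are all hypothetical stand-ins, not the study's model or data.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction equals the label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(labels)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the dataset with only column j permuted.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in model: uses feature 0 and ignores feature 1.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0


# Synthetic data, for illustration only.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [toy_model(row) for row in rows]

importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Because the toy model ignores feature 1, shuffling that column leaves accuracy unchanged, so its importance is zero, while shuffling feature 0 produces a clear drop.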


2021 ◽  
Author(s):  
Hon-Yi Shi ◽  
King-The Lee ◽  
Chong-Chi Chiu ◽  
Jhi-Joung Wang ◽  
Ding-Ping Sun ◽  
...  

Abstract Background: The risk of hepatocellular carcinoma (HCC) recurrence after surgical resection is unknown. Therefore, the aim of this study was to predict 5-year recurrence after HCC resection using deep learning and Cox regression models. Methods: This study recruited 520 HCC patients who had undergone surgical resection at three medical centers in southern Taiwan between April 2011 and December 2015. Two popular deep learning algorithms, a deep neural network (DNN) model and a recurrent neural network (RNN) model, and a Cox proportional hazard (CPH) regression model were designed to solve both classification and regression problems in predicting HCC recurrence. A feature importance analysis was also performed to identify confounding factors in the prediction of HCC recurrence in patients who had undergone resection. Results: All performance indices for the DNN model were significantly higher than those for the RNN model and the traditional CPH model (p<0.001). The most important confounding factor in 5-year recurrence after HCC resection was surgeon volume, followed by, in order of importance, hospital volume, preoperative Beck Depression Scale score, preoperative Beck Anxiety Scale score, co-residence with family, tumor stage, and tumor size. Conclusions: The DNN model is useful for early baseline prediction of 5-year recurrence after HCC resection. Its prediction accuracy can be improved by further training with temporal data collected from treated patients. The feature importance analysis performed in this study to investigate model interpretability provided important insights into the potential use of deep learning models for predicting recurrence after HCC resection and for identifying predictors of recurrence.


Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1784
Author(s):  
Shih-Chieh Chang ◽  
Chan-Lin Chu ◽  
Chih-Kuang Chen ◽  
Hsiang-Ning Chang ◽  
Alice M. K. Wong ◽  
...  

Prediction of post-stroke functional outcomes is crucial for allocating medical resources. In this study, a total of 577 patients were enrolled in the Post-Acute Care-Cerebrovascular Disease (PAC-CVD) program, and 77 predictors were collected at admission. The outcome was whether a patient could achieve a Barthel Index (BI) score of >60 upon discharge. Eight machine-learning (ML) methods were applied, and their results were integrated by a stacking method. The area under the curve (AUC) of the eight ML models ranged from 0.83 to 0.887, with random forest, stacking, logistic regression, and support vector machine demonstrating superior performance. The feature importance analysis indicated that the initial Berg Balance Test (BBS-I), initial BI (BI-I), and initial Concise Chinese Aphasia Test (CCAT-I) were the top three predictors of BI scores at discharge. The partial dependence plot (PDP) and individual conditional expectation (ICE) plot indicated that the predictors’ ability to predict outcomes was most pronounced within a specific value range (e.g., BBS-I < 40 and BI-I < 60). BI at discharge could be predicted from information collected at admission with the aid of various ML models, and the PDP and ICE plots indicated that the predictors were informative mainly within a certain value range.
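To make the PDP and ICE ideas in the abstract above concrete, here is a minimal sketch of how both are computed: an ICE curve re-predicts one sample while sweeping a single feature over a grid, and the PDP is the pointwise average of the ICE curves. The function names, `toy_model`, its saturation threshold of 40, and the sample rows are hypothetical choices for this example and are not taken from the study.

```python
def ice_curves(model, rows, feature_index, grid):
    """One curve per row: model output as the chosen feature sweeps over the grid."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature_index] = value  # overwrite only the feature of interest
            curve.append(model(modified))
        curves.append(curve)
    return curves


def partial_dependence(curves):
    """PDP: pointwise average of the ICE curves."""
    n = len(curves)
    return [sum(curve[k] for curve in curves) / n for k in range(len(curves[0]))]


# Hypothetical toy model: the score rises with feature 0 up to 40, then saturates.
def toy_model(row):
    return min(row[0], 40) / 40 + 0.1 * row[1]


# Made-up samples and grid, for illustration only.
rows = [[10, 0], [30, 1], [50, 2]]
grid = [0, 20, 40, 60]

curves = ice_curves(toy_model, rows, 0, grid)
pdp = partial_dependence(curves)
print(pdp)
```

In this toy setup the PDP is flat beyond 40, which is the kind of pattern that would suggest a predictor is informative only within a limited value range.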


2021 ◽  
pp. 81-92
Author(s):  
Breno Lívio Silva de Almeida ◽  
Alvaro Pedroso Queiroz ◽  
Anderson Paulo Avila Santos ◽  
Robson Parmezan Bonidia ◽  
Ulisses Nunes da Rocha ◽  
...  

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Yoichi Kurumida ◽  
Yutaka Saito ◽  
Tomoshi Kameda

Abstract: Antibodies are proteins working in our immune system with high affinity and specificity for target antigens, making them excellent tools for both biotherapeutic and bioengineering applications. The prediction of antibody affinity changes upon mutations ($\Delta\Delta G_{\mathrm{binding}}$) is important for antibody engineering. Numerous computational methods have been proposed based on different approaches, including molecular mechanics and machine learning. However, the accuracy of each individual predictor is not sufficient for efficient antibody development. In this study, we develop a new prediction method by combining multiple predictors based on machine learning. Our method was tested on the SiPMAB database, evaluating the Pearson’s correlation coefficient between predicted and experimental $\Delta\Delta G_{\mathrm{binding}}$. Our method achieved higher accuracy (R = 0.69) than previous molecular mechanics or machine-learning based methods (R = 0.59) and the previous method using the average of multiple predictors (R = 0.64). Feature importance analysis indicated that the improved accuracy was obtained by combining predictors with different importance, which have different protocols for calculating energies and for generating mutant and unbound state structures. This study demonstrates that machine learning is a powerful framework for combining different approaches to predict antibody affinity changes.
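The evaluation metric in the abstract above, Pearson's correlation coefficient, and the simple-average baseline it mentions can be sketched as follows. This is not the authors' machine-learning combiner, which the abstract does not specify; the function names and all numbers are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_predictors(predictions):
    """Unweighted mean across predictors, computed per mutation."""
    return [sum(values) / len(values) for values in zip(*predictions)]


# Made-up values (not from the paper), one entry per hypothetical mutation.
experimental = [0.5, 1.2, -0.3, 2.0]
predictor_a = [0.7, 0.9, 0.1, 1.5]
predictor_b = [0.1, 1.6, -0.6, 2.2]

combined = average_predictors([predictor_a, predictor_b])
r_combined = pearson_r(combined, experimental)
print(combined, r_combined)
```

A learned combiner would replace `average_predictors` with a model that assigns each predictor its own weight, which is where a feature importance analysis over the predictors becomes meaningful.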


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Seulkee Lee ◽  
Yeonghee Eun ◽  
Hyungjin Kim ◽  
Hoon-Suk Cha ◽  
Eun-Mi Koh ◽  
...  

Abstract: We aim to generate an artificial neural network (ANN) model to predict early TNF inhibitor users in patients with ankylosing spondylitis. The baseline demographic and laboratory data of patients who visited the Samsung Medical Center rheumatology clinic from Dec. 2003 to Sep. 2018 were analyzed. Patients were divided into two groups: early-TNF and non-early-TNF users. Machine learning models were formulated to predict the early-TNF users using the baseline data. Feature importance analysis was performed to delineate significant baseline characteristics. The numbers of early-TNF and non-early-TNF users were 90 and 505, respectively. The performance of the ANN model, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.783, was superior to that of the logistic regression, support vector machine, random forest, and XGBoost models (AUC = 0.719, 0.699, 0.761, and 0.713, respectively) in predicting early-TNF users. Feature importance analysis revealed CRP and ESR as the top significant baseline characteristics for predicting early-TNF users. Our model displayed superior performance in predicting early-TNF users compared with logistic regression and other machine learning models. Machine learning can be a vital tool in predicting treatment response in various rheumatologic diseases.
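The models above are compared by AUC. As a minimal sketch of what that number measures, the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The function name `roc_auc` and the labels and scores below are hypothetical; they are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = positive class) and model scores, for illustration only.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(auc)
```

This pairwise form is quadratic in the number of samples, which is fine for small cohorts; rank-based formulas give the same value more efficiently.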


2020 ◽  
Vol 8 (10) ◽  
pp. 1591
Author(s):  
Nadia Bykova ◽  
Nikita Litovka ◽  
Anna Popenko ◽  
Sergey Musienko

(1) Background: microbiome host classification can be used to identify sources of contamination in environmental data. However, there is no ready-to-use host classifier. Here, we aimed to build a model that would be able to discriminate between pet and human microbiome samples. The challenge of the study was to build a classifier using data solely from publicly available studies that normally contain sequencing data for only one type of host. (2) Results: we have developed a random forest model that distinguishes human microbiota from domestic pet microbiota (cats and dogs) with 97% accuracy. In order to prevent overfitting, samples from several (at least four) different projects were necessary. Feature importance analysis revealed that the model relied on several taxa known to be key components in domestic cat and dog microbiomes (such as Fusobacteriaceae and Peptostreptococcaceae), as well as on some taxa exclusively found in humans (such as Akkermansiaceae). (3) Conclusion: we have shown that it is possible to make a reliable pet/human gut microbiome classifier on the basis of the data collected from different studies.

