Deterministic dropout for deep neural networks using composite random forest

2020 ◽  
Vol 131 ◽  
pp. 205-212
Author(s):  
Bikash Santra ◽  
Angshuman Paul ◽  
Dipti Prasad Mukherjee
PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0239007
Author(s):  
Aixia Guo ◽  
Sakima Smith ◽  
Yosef M. Khan ◽  
James R. Langabeer II ◽  
Randi E. Foraker

Background: Cardiac dysrhythmias (CD) affect millions of people in the United States (US) and are associated with considerable morbidity and mortality. New strategies to combat this growing problem are urgently needed. Objectives: Predicting CD using electronic health record (EHR) data would allow for earlier diagnosis and treatment of the condition, thus improving overall cardiovascular outcomes. The Guideline Advantage (TGA) is an American Heart Association ambulatory quality clinical data registry of EHR data representing 70 clinics distributed throughout the US. It has been used to monitor outpatient prevention and disease management outcome measures across populations and for longitudinal research on the impact of preventative care. Methods: For this study, we represented all time-series cardiovascular health (CVH) measures and the corresponding data collection time points for each patient by numerical embedding vectors. We then employed a deep learning technique, the long short-term memory (LSTM) model, to predict CD from the vector of time-series CVH measures using 5-fold cross-validation, and compared the performance of this model with that of deep neural networks, logistic regression, random forest, and Naïve Bayes models. Results: The LSTM model outperformed the other machine learning models and achieved the best prediction performance as measured by the average area under the receiver operating characteristic curve (AUROC): 0.76 for LSTM, 0.71 for deep neural networks, 0.66 for logistic regression, 0.67 for random forest, and 0.59 for Naïve Bayes. The most influential feature in the LSTM model was blood pressure. Conclusions: These findings may be used to prevent CD in the outpatient setting by encouraging appropriate surveillance and management of CVH.
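The evaluation protocol named in the abstract above (AUROC averaged over 5 cross-validation folds) can be illustrated with a minimal sketch. This is not the authors' code: the stratified fold assignment, the helper names, and the one-dimensional toy scorer are all assumptions made for illustration, and the toy data are synthetic.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """AUROC as the chance that a random positive outscores a random negative
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Assumption: folds are stratified so each one contains both classes."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def cross_validated_auroc(features, labels, fit, k=5, seed=0):
    """Train on k-1 folds, score the held-out fold, and average the fold AUROCs."""
    fold_aurocs = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        predict = fit([features[i] for i in train], [labels[i] for i in train])
        scores = [predict(features[i]) for i in held_out]
        fold_aurocs.append(auroc([labels[i] for i in held_out], scores))
    return mean(fold_aurocs)


def fit_midpoint_scorer(train_x, train_y):
    """Hypothetical stand-in model: score = distance above the class midpoint."""
    pos_mean = mean(x for x, y in zip(train_x, train_y) if y == 1)
    neg_mean = mean(x for x, y in zip(train_x, train_y) if y == 0)
    midpoint = (pos_mean + neg_mean) / 2
    return lambda x: x - midpoint


# Synthetic, perfectly separable toy data (not from the TGA registry).
toy_x = list(range(20))
toy_y = [0] * 10 + [1] * 10
print(cross_validated_auroc(toy_x, toy_y, fit_midpoint_scorer))
```

Any model could be passed as `fit`, provided it returns a function that maps one feature value to a score; comparing several models then amounts to calling `cross_validated_auroc` once per model with the same seed, so that every model sees the same folds.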


Author(s):  
Alfonso T. García-Sosa

Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems, leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data on the effects of chemicals in humans, chimps, and rats have been used to build machine learning classifiers and regressors and to evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks on user-defined, physicochemically relevant features developed for this work outperforming graph convolutional networks, random forest, and large featurizations. The results can help provide clues on the risk posed by substances and support better experimental design for toxicity assays. Source code and data are available at https://github.com/AlfonsoTGarcia-Sosa/ML


2020 ◽  
Vol 499 (3) ◽  
pp. 3130-3138
Author(s):  
Catalina Gómez ◽  
Mauricio Neira ◽  
Marcela Hernández Hoyos ◽  
Pablo Arbeláez ◽  
Jaime E Forero-Romero

ABSTRACT Supervised classification of temporal sequences of astronomical images into meaningful transient astrophysical phenomena has been considered a hard problem because it requires the intervention of human experts. The classifier uses the expert’s knowledge to find heuristic features to process the images, for instance, by performing image subtraction or by extracting sparse information such as flux time-series, also known as light curves. We present a successful deep learning approach that learns directly from imaging data. Our method models explicitly the spatiotemporal patterns with deep convolutional neural networks and gated recurrent units. We train these deep neural networks using 1.3 million real astronomical images from the Catalina Real-Time Transient Survey to classify the sequences into five different types of astronomical transient classes. The TAO-Net (for Transient Astronomical Objects Network) architecture outperforms the results from random forest classification on light curves by 10 percentage points as measured by the F1 score for each class; the average F1 over classes goes from 45 percent with random forest classification to 55 percent with TAO-Net. This achievement with TAO-Net opens the possibility to develop new deep learning architectures for early transient detection. We make available the training data set and trained models of TAO-Net to allow for future extensions of this work.
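The "average F1 over classes" quoted above is presumably an unweighted (macro) average of per-class F1 scores; that reading is an assumption, as the abstract does not state the averaging scheme. A minimal sketch of that computation, with hypothetical function names and made-up class labels, might look like this:

```python
def f1_per_class(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each class."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (every class counts equally)."""
    per_class = f1_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)


# Made-up labels for illustration only; these are not the survey's classes.
truth = ["class_a", "class_a", "class_b", "class_b", "class_c"]
guess = ["class_a", "class_b", "class_b", "class_b", "class_c"]
print(f1_per_class(truth, guess))
print(macro_f1(truth, guess))
```

Because every class contributes equally to the macro average, a gain on a rare transient class moves the headline number as much as a gain on a common one.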


2021 ◽  
Author(s):  
Fahad Shabbir Ahmed ◽  
Furqan Bin Irfan

The aim of this study is to use machine learning to predict tumor staging and metastasis in melanoma from differentially expressed genes. Machine learning has been used in different clinical settings to predict different outcomes. However, it has not been used to predict the diagnostic aspect of tumor staging. We used the TCGA RNA-Sequencing data on melanomas to predict tumor staging and nodal and/or distant metastasis using deep neural networks (DNN) and a random forest classifier (RF). Results: We were able to predict tumor staging (lower vs higher stage, i.e. Tis / T1 / T2 vs T3 and higher), nodal metastasis, and combined nodal or distant metastasis in patients with melanomas with high accuracy. However, these results need further validation.


2019 ◽  
Vol 11 (3) ◽  
pp. 699 ◽  
Author(s):  
Lkhagvadorj Munkhdalai ◽  
Tsendsuren Munkhdalai ◽  
Oyun-Erdene Namsrai ◽  
Jong Lee ◽  
Keun Ryu

Machine learning and artificial intelligence have achieved human-level performance in many application domains, including image classification, speech recognition and machine translation. However, in the financial domain, expert-based credit risk models still dominate. Establishing meaningful benchmarks and comparisons between machine-learning approaches and human expert-based models is a prerequisite for introducing novel methods. Therefore, our main goal in this study is to establish a new benchmark using real consumer data and to provide machine-learning approaches that can serve as a baseline on this benchmark. We performed an extensive comparison between the machine-learning approaches and a human expert-based model, the FICO credit scoring system, using Survey of Consumer Finances (SCF) data. As the SCF data is non-synthetic and consists of a large number of real variables, we applied two variable-selection methods to select the most representative features for effective modelling and to compare them: the first used hypothesis tests, correlation and random forest-based feature importance measures, and the second was a new approach (NAP) based only on random forest. We then built regression models based on various machine-learning algorithms ranging from logistic regression and support vector machines to an ensemble of gradient boosted trees and deep neural networks. Our results demonstrated that if lending institutions in the 2001s had used their own credit scoring model constructed by the machine-learning methods explored in this study, their expected credit losses would have been lower, and they would have been more sustainable. In addition, the deep neural networks and XGBoost algorithms trained on the subset selected by NAP achieve the highest area under the curve (AUC) and accuracy, respectively.


Author(s):  
Alex Hernández-García ◽  
Johannes Mehrer ◽  
Nikolaus Kriegeskorte ◽  
Peter König ◽  
Tim C. Kietzmann

2018 ◽  
Author(s):  
Chi Zhang ◽  
Xiaohan Duan ◽  
Ruyuan Zhang ◽  
Li Tong
