Improving Current Glycated Hemoglobin Prediction in Adults: Consistency and Robustness of Machine Learning Algorithms with Electronic Health Records (Preprint)

2020 ◽  
Author(s):  
Zakhriya Alhassan ◽  
Matthew Watson ◽  
David Budgen ◽  
Riyad Alshammari ◽  
Ali Alessan ◽  
...  

BACKGROUND Predicting the risk of glycated hemoglobin (HbA1c) elevation can help identify patients with the potential for developing serious chronic health problems such as diabetes and cardiovascular diseases. Early preventive interventions based upon advanced predictive models using electronic health records (EHR) data for such patients can ultimately help provide better health outcomes. OBJECTIVE Our study investigates the performance of predictive models to forecast HbA1c elevation levels by employing machine learning approaches using data from current and previous visits in the EHR systems for patients who had not been previously diagnosed with any type of diabetes. METHODS This study employed one statistical model and three commonly used conventional machine learning models, as well as a deep learning model, to predict patients’ current levels of HbA1c. For the deep learning model, we also integrated current visit data with historical (longitudinal) data from previous visits. Explainable machine learning methods were used to interrogate the models and to understand the reasons behind the models' decisions. All models were trained and tested using a large and naturally balanced dataset from Saudi Arabia with 18,844 unique patient records. RESULTS The machine learning models achieved the best results for predicting current HbA1c elevation risk. The deep learning model outperformed the statistical and conventional machine learning models with respect to all reported measures when employing time-series data. The best performing model was the multi-layer perceptron (MLP), which achieved an accuracy of 74.52% when used with historical data. CONCLUSIONS This study shows that machine learning models can provide promising results for the task of predicting current HbA1c levels. For deep learning in particular, utilizing the patient's longitudinal time-series data improved the performance and affected the relative importance of the predictors used. The models showed robust results that were consistent with comparable studies.

2019 ◽  
Author(s):  
Mojtaba Haghighatlari ◽  
Gaurav Vishwakarma ◽  
Mohammad Atif Faiz Afzal ◽  
Johannes Hachmann

We present a multitask, physics-infused deep learning model to accurately and efficiently predict refractive indices (RIs) of organic molecules, and we apply it to a library of 1.5 million compounds. We show that it outperforms earlier machine learning models by a significant margin, and that incorporating known physics into data-derived models provides valuable guardrails. Using a transfer learning approach, we augment the model to reproduce results consistent with higher-level computational chemistry training data, but with a considerably reduced number of corresponding calculations. Prediction errors of machine learning models are typically smallest for commonly observed target property values, consistent with the distribution of the training data. However, since our goal is to identify candidates with unusually large RI values, we propose a strategy to boost the performance of our model in the remoter areas of the RI distribution: We bias the model with respect to the under-represented classes of molecules that have values in the high-RI regime. By adopting a metric popular in web search engines, we evaluate our effectiveness in ranking top candidates. We confirm that the models developed in this study can reliably predict the RIs of the top 1,000 compounds, and are thus able to capture their ranking. We believe that this is the first study to develop a data-derived model that ensures the reliability of RI predictions by model augmentation in the extrapolation region on such a large scale. These results underscore the tremendous potential of machine learning in facilitating molecular (hyper)screening approaches on a massive scale and in accelerating the discovery of new compounds and materials, such as organic molecules with high-RI for applications in opto-electronics.
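The abstract does not name the search-engine metric it adopts. As one illustration of how the ranking of top candidates can be scored, the sketch below computes normalized discounted cumulative gain (NDCG), a metric widely used to evaluate search rankings. The choice of NDCG, the function names, the use of the reference RI value as the gain, and the sample numbers are all assumptions, not details taken from the study.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: gains at later ranks count less."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg_at_k(true_values, predicted_values, k):
    """NDCG@k for a ranking induced by predicted values.

    Items are ranked by predicted value (descending); the gain of each
    item is its true value (an assumption made for this sketch).
    """
    order = sorted(range(len(predicted_values)),
                   key=lambda i: predicted_values[i], reverse=True)
    ranked_gains = [true_values[i] for i in order[:k]]
    ideal_gains = sorted(true_values, reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0


# Hypothetical reference RI values and two hypothetical sets of predictions.
true_ri = [1.9, 1.7, 1.6, 1.5]
good_model = [1.85, 1.72, 1.61, 1.49]   # preserves the true ordering
poor_model = [1.50, 1.60, 1.70, 1.90]   # reverses the true ordering

print(ndcg_at_k(true_ri, good_model, k=3))
print(ndcg_at_k(true_ri, poor_model, k=3))
```

A score of 1 means the predicted top-k ordering matches the ideal ordering; lower scores indicate that high-value candidates were ranked too low.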


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Bader Alouffi ◽  
Abdullah Alharbi ◽  
Radhya Sahal ◽  
Hager Saleh

Fake news is challenging to detect because it mixes accurate and inaccurate information from reliable and unreliable sources. Social media is not always a trustworthy data source, especially during the COVID-19 outbreak, when fake news has spread widely. The best way to deal with this is early detection. Accordingly, in this work, we propose a hybrid deep learning model that uses a convolutional neural network (CNN) and long short-term memory (LSTM) to detect COVID-19 fake news. The proposed model consists of the following layers: an embedding layer, a convolutional layer, a pooling layer, an LSTM layer, a flatten layer, a dense layer, and an output layer. In the experiments, three COVID-19 fake news datasets are used to evaluate six machine learning models, two deep learning models, and our proposed model. The machine learning models are DT, KNN, LR, RF, SVM, and NB, while the deep learning models are CNN and LSTM. Four metrics are used to validate the results: accuracy, precision, recall, and F1-measure. The experiments show that the proposed model outperforms the six machine learning models and the two deep learning models. Consequently, the proposed system is capable of effectively detecting COVID-19 fake news.
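The four evaluation measures named above (accuracy, precision, recall, and F1-measure) can be computed directly from the counts of a binary confusion matrix. The following is a minimal sketch, assuming binary labels where 1 marks fake news; the function name and the sample labels are hypothetical and do not come from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0   # share of flagged items that are truly positive
    recall = tp / (tp + fn) if tp + fn else 0.0      # share of positive items that were flagged
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)            # harmonic mean of precision and recall
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = fake news, 0 = real news.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

Reporting all four measures matters when classes are imbalanced, since accuracy alone can stay high while recall on the minority class is poor.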


2020 ◽  
Author(s):  
Hirofumi Obinata ◽  
Peiying Ruan ◽  
Hitoshi Mori ◽  
Wentao Zhu ◽  
Hisashi Sasaki ◽  
...  

This study investigated the utility of artificial intelligence in predicting disease progression. We analysed 194 patients with COVID-19 confirmed by reverse transcription polymerase chain reaction. Among them, 31 patients had oxygen therapy administered after admission. To assess the utility of artificial intelligence in the prediction of disease progression, we used three machine learning models employing clinical features (patient’s background, laboratory data, and symptoms), one deep learning model employing computed tomography (CT) images, and one multimodal deep learning model employing a combination of clinical features and CT images. We also evaluated the predictive values of these models and analysed the important features required to predict worsening in cases of COVID-19. The multimodal deep learning model had the highest accuracy. The CT image was an important feature of the multimodal deep learning model. The area under the curve of all machine learning models employing clinical features and the deep learning model employing CT images exceeded 90%, and the sensitivity of these models exceeded 95%. C-reactive protein and lactate dehydrogenase were important features of the machine learning models. Our machine learning model, while slightly less accurate than the multimodal model, still provides a valuable medical triage tool for patients in the early stages of COVID-19.
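The area under the ROC curve and the sensitivity reported above can be illustrated with a small sketch. It computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, and the sensitivity at a chosen decision threshold. The function names, the threshold, and the sample scores are assumptions for illustration only; they are not the study's data or code.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity(y_true, scores, threshold=0.5):
    """Share of positive cases whose score reaches the threshold."""
    pos_scores = [s for t, s in zip(y_true, scores) if t == 1]
    if not pos_scores:
        raise ValueError("no positive cases")
    return sum(1 for s in pos_scores if s >= threshold) / len(pos_scores)


# Hypothetical data: 1 = patient later needed oxygen therapy.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
print(roc_auc(y_true, scores))           # 5 of 6 pairs ranked correctly
print(sensitivity(y_true, scores, 0.5))  # 1 of 2 positives detected
```

Lowering the threshold raises sensitivity at the cost of more false positives, which is why a triage model can report very high sensitivity alongside a lower overall accuracy.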


2021 ◽  
Author(s):  
Erik Otović ◽  
Marko Njirjak ◽  
Dario Jozinović ◽  
Goran Mauša ◽  
Alberto Michelini ◽  
...  

In this study, we compared the performance of machine learning models trained using transfer learning with that of models trained from scratch on time series data. Four machine learning models were used for the experiment. Two models were taken from the field of seismology, and the other two are general-purpose models for working with time series data. The accuracy of selected models was systematically observed and analyzed when switching within the same domain of application (seismology), as well as between mutually different domains of application (seismology, speech, medicine, finance). In seismology, we used two databases of local earthquakes (one in counts, and the other with the instrument response removed) and a database of global earthquakes for predicting earthquake magnitude; other datasets targeted classifying spoken words (speech), predicting stock prices (finance) and classifying muscle movement from EMG signals (medicine).
In practice, it is very demanding and sometimes impossible to collect datasets of tagged data large enough to successfully train a machine learning model. Therefore, in our experiment, we use reduced data sets of 1,500 and 9,000 data instances to mimic such conditions. Using the same scaled-down datasets, we trained two sets of machine learning models: those that used transfer learning for training and those that were trained from scratch. We compared the performances between pairs of models in order to draw conclusions about the utility of transfer learning. In order to confirm the validity of the obtained results, we repeated the experiments several times and applied statistical tests to confirm the significance of the results. The study shows when, within the set experimental framework, the transfer of knowledge brought improvements in terms of model accuracy and in terms of model convergence rate.
Our results show that it is possible to achieve better performance and faster convergence by transferring knowledge from the domain of global earthquakes to the domain of local earthquakes; sometimes also vice versa. However, improvements in seismology can sometimes also be achieved by transferring knowledge from medical and audio domains. The results show that the transfer of knowledge between other domains brought even more significant improvements, compared to those within the field of seismology. For example, it has been shown that models in the field of sound recognition have achieved much better performance compared to classical models and that the domain of sound recognition is very compatible with knowledge from other domains. We came to similar conclusions for the domains of medicine and finance. Ultimately, the paper offers suggestions on when transfer learning is useful, and the explanations offered can provide a good starting point for knowledge transfer using time series data.
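The abstract does not state which statistical tests were applied to the repeated runs. As one possible illustration of comparing paired models (transfer learning versus training from scratch), the sketch below runs an exact paired sign-flip permutation test on per-run accuracies. The choice of test, the function name, and the accuracy values are all assumptions and do not reproduce the study's procedure or results.

```python
from itertools import product


def paired_sign_flip_p_value(scores_a, scores_b):
    """Two-sided exact permutation test on paired differences.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so every sign pattern is enumerated and the share of
    patterns at least as extreme as the observed one is returned.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total


# Hypothetical accuracies from five repeated runs of each training regime.
transfer = [0.82, 0.85, 0.80, 0.84, 0.83]
scratch = [0.78, 0.80, 0.79, 0.81, 0.80]
print(paired_sign_flip_p_value(transfer, scratch))
```

With only a handful of repetitions the smallest attainable p-value is limited (2 / 2^n for n runs), so the number of repeated experiments bounds how strong a significance claim can be.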


2020 ◽  
Vol 12 (12) ◽  
pp. 5074
Author(s):  
Jiyoung Woo ◽  
Jaeseok Yun

Spam posts in web forum discussions cause user inconvenience and lower the value of the web forum as an open source of user opinion. In this regard, as the importance of a web post is evaluated in terms of the number of involved authors, noise distorts the analysis results by adding unnecessary data to the opinion analysis. Here, in this work, an automatic detection model for spam posts in web forums using both conventional machine learning and deep learning is proposed. To automatically differentiate between normal posts and spam, evaluators were asked to recognize spam posts in advance. To construct the machine learning-based model, text features from posted content using text mining techniques from the perspective of linguistics were extracted, and supervised learning was performed to distinguish content noise from normal posts. For the deep learning model, raw text including and excluding special characters was utilized. A comparison analysis on deep neural networks using the two different recurrent neural network (RNN) models of the simple RNN and long short-term memory (LSTM) network was also performed. Furthermore, the proposed model was applied to two web forums. The experimental results indicate that the deep learning model affords significant improvements over the accuracy of conventional machine learning associated with text features. The accuracy of the proposed model using LSTM reaches 98.56%, and the precision and recall of the noise class reach 99% and 99.53%, respectively.


Author(s):  
S. Sasikala ◽  
S. J. Subhashini ◽  
P. Alli ◽  
J. Jane Rubel Angelina

Machine learning is a technique of parsing data, learning from that data, and then applying what has been learned to make informed decisions. Deep learning is a subset of machine learning: it is technically machine learning and functions in the same way, but it has different capabilities. The main difference between the two is that machine learning models improve progressively but still need some guidance. If a machine learning model returns an inaccurate prediction, the programmer needs to fix the problem explicitly, whereas a deep learning model does so by itself. An automatic car driving system is a good example of deep learning. Artificial intelligence (AI), on the other hand, is distinct from machine learning and deep learning, both of which are subsets of AI.

