Machine Learning Based Algorithms to Impute PaO2 from SpO2 Values and Development of an Online Calculator

Author(s):  
Shuangxia Ren ◽  
Jill A. Zupetic ◽  
Mohammadreza Tabary ◽  
Rebecca DeSensi ◽  
Mehdi Nouraie ◽  
...  

Abstract: We created an online calculator using machine learning algorithms to impute the partial pressure of oxygen (PaO2)/fraction of delivered oxygen (FiO2) ratio from the non-invasive peripheral saturation of oxygen (SpO2), and compared the accuracy of the machine learning models we developed to previously published equations. We generated three machine learning algorithms (neural network, regression, and kernel-based methods) using 7 clinical variable features (N=9,900 ICU events) and subsequently 3 features (N=20,198 ICU events) as input into the models. Data from mechanically ventilated ICU patients were obtained from the publicly available Medical Information Mart for Intensive Care (MIMIC III) database and used for analysis. Compared to seven features, three features (SpO2, FiO2 and PEEP) were sufficient to impute PaO2 from the SpO2. All of the tested machine learning models imputed PaO2 from SpO2 with lower error and showed greater accuracy in predicting PaO2/FiO2 < 150 than the previously published log-linear and non-linear equations. Imputation using data from an independent validation cohort of ICU patients (N = 133) from 2 hospitals within the University of Pittsburgh Medical Center (UPMC) showed greater accuracy with the neural network and kernel-based machine learning models compared to the previously published non-linear equation.
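As a rough illustration of the three-feature regression task described in this abstract, the sketch below trains the three model families (regression, kernel-based, neural network) to impute PaO2 from SpO2, FiO2, and PEEP. It is a minimal sketch only: the CSV file, column names, and scikit-learn estimators are assumptions, not the authors' code or the MIMIC-III extraction pipeline.

```python
# Minimal sketch of the three-feature imputation setup; the file name, column
# names, and estimator choices are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.kernel_ridge import KernelRidge
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("mimic_vent_events.csv")   # hypothetical extract: spo2, fio2, peep, pao2
X, y = df[["spo2", "fio2", "peep"]], df["pao2"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "regression": LinearRegression(),
    "kernel-based": KernelRidge(kernel="rbf", alpha=1.0),
    "neural network": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, "MAE (mmHg):", mean_absolute_error(y_te, model.predict(X_te)))
```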

Author(s):  
Pratyush Kaware

In this paper, a cost-effective sensor was implemented to read finger-bend signals by attaching the sensor to a finger, so as to classify the signals based on the degree of bend as well as the joint about which the finger was bent. This was done by testing various machine learning algorithms to find the most accurate and consistent classifier. We found that the Support Vector Machine was the algorithm best suited to classify our data; using it, we were able to predict the live state of a finger, i.e., the degree of bend and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU microcontroller, converted to digital form, and uploaded to a database for analysis.
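A hedged sketch of the classification step described above follows; the windowed feature arrays, file names, and hyperparameters are assumptions, since the paper's own feature extraction is not reproduced here.

```python
# Illustrative sketch of the SVM classification step; feature extraction,
# file names, and hyperparameters are assumptions, not the authors' code.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Each row: summary statistics of a short window of sensor voltages streamed by
# the NodeMCU; each label: a (joint, bend-degree) class.
X = np.load("flex_sensor_features.npy")   # hypothetical, shape (n_samples, n_features)
y = np.load("flex_sensor_labels.npy")     # hypothetical, one integer class per window

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print("mean CV accuracy:", scores.mean())
```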


Author(s):  
Shuangxia Ren ◽  
Jill Zupetic ◽  
Mehdi Nouraie ◽  
Xinghua Lu ◽  
Richard D. Boyce ◽  
...  

Abstract
Background: The partial pressure of oxygen (PaO2)/fraction of oxygen delivered (FIO2) ratio is the reference standard for assessment of hypoxemia in mechanically ventilated patients. Non-invasive monitoring with the peripheral saturation of oxygen (SpO2) is increasingly utilized to estimate PaO2 because it does not require invasive sampling. Several equations have been reported to impute PaO2/FIO2 from SpO2/FIO2. However, machine learning algorithms to impute PaO2 from SpO2 have not been compared to published equations.
Research Question: How do machine learning algorithms perform at predicting PaO2 from SpO2 compared to previously published equations?
Methods: Three machine learning algorithms (neural network, regression, and kernel-based methods) were developed using 7 clinical variable features (n=9,900 ICU events) and subsequently 3 features (n=20,198 ICU events) as input into the models, from data available for mechanically ventilated patients in the Medical Information Mart for Intensive Care (MIMIC) III database. As a regression task, the machine learning models were used to impute PaO2 values. As a classification task, the models were used to predict patients with moderate-to-severe hypoxemic respiratory failure based on a clinically relevant cut-off of PaO2/FIO2 ≤ 150. The accuracy of the machine learning models was compared to published log-linear and non-linear equations. An online imputation calculator was created.
Results: Compared to seven features, three features (SpO2, FiO2 and PEEP) were sufficient to impute the PaO2/FIO2 ratio using a large dataset. All of the tested machine learning models imputed PaO2/FIO2 from SpO2/FIO2 with lower error and greater accuracy in predicting PaO2/FIO2 ≤ 150 than the published equations. Using three features, the machine learning models showed superior performance in imputing PaO2 across the entire span of SpO2 values, including those ≥ 97%.
Interpretation: The improved performance shown for the machine learning algorithms suggests a promising framework for future use in large datasets.
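The classification framing mentioned in the Methods (predicting PaO2/FIO2 ≤ 150) could look roughly like the sketch below; the column names, fractional FiO2 units, and model choice are assumptions for illustration, not the authors' implementation.

```python
# Sketch of the classification framing (P/F <= 150); column names and the
# neural-network choice are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

df = pd.read_csv("mimic_vent_events.csv")            # hypothetical extract
df["pf_ratio"] = df["pao2"] / df["fio2"]              # assumes fio2 as a fraction (0.21-1.0)
df["mod_severe"] = (df["pf_ratio"] <= 150).astype(int)

X, y = df[["spo2", "fio2", "peep"]], df["mod_severe"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0)
clf.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```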


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Thérence Nibareke ◽  
Jalal Laassiri

Abstract
Introduction: Nowadays, large data volumes are generated daily at a high rate. Data from health systems, social networks, finance, government, marketing, and bank transactions, as well as from sensors and smart devices, are increasing, and the tools and models used to handle them have to be optimized. In this paper we applied and compared machine learning algorithms (Linear Regression, Naïve Bayes, Decision Tree) to predict diabetes. Furthermore, we performed analytics on flight delays. The main contribution of this paper is to give an overview of Big Data tools and machine learning models. We highlight some metrics that allow us to choose a more accurate model. We predicted diabetes using three machine learning models and compared their performance. We also analyzed flight delays and produced a dashboard that can help managers of airlines gain a 360° view of their flights and take strategic decisions.
Case description: We applied three machine learning algorithms for predicting diabetes and compared their performance to see which model gives the best results. We performed analytics on flight datasets to support decision making and predict flight delays.
Discussion and evaluation: The experiments show that Linear Regression, Naïve Bayes, and Decision Tree give the same accuracy (0.766), but Decision Tree outperforms the other two models with the greatest score (1) and the smallest error (0). For the flight-delay analytics, the model could show, for example, the airport that recorded the most flight delays.
Conclusions: Several tools and machine learning models for big data analytics have been discussed in this paper. We conclude that, for the same dataset, the model used for prediction has to be chosen carefully. In future work, we will test different models in other fields (climate, banking, insurance).
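The diabetes comparison described in the Case description can be sketched as follows, assuming a Pima-style CSV and scikit-learn rather than the paper's Big Data stack; logistic regression stands in for the linear model in this classification setting.

```python
# Rough re-creation of the three-model comparison on a diabetes dataset; the
# dataset path and the use of scikit-learn are assumptions.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("diabetes.csv")                      # hypothetical Pima-style file
X, y = df.drop(columns="Outcome"), df["Outcome"]

for name, model in [("linear (logistic)", LogisticRegression(max_iter=1000)),
                    ("naive bayes", GaussianNB()),
                    ("decision tree", DecisionTreeClassifier(random_state=0))]:
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: accuracy = {acc:.3f}")
```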


2018 ◽  
Vol 10 (8) ◽  
pp. 76 ◽  
Author(s):  
Marcio Teixeira ◽  
Tara Salman ◽  
Maede Zolanvari ◽  
Raj Jain ◽  
Nader Meskin ◽  
...  

This paper presents the development of a Supervisory Control and Data Acquisition (SCADA) system testbed used for cybersecurity research. The testbed consists of a water storage tank’s control system, which is a stage in the process of water treatment and distribution. Sophisticated cyber-attacks were conducted against the testbed. During the attacks, the network traffic was captured, and features were extracted from the traffic to build a dataset for training and testing different machine learning algorithms. Five traditional machine learning algorithms were trained to detect the attacks: Random Forest, Decision Tree, Logistic Regression, Naïve Bayes and KNN. Then, the trained machine learning models were built and deployed in the network, where new tests were made using online network traffic. The performance obtained during the training and testing of the machine learning models was compared to the performance obtained during the online deployment of these models in the network. The results show the efficiency of the machine learning models in detecting the attacks in real time. The testbed provides a good understanding of the effects and consequences of attacks on real SCADA environments.
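The offline training and testing stage described above might be set up along these lines; the labeled traffic CSV, feature columns, and hyperparameters are placeholders, not the testbed's actual dataset.

```python
# Sketch of the offline training/testing of the five classifiers; the CSV of
# labeled per-flow network-traffic features is an illustrative assumption.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

df = pd.read_csv("scada_traffic_features.csv")   # hypothetical: flow features + attack label
X, y = df.drop(columns="label"), df["label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

classifiers = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "knn": KNeighborsClassifier(n_neighbors=5),
}
for name, clf in classifiers.items():
    clf.fit(X_tr, y_tr)
    print(name)
    print(classification_report(y_te, clf.predict(X_te), zero_division=0))
```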


2019 ◽  
Author(s):  
Mohammed Moreb ◽  
Oguz Ata

Abstract
Background: We propose a novel framework for health informatics: Software Engineering for Machine Learning in Health Informatics (SEMLHI). The framework's features allow users to study and analyze requirements, determine the function of the objects related to the system, and determine the machine learning algorithms to be used on the dataset.
Methods: Based on original data collected from government hospitals in Palestine over the past three years, the data were first validated and outliers removed, then analyzed using the developed framework in order to compare machine learning models that provide patients with real-time results. Our proposed module was compared with three systems engineering methods: Vee, Agile, and SEMLHI. The results were used to implement a prototype system requiring a machine learning algorithm; after the development phase, a questionnaire was delivered to developers to assess the results of using the three methodologies. The SEMLHI framework is composed of four components: software, machine learning model, machine learning algorithms, and health informatics data. The machine learning algorithm component uses five algorithms to evaluate the accuracy of the machine learning models.
Results: We compared our approach with previously published systems in terms of performance to evaluate the accuracy of the machine learning models. With the different algorithms applied to 750 cases, linear SVC achieved an accuracy of about 0.57, compared with the KNeighbors classifier, logistic regression, multinomial NB, and random forest classifier. This research investigates the interaction between software engineering and machine learning within the context of health informatics. Our proposed framework defines a methodology for developers to analyze and develop software for health informatics models, and creates a space in which software engineering and ML experts can work on the ML model lifecycle, at the disease level and the subtype level.
Conclusions: This article is an ongoing effort towards defining and translating an existing research pipeline into four integrated modules, as a framework system that uses healthcare datasets to reduce cost estimation through a newly suggested methodology. The framework is available as open-source software, licensed under the GNU General Public License Version 3, to encourage others to contribute to the future development of the SEMLHI framework.
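The five-algorithm accuracy comparison mentioned in the Results could be sketched as below; the hospital dataset is not public, so the CSV, feature and label columns, and preprocessing are assumptions.

```python
# Hedged sketch of the five-algorithm accuracy comparison; the placeholder CSV
# and its columns are assumptions, not the hospital dataset.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("semlhi_cases.csv")        # hypothetical: 750 anonymized cases
X, y = df.drop(columns="diagnosis"), df["diagnosis"]

models = {
    "linear_svc": LinearSVC(max_iter=5000),
    "kneighbors": KNeighborsClassifier(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "multinomial_nb": MultinomialNB(),        # assumes non-negative, count-like features
    "random_forest": RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```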


Author(s):  
Michael Fortunato ◽  
Connor W. Coley ◽  
Brian Barnes ◽  
Klavs F. Jensen

This work presents efforts to augment the performance of data-driven machine learning algorithms for reaction template recommendation used in computer-aided synthesis planning software. Often, machine learning models designed to perform the task of prioritizing reaction templates or molecular transformations are focused on reporting high accuracy metrics for the one-to-one mapping of product molecules in reaction databases to the template extracted from the recorded reaction. The available templates that get selected for inclusion in these machine learning models have been previously limited to those that appear frequently in the reaction databases and exclude potentially useful transformations. By augmenting open-access datasets of organic reactions with artificially calculated template applicability and pretraining a template relevance neural network on this augmented applicability dataset, we report an increase in the template applicability recall and an increase in the diversity of predicted precursors. The augmentation and pretraining effectively teaches the neural network an increased set of templates that could theoretically lead to successful reactions for a given target. Even on a small dataset of well curated reactions, the data augmentation and pretraining methods resulted in an increase in top-1 accuracy, especially for rare templates, indicating these strategies can be very useful for small datasets.
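A conceptual PyTorch sketch of the pretrain-then-fine-tune idea follows; the fingerprint dimensionality, template count, and random placeholder tensors are assumptions, not the authors' released code.

```python
# Conceptual sketch: pretrain a template-scoring network on a multi-label
# applicability matrix, then fine-tune it on single recorded templates.
# Shapes and placeholder tensors are illustrative assumptions.
import torch
import torch.nn as nn

n_templates, fp_dim = 10000, 2048

model = nn.Sequential(
    nn.Linear(fp_dim, 1024), nn.ReLU(),
    nn.Linear(1024, n_templates),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stage 1: pretrain on applicability, i.e. which templates *could* apply to
# each product (computed offline by brute-force template application).
fps = torch.rand(512, fp_dim)                            # placeholder product fingerprints
applicability = (torch.rand(512, n_templates) < 0.01).float()
loss = nn.BCEWithLogitsLoss()(model(fps), applicability)
loss.backward(); opt.step(); opt.zero_grad()

# Stage 2: fine-tune on template relevance, i.e. the single recorded template
# per reaction, with a standard cross-entropy objective.
recorded_template = torch.randint(0, n_templates, (512,))
loss = nn.CrossEntropyLoss()(model(fps), recorded_template)
loss.backward(); opt.step()
```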


Machine learning (ML) has become the predominant methodology showing good results in classification and prediction domains. Predictive systems are being employed to predict events and their outcomes in almost every walk of life. The field of prediction in sports is gaining importance as there is a huge community of bettors and sports fans. Moreover, team owners and club managers are looking for machine learning models that could be used to formulate strategies to win matches. Numerous factors, such as the results of previous matches, indicators of player performance, and opponent information, are required to build these models. This paper provides an analysis of such key models, focusing on the application of machine learning algorithms to sport result prediction. The results obtained helped us to elucidate the best combination of feature selection and classification algorithms that renders maximum accuracy in sport result prediction.
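A minimal illustration of pairing feature selection with a classifier, in the spirit of the combinations surveyed above; the match-feature CSV, column names, and candidate grids are hypothetical.

```python
# Illustrative feature-selection + classification pairing; data and grids are
# placeholders, not any surveyed study's setup.
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

df = pd.read_csv("match_features.csv")       # hypothetical: form, player stats, opponent info
X, y = df.drop(columns="result"), df["result"]

pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", RandomForestClassifier(random_state=0)),
])
grid = GridSearchCV(pipe, {"select__k": [5, 10, 20], "clf__n_estimators": [100, 300]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```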


2021 ◽  
Vol 10 (10) ◽  
pp. 2172
Author(s):  
Jong Ho Kim ◽  
Young Suk Kwon ◽  
Moon Seong Baek

Previous scoring models, such as the Acute Physiologic Assessment and Chronic Health Evaluation II (APACHE II) score, do not adequately predict the mortality of patients receiving mechanical ventilation in the intensive care unit. Therefore, this study aimed to apply machine learning algorithms to improve the prediction accuracy for 30-day mortality of mechanically ventilated patients. The data of 16,940 mechanically ventilated patients were divided into the training-validation (83%, n = 13,988) and test (17%, n = 2952) sets. Machine learning algorithms including balanced random forest, light gradient boosting machine, extreme gradient boost, multilayer perceptron, and logistic regression were used. We compared the areas under the receiver operating characteristic curves (AUCs) of the machine learning algorithms with those of the APACHE II and ProVent scores. The extreme gradient boost model showed the highest AUC (0.79 (0.77–0.80)) for the 30-day mortality prediction, followed by the balanced random forest model (0.78 (0.76–0.80)). The AUCs of these machine learning models were higher than those achieved by the APACHE II and ProVent scores, which were 0.67 (0.65–0.69) and 0.69 (0.67–0.71), respectively. The most important variables in developing each machine learning model were the APACHE II score, Charlson comorbidity index, and norepinephrine. The machine learning models have a higher AUC than conventional scoring systems, and can thus better predict the 30-day mortality of mechanically ventilated patients.
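The AUC comparison workflow could be sketched as follows; the cohort CSV is a placeholder, and the xgboost and imbalanced-learn packages stand in for the "extreme gradient boost" and "balanced random forest" models named above.

```python
# Sketch of the AUC-comparison workflow; the cohort file, columns, and package
# choices are assumptions, not the study's dataset or code.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier
from imblearn.ensemble import BalancedRandomForestClassifier

df = pd.read_csv("vent_cohort.csv")              # hypothetical: APACHE II, Charlson index, etc.
X, y = df.drop(columns="death_30d"), df["death_30d"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.17, stratify=y, random_state=0)

for name, model in [("xgboost", XGBClassifier(eval_metric="logloss")),
                    ("balanced_rf", BalancedRandomForestClassifier(random_state=0)),
                    ("logistic", LogisticRegression(max_iter=1000))]:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.2f}")
```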


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-15
Author(s):  
Zeynep Hilal Kilimci ◽  
Aykut Güven ◽  
Mitat Uysal ◽  
Selim Akyokus

Nowadays, smart devices, as a part of daily life, collect data about their users with the help of sensors placed on them. Sensor data are usually physical data, but mobile applications collect more than physical data, such as device usage habits and personal interests. Collected data are usually classified as personal, but they contain valuable information about their users when analyzed and interpreted. One of the main purposes of personal data analysis is to make predictions about users. Collected data can be divided into two major categories: physical and behavioral data. Behavioral data are also referred to as neurophysical data. Physical and neurophysical parameters are collected as part of this study. Physical data contain measurements of the users such as heartbeat, sleep quality, energy, and movement/mobility parameters. Neurophysical data contain keystroke patterns such as typing speed and typing errors. Users' emotional/mood statuses are also investigated through daily questions: six emotion-attached questions are asked to the users each day, and depending on the answers, the users' emotional states are graded. Our aim is to show that there is a connection between users' physical/neurophysical parameters and their mood/emotional conditions. To prove our hypothesis, we collected and measured the physical and neurophysical parameters of 15 users for 1 year. The novelty of this work is the use of a combination of physical and neurophysical parameters. Another novelty is that the emotion classification task is performed by both conventional machine learning algorithms and deep learning models. For this purpose, Feedforward Neural Network (FFNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) neural networks are employed as deep learning methodologies. Multinomial Naïve Bayes (MNB), Support Vector Regression (SVR), Decision Tree (DT), Random Forest (RF), and Decision Integration Strategy (DIS) are evaluated as conventional machine learning algorithms. To the best of our knowledge, this is the first attempt to analyze the neurophysical conditions of users by evaluating deep learning models for mood analysis and enriching physical characteristics with neurophysical parameters. Experiment results demonstrate that the utilization of deep learning methodologies and the combination of both physical and neurophysical parameters enhance the classification success of the system in interpreting the mood of the users. A wide range of comparative and extensive experiments shows that the proposed model exhibits noteworthy results compared to state-of-the-art studies.
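A minimal PyTorch sketch of the deep-learning side of this design: an LSTM over a daily sequence of combined physical and neurophysical features predicting a graded mood class. The window length, feature count, and number of mood classes are assumptions.

```python
# Minimal LSTM sketch for mood classification from combined physical and
# neurophysical daily features; all dimensions are illustrative assumptions.
import torch
import torch.nn as nn

n_days, n_features, n_moods = 7, 12, 5    # e.g. a one-week window of daily measurements

class MoodLSTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(n_features, 64, batch_first=True)
        self.head = nn.Linear(64, n_moods)

    def forward(self, x):                  # x: (batch, n_days, n_features)
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])

model = MoodLSTM()
x = torch.rand(32, n_days, n_features)     # placeholder: heart rate, sleep, typing speed, ...
y = torch.randint(0, n_moods, (32,))       # placeholder graded mood labels
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
```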


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Guoliang Shen ◽  
Mufan Li ◽  
Jiale Lin ◽  
Jie Bao ◽  
Tao He

As industrial control technology continues to develop, modern industrial control is undergoing a transformation from manual control to automatic control. In this paper, we show how to evaluate and build machine learning models to accurately predict the flow rate of a gas pipeline. Compared with traditional practice based on experts or rules, machine learning models rely little on the expertise of specialized fields and on extensive physical-mechanism analysis. Specifically, we devised a method that automates the process of choosing suitable machine learning algorithms and their hyperparameters by automatically testing different machine learning algorithms on the given data. Our proposed method is used to choose the appropriate learning algorithm and hyperparameters to build the model of the flow rate of the gas pipeline. Based on this, the model can be further used for control of the gas pipeline system. The experiments conducted on real industrial data show the feasibility of building accurate models with machine learning algorithms. The merits of our approach include (1) little dependence on the expertise of specialized fields and on domain-knowledge-based analysis; (2) easier implementation than physical models; (3) greater robustness to environment changes; (4) far fewer computational resources required compared with physical models that call for complex equation solving. Moreover, our experiments also show that some simple yet powerful learning algorithms may outperform more complex algorithms on industrial control problems.
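The automated model-and-hyperparameter selection idea could be sketched as below: try several regressors with small grids on the pipeline data and keep the best by cross-validated error. The sensor CSV, columns, and candidate grids are illustrative assumptions.

```python
# Sketch of automated algorithm + hyperparameter selection for flow-rate
# regression; data columns and candidate grids are placeholders.
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

df = pd.read_csv("pipeline_sensors.csv")          # hypothetical: pressures, valve openings, flow
X, y = df.drop(columns="flow_rate"), df["flow_rate"]

candidates = [
    (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
    (RandomForestRegressor(random_state=0), {"n_estimators": [100, 300]}),
    (GradientBoostingRegressor(random_state=0), {"learning_rate": [0.05, 0.1]}),
]
best = max(
    (GridSearchCV(est, grid, cv=5, scoring="neg_mean_absolute_error").fit(X, y)
     for est, grid in candidates),
    key=lambda search: search.best_score_,
)
print(best.best_estimator_, "MAE:", -best.best_score_)
```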

