Imputation of PaO2 from SpO2 values from the MIMIC-III Critical Care Database Using Machine-Learning Based Algorithms

Author(s):  
Shuangxia Ren ◽  
Jill Zupetic ◽  
Mehdi Nouraie ◽  
Xinghua Lu ◽  
Richard D. Boyce ◽  
...  

Abstract Background The partial pressure of oxygen (PaO2)/fraction of inspired oxygen (FIO2) ratio is the reference standard for the assessment of hypoxemia in mechanically ventilated patients. Non-invasive monitoring with the peripheral saturation of oxygen (SpO2) is increasingly utilized to estimate PaO2 because it does not require invasive sampling. Several equations have been reported to impute PaO2/FIO2 from SpO2/FIO2. However, machine learning algorithms to impute PaO2 from SpO2 have not been compared to the published equations. Research Question How do machine learning algorithms perform at predicting PaO2 from SpO2 compared to previously published equations? Methods Three machine learning algorithms (neural network, regression, and kernel-based methods) were developed using 7 clinical variable features (n = 9,900 ICU events) and subsequently 3 features (n = 20,198 ICU events) as input into the models, from data available on mechanically ventilated patients in the Medical Information Mart for Intensive Care (MIMIC) III database. As a regression task, the machine learning models were used to impute PaO2 values. As a classification task, the models were used to identify patients with moderate-to-severe hypoxemic respiratory failure based on a clinically relevant cut-off of PaO2/FIO2 ≤ 150. The accuracy of the machine learning models was compared to that of published log-linear and non-linear equations. An online imputation calculator was created. Results Compared to seven features, three features (SpO2, FIO2, and PEEP) were sufficient to impute the PaO2/FIO2 ratio using a large dataset. All of the tested machine learning models imputed PaO2/FIO2 from SpO2/FIO2 with lower error and greater accuracy in predicting PaO2/FIO2 ≤ 150 than the published equations. Using three features, the machine learning models showed superior performance in imputing PaO2 across the entire span of SpO2 values, including those ≥ 97%. Interpretation The improved performance shown for the machine learning algorithms suggests a promising framework for future use in large datasets.
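The non-linear equations referenced above typically derive from the Severinghaus oxygen dissociation curve, which maps PaO2 to SpO2 and can be inverted numerically. As a rough illustration only (not the paper's models, which also use FIO2 and PEEP), a minimal sketch assuming the standard Severinghaus form:

```python
def severinghaus_spo2(pao2_mmhg):
    """Forward Severinghaus model: SpO2 (as a fraction) from PaO2 (mmHg)."""
    return 1.0 / (23400.0 / (pao2_mmhg ** 3 + 150.0 * pao2_mmhg) + 1.0)

def impute_pao2(spo2_fraction, lo=1.0, hi=700.0, tol=1e-6):
    """Invert the monotonically increasing Severinghaus curve by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if severinghaus_spo2(mid) < spo2_fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Dividing the imputed PaO2 by FIO2 then yields the PaO2/FIO2 ratio; the machine learning models described in the abstract replace this fixed physiological curve with functions learned from MIMIC-III data.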

2021 ◽  
Vol 10 (10) ◽  
pp. 2172
Author(s):  
Jong Ho Kim ◽  
Young Suk Kwon ◽  
Moon Seong Baek

Previous scoring models, such as the Acute Physiologic Assessment and Chronic Health Evaluation II (APACHE II) score, do not adequately predict the mortality of patients receiving mechanical ventilation in the intensive care unit. Therefore, this study aimed to apply machine learning algorithms to improve the prediction accuracy for 30-day mortality of mechanically ventilated patients. The data of 16,940 mechanically ventilated patients were divided into training-validation (83%, n = 13,988) and test (17%, n = 2952) sets. Machine learning algorithms including balanced random forest, light gradient boosting machine, extreme gradient boosting, multilayer perceptron, and logistic regression were used. We compared the areas under the receiver operating characteristic curves (AUCs) of the machine learning algorithms with those of the APACHE II and ProVent scores. The extreme gradient boosting model showed the highest AUC (0.79 (0.77–0.80)) for 30-day mortality prediction, followed by the balanced random forest model (0.78 (0.76–0.80)). The AUCs of these machine learning models were higher than those achieved by the APACHE II and ProVent scores, 0.67 (0.65–0.69) and 0.69 (0.67–0.71), respectively. The most important variables in developing each machine learning model were the APACHE II score, Charlson comorbidity index, and norepinephrine. The machine learning models have a higher AUC than conventional scoring systems, and can thus better predict the 30-day mortality of mechanically ventilated patients.
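The AUC used to compare these models is equivalent to the probability that a randomly chosen positive case (death by day 30) receives a higher risk score than a randomly chosen negative case. A minimal, dependency-free sketch of that computation (labels and scores here are illustrative, not the study's data):

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs where the positive
    outscores the negative; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking; the gap between 0.79 (extreme gradient boosting) and 0.67 (APACHE II) reported above is a gap in exactly this pairwise ranking probability.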


Author(s):  
Pratyush Kaware

In this paper, a cost-effective sensor was implemented to read finger-bend signals by attaching the sensor to a finger, so as to classify them based on the degree of bend as well as the joint about which the finger was bent. This was done by testing various machine learning algorithms to find the most accurate and consistent classifier. Finally, we found that the Support Vector Machine was the algorithm best suited to classify our data; using it, we were able to predict the live state of a finger, i.e., the degree of bend and the joints involved. The live voltage values from the sensor were transmitted using a NodeMCU microcontroller, converted to digital form, and uploaded to a database for analysis.
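The paper's classifier is an SVM trained on sensor voltages; as a dependency-free stand-in, the sketch below uses a k-nearest-neighbours rule on hypothetical two-joint voltage readings to illustrate the same classification step (all voltages and labels are invented):

```python
def classify_bend(sample, train, k=3):
    """k-NN over (joint1_voltage, joint2_voltage) readings; majority label wins."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda row: dist(row[0], sample))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

# Hypothetical labelled readings: slight vs full bend, MCP vs PIP joint.
readings = [
    ((1.2, 0.3), "slight-MCP"), ((1.3, 0.2), "slight-MCP"), ((1.1, 0.4), "slight-MCP"),
    ((2.8, 0.3), "full-MCP"),   ((2.9, 0.4), "full-MCP"),
    ((1.2, 2.7), "slight-PIP"), ((1.1, 2.9), "slight-PIP"),
]
```

In the described pipeline, the NodeMCU would stream each new voltage pair to the database, and a classifier of this shape (an SVM in the paper) would label it in real time.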


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Thérence Nibareke ◽  
Jalal Laassiri

Abstract Introduction Nowadays, large data volumes are generated daily at a high rate. Data from health systems, social networks, finance, government, marketing, and bank transactions, as well as from sensors and smart devices, are increasing. The tools and models have to be optimized. In this paper we applied and compared machine learning algorithms (linear regression, Naïve Bayes, decision tree) to predict diabetes. Furthermore, we performed analytics on flight delays. The main contribution of this paper is to give an overview of Big Data tools and machine learning models. We highlight some metrics that allow us to choose a more accurate model. We predict diabetes using three machine learning models and then compare their performance. Furthermore, we analyzed flight delays and produced a dashboard that can help managers of flight companies get a 360° view of their flights and make strategic decisions. Case description We applied three machine learning algorithms to predict diabetes and compared their performance to see which model gives the best results. We performed analytics on flight datasets to help decision making and predict flight delays. Discussion and evaluation The experiment shows that linear regression, Naïve Bayes, and decision tree give the same accuracy (0.766), but the decision tree outperforms the two other models with the greatest score (1) and the smallest error (0). For the flight delay analytics, the model could show, for example, the airport that recorded the most flight delays. Conclusions Several tools and machine learning models for big data analytics have been discussed in this paper. We concluded that, for the same datasets, we have to carefully choose the model to use in prediction. In future work, we will test different models in other fields (climate, banking, insurance).
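The model comparison described above boils down to scoring several candidate classifiers with one accuracy metric on a shared dataset. A minimal sketch of that loop, with invented diabetes-style rows and two toy rules standing in for the paper's linear regression, Naïve Bayes, and decision tree models:

```python
# Each row: (glucose, bmi, age, has_diabetes); all values invented for illustration.
rows = [(148, 33.6, 50, 1), (85, 26.6, 31, 0), (183, 23.3, 32, 1),
        (89, 28.1, 21, 0), (137, 43.1, 33, 1), (116, 25.6, 30, 0),
        (78, 31.0, 26, 0), (197, 30.5, 53, 1)]

def accuracy(predict, data):
    """Fraction of rows whose predicted label matches the last column."""
    return sum(predict(r) == r[-1] for r in data) / len(data)

models = {
    "majority-class": lambda r: 0,                          # naive baseline
    "glucose-threshold": lambda r: 1 if r[0] > 130 else 0,  # stump-like rule
}
scores = {name: accuracy(fn, models_data) if False else accuracy(fn, rows)
          for name, fn in models.items()}
```

Real models would be fitted on a training split and scored on held-out data; the point here is only the shared-metric comparison that lets one model be declared "more accurate" than another.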


2018 ◽  
Vol 10 (8) ◽  
pp. 76 ◽  
Author(s):  
Marcio Teixeira ◽  
Tara Salman ◽  
Maede Zolanvari ◽  
Raj Jain ◽  
Nader Meskin ◽  
...  

This paper presents the development of a Supervisory Control and Data Acquisition (SCADA) system testbed used for cybersecurity research. The testbed consists of a water storage tank's control system, which is one stage in the process of water treatment and distribution. Sophisticated cyber-attacks were conducted against the testbed. During the attacks, the network traffic was captured, and features were extracted from the traffic to build a dataset for training and testing different machine learning algorithms. Five traditional machine learning algorithms were trained to detect the attacks: Random Forest, Decision Tree, Logistic Regression, Naïve Bayes, and KNN. The trained machine learning models were then deployed in the network, where new tests were conducted using online network traffic. The performance obtained during training and testing of the machine learning models was compared to the performance obtained during their online deployment in the network. The results show the efficiency of the machine learning models in detecting the attacks in real time. The testbed provides a good understanding of the effects and consequences of attacks on real SCADA environments.
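The offline-then-online workflow above can be sketched as: fit a detector on features extracted from captured traffic, then apply it to new traffic as it arrives. The sketch below uses a simple nearest-centroid rule on hypothetical (packet rate, payload entropy) features; the paper's five algorithms would slot into the same train/deploy shape:

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def make_detector(normal, attack):
    """Train offline on captured traffic; return a function for online traffic."""
    c_norm, c_att = centroid(normal), centroid(attack)
    def detect(x):
        d = lambda c: sum((a - b) ** 2 for a, b in zip(x, c))
        return "attack" if d(c_att) < d(c_norm) else "normal"
    return detect

# Hypothetical (packet_rate, payload_entropy) features from captured traffic.
normal_train = [(10, 0.2), (12, 0.3), (11, 0.25)]
attack_train = [(90, 0.9), (95, 0.85), (88, 0.95)]
detect = make_detector(normal_train, attack_train)
```

Comparing the detector's accuracy on the captured (offline) set against its accuracy on live (online) traffic is exactly the comparison the paper reports.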


2019 ◽  
Author(s):  
Mohammed Moreb ◽  
Oguz Ata

Abstract Background We propose a novel framework for health informatics: Software Engineering for Machine Learning in Health Informatics (SEMLHI). The framework allows users to study and analyze requirements, determine the function of objects related to the system, and select the machine learning algorithms to be used on the dataset. Methods Based on original data collected from a government hospital in Palestine over the past three years, the data were first validated and all outliers removed, then analyzed using the developed framework in order to compare machine learning models and provide patients with real-time results. Our proposed module was compared with three systems engineering methods: Vee, Agile, and SEMLHI. The results were used to implement a prototype system requiring a machine learning algorithm; after the development phase, a questionnaire was delivered to the developers to assess the results of using the three methodologies. The SEMLHI framework is composed of four components: software, machine learning model, machine learning algorithms, and health informatics data. The machine learning algorithm component used five algorithms to evaluate the accuracy of the machine learning models. Results We compared our approach with previously published systems in terms of performance, evaluating the accuracy of the machine learning models. Applying the different algorithms to 750 cases, the linear SVC achieved an accuracy of about 0.57, compared with the KNeighbors classifier, logistic regression, multinomial NB, and random forest classifier. This research investigates the interaction between SE and ML within the context of health informatics. Our proposed framework defines a methodology for developers to analyze and develop software for health informatics models, and creates a space in which software engineering and ML experts can work on the ML model lifecycle, at the disease level and the subtype level. Conclusions This article is an ongoing effort toward defining and translating an existing research pipeline into four integrated modules, as a framework system using healthcare datasets to reduce cost estimation through a newly suggested methodology. The framework is available as open source software, licensed under the GNU General Public License Version 3, to encourage others to contribute to the future development of the SEMLHI framework.


Machine learning (ML) has become the predominant methodology showing good results in classification and prediction domains. Predictive systems are being employed to predict events and their outcomes in almost every walk of life. The field of prediction in sports is gaining importance, as there is a huge community of bettors and sports fans. Moreover, team owners and club managers are looking for machine learning models that could be used to formulate strategies to win matches. Numerous factors, such as the results of previous matches, indicators of player performance, and opponent information, are required to build these models. This paper provides an analysis of such key models, focusing on the application of machine learning algorithms to sports result prediction. The results obtained helped us to elucidate the best combination of feature selection and classification algorithms that renders maximum accuracy in sports result prediction.
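Finding the best combination of feature selection and classification algorithms can be framed as a search over feature subsets, each scored by a classifier's validation accuracy. A minimal sketch with a 1-nearest-neighbour classifier and invented match features (the third feature is deliberate noise):

```python
from itertools import combinations

def nn1_accuracy(data, feats):
    """Leave-one-out accuracy of a 1-NN classifier restricted to feature indices `feats`."""
    def project(row):
        return tuple(row[0][i] for i in feats)
    correct = 0
    for i, row in enumerate(data):
        rest = data[:i] + data[i + 1:]
        nearest = min(rest, key=lambda r: sum((a - b) ** 2
                                              for a, b in zip(project(r), project(row))))
        correct += nearest[1] == row[1]
    return correct / len(data)

# Hypothetical match rows: (home_form, away_form, noise) -> result.
matches = [((0.9, 0.2, 5.0), "win"), ((0.8, 0.3, 1.0), "win"),
           ((0.2, 0.9, 4.8), "loss"), ((0.3, 0.8, 1.2), "loss"),
           ((0.85, 0.25, 2.0), "win"), ((0.25, 0.85, 2.2), "loss")]

# Exhaustively score every 1- and 2-feature subset; keep the best.
best = max((feats for r in (1, 2) for feats in combinations(range(3), r)),
           key=lambda feats: nn1_accuracy(matches, feats))
```

The surveyed studies do the same search at scale, swapping in stronger feature-selection methods and classifiers; the shape of the search is unchanged.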


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Guoliang Shen ◽  
Mufan Li ◽  
Jiale Lin ◽  
Jie Bao ◽  
Tao He

As industrial control technology continues to develop, modern industrial control is undergoing a transformation from manual control to automatic control. In this paper, we show how to evaluate and build machine learning models to predict the flow rate of a gas pipeline accurately. Compared with traditional practice based on experts or rules, machine learning models rely little on the expertise of specialized fields and extensive physical mechanism analysis. Specifically, we devised a method that can automate the process of choosing suitable machine learning algorithms and their hyperparameters by automatically testing different machine learning algorithms on the given data. Our proposed method is used to choose the appropriate learning algorithm and hyperparameters to build a model of the flow rate of the gas pipeline. Based on this, the model can be further used for control of the gas pipeline system. Experiments conducted on real industrial data show the feasibility of building accurate models with machine learning algorithms. The merits of our approach include (1) little dependence on the expertise of specialized fields and domain knowledge-based analysis; (2) easier implementation than physical models; (3) more robustness to environmental changes; (4) far lower computational requirements compared with physical models that call for complex equation solving. Moreover, our experiments also show that some simple yet powerful learning algorithms may outperform more complex algorithms on industrial control problems.
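The automated selection described above amounts to fitting each candidate model on training data and keeping the one with the lowest validation error. A minimal sketch with two toy regressors and invented (valve opening, flow rate) pairs:

```python
def fit_mean(xs, ys):
    """Trivial baseline: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Ordinary least-squares line through the training points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: slope * x + (my - slope * mx)

def auto_select(candidates, train, valid):
    """Fit every candidate on `train`, return the (name, model) best on `valid`."""
    def mse(model, data):
        return sum((model(x) - y) ** 2 for x, y in data) / len(data)
    fitted = {name: fit([x for x, _ in train], [y for _, y in train])
              for name, fit in candidates.items()}
    return min(fitted.items(), key=lambda kv: mse(kv[1], valid))

# Hypothetical (valve_opening, flow_rate) pairs; flow grows roughly linearly here.
train = [(1, 2.1), (2, 3.9), (3, 6.1), (4, 8.0)]
valid = [(5, 10.1), (6, 11.9)]
name, model = auto_select({"mean": fit_mean, "linear": fit_linear}, train, valid)
```

Hyperparameter choice fits the same loop: each (algorithm, hyperparameter) pair is simply another entry in `candidates`.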


2019 ◽  
Author(s):  
Yunchao xie ◽  
Chen Zhang ◽  
Xiangquan Hu ◽  
Chi Zhang ◽  
Steven P. Kelley ◽  
...  

Herein, we report the successful discovery of a new hierarchical structure of metal-organic nanocapsules (MONCs) by integrating chemical intuition and machine learning algorithms. By training on datasets from a set of both successful and failed experiments, we studied the crystallization propensity of MONCs. Among four machine learning models, the XGB model affords the highest prediction accuracy of 91%. The chemical feature scores and chemical hypotheses derived from the XGB model help identify proper synthesis parameters, showing performance superior to that of a well-trained chemist. This paper will shed light on the discovery of new crystalline inorganic-organic hybrid materials guided by machine learning algorithms.
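The idea behind a chemical feature score can be illustrated by asking, for each synthesis parameter, how well a single threshold on it separates successful from failed experiments. Gradient-boosted models such as XGB derive importances more subtly, but this crude sketch (with invented synthesis logs) conveys the ranking step:

```python
def best_split_accuracy(values, outcomes):
    """Best accuracy achievable with a single threshold on one feature
    (a crude stand-in for a model-derived feature score)."""
    best = 0.0
    for t in values:
        for hi_is_success in (True, False):
            pred = [(v >= t) == hi_is_success for v in values]
            acc = sum(p == o for p, o in zip(pred, outcomes)) / len(outcomes)
            best = max(best, acc)
    return best

# Hypothetical synthesis logs: outcome 1 = crystals formed, 0 = failed.
temperature = [80, 85, 90, 60, 55, 62]
ph          = [6.9, 7.1, 7.0, 7.2, 6.8, 7.05]
outcome     = [1, 1, 1, 0, 0, 0]
scores = {"temperature": best_split_accuracy(temperature, outcome),
          "ph": best_split_accuracy(ph, outcome)}
```

A parameter whose score is high (here, the invented temperature column) is the kind the trained model would flag as decisive for crystallization, guiding the next round of synthesis.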


Water ◽  
2021 ◽  
Vol 13 (24) ◽  
pp. 3520
Author(s):  
Zhufeng Li ◽  
Haixing Liu ◽  
Chunbo Luo ◽  
Guangtao Fu

Urban flooding is a devastating natural hazard for cities around the world. Flood risk mapping is a key tool in flood management. However, it is computationally expensive to produce flood risk maps using hydrodynamic models. To this end, this paper investigates the use of machine learning for the assessment of surface water flood risks in urban areas. The factors that are considered in the machine learning models include coordinates, elevation, slope gradient, imperviousness, land use, land cover, soil type, substrate, distance to river, distance to road, and normalized difference vegetation index. The machine learning models are tested using the case study of Exeter, UK. The performance of machine learning algorithms, including naïve Bayes, perceptron, artificial neural networks (ANNs), and convolutional neural networks (CNNs), is compared based on a spectrum of indicators, e.g., accuracy, F-beta score, and receiver operating characteristic curve. The results obtained from the case study show that flood risk maps can be accurately generated by the machine learning models. The performance of the models on the 30-year flood event is better than on the 100-year and 1000-year flood events. The CNNs and ANNs outperform the other machine learning algorithms tested. This study shows that machine learning can help provide rapid flood mapping, and contribute to urban flood risk assessment and management.
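Among the indicators listed, the F-beta score combines precision and recall with a tunable weight beta, and can be computed directly from confusion counts:

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from confusion counts; beta > 1 weights recall more heavily."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With beta = 1 this reduces to the familiar F1 score; a beta above 1 weights recall more heavily, which may be preferable in flood mapping when missing a flooded cell is costlier than a false alarm.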


Considering the immense cost of air crashes, this study examines the causes of aircraft crashes based on the reported findings for each crash. The dataset used for this study included data for all reported air crashes across the globe for the period from 1981 to 2019. The causes were classified into seven categories. Multiple machine learning algorithms were used to identify the best one for predicting the likely cause of an accident based on the available features. The machine learning models used are Auto Classifier, Tree-AS, and XGBoost. The key predictors are also identified for use by planners.

