Derivation of Respiratory Metrics in Health and Asthma; Machine Learning Methodology (Preprint)

2020 ◽  
Author(s):  
Joseph Prinable ◽  
Peter Jones ◽  
David Boland ◽  
Alistair McEwan ◽  
Cindy Thamrin

BACKGROUND The ability to continuously monitor breathing metrics may have indications for general health as well as respiratory conditions such as asthma. However, few studies have focused on breathing due to a lack of available wearable technologies. OBJECTIVE Examine the performance of two machine learning algorithms in extracting breathing metrics from a finger-based pulse oximeter, which is amenable to long-term monitoring. METHODS Pulse oximetry data was collected from 11 healthy and 11 asthma subjects who breathed at a range of controlled respiratory rates. UNET and Long Short-Term memory (LSTM) algorithms were applied to the data, and results compared against breathing metrics derived from respiratory inductance plethysmography measured simultaneously as a reference. RESULTS The UNET vs LSTM model provided breathing metrics which were strongly correlated with those from the reference signal (all p<0.001, except for inspiratory:expiratory ratio). The following relative mean bias(95% confidence interval) were observed: inspiration time 1.89(-52.95, 56.74)% vs 1.30(-52.15, 54.74)%, expiration time -3.70(-55.21, 47.80)% vs -4.97(-56.84, 46.89)%, inspiratory:expiratory ratio -4.65(-87.18, 77.88)% vs -5.30(-87.07, 76.47)%, inter-breath intervals -2.39(-32.76, 27.97)% vs -3.16(-33.69, 27.36)%, and respiratory rate 2.99(-27.04 to 33.02)% vs 3.69(-27.17 to 34.56)%. CONCLUSIONS Both machine learning models show strongly correlation and good comparability with reference, with low bias though wide variability for deriving breathing metrics in asthma and health cohorts. Future efforts should focus on improvement of performance of these models, e.g. by increasing the size of the training dataset at the lower breathing rates. CLINICALTRIAL Sydney Local Health District Human Research Ethics Committee (#LNR\16\HAWKE99 ethics approval).

Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7134
Author(s):  
Joseph Prinable ◽  
Peter Jones ◽  
David Boland ◽  
Alistair McEwan ◽  
Cindy Thamrin

The ability to continuously monitor breathing metrics may have indications for general health as well as respiratory conditions such as asthma. However, few studies have focused on breathing due to a lack of available wearable technologies. To examine the performance of two machine learning algorithms in extracting breathing metrics from a finger-based pulse oximeter, which is amenable to long-term monitoring. Methods: Pulse oximetry data were collected from 11 healthy and 11 with asthma subjects who breathed at a range of controlled respiratory rates. U-shaped network (U-Net) and Long Short-Term Memory (LSTM) algorithms were applied to the data, and results compared against breathing metrics derived from respiratory inductance plethysmography measured simultaneously as a reference. Results: The LSTM vs. U-Net model provided breathing metrics which were strongly correlated with those from the reference signal (all p < 0.001, except for inspiratory: expiratory ratio). The following absolute mean bias (95% confidence interval) values were observed (in seconds): inspiration time 0.01(−2.31, 2.34) vs. −0.02(−2.19, 2.16), expiration time −0.19(−2.35, 1.98) vs. −0.24(−2.36, 1.89), and inter-breath intervals −0.19(−2.73, 2.35) vs. −0.25(2.76, 2.26). The inspiratory:expiratory ratios were −0.14(−1.43, 1.16) vs. −0.14(−1.42, 1.13). Respiratory rate (breaths per minute) values were 0.22(−2.51, 2.96) vs. 0.29(−2.54, 3.11). While percentage bias was low, the 95% limits of agreement was high (~35% for respiratory rate). Conclusion: Both machine learning models show strong correlation and good comparability with reference, with low bias though wide variability for deriving breathing metrics in asthma and health cohorts. Future efforts should focus on improvement of performance of these models, e.g., by increasing the size of the training dataset at the lower breathing rates.


2019 ◽  
Author(s):  
Joseph Prinable ◽  
Peter Jones ◽  
David Boland ◽  
Cindy Thamrin ◽  
Alistair McEwan

BACKGROUND There has been a recent increased interest in monitoring health using wearable sensor technologies; however, few have focused on breathing. The ability to monitor breathing metrics may have indications both for general health as well as respiratory conditions such as asthma, where long-term monitoring of lung function has shown promising utility. OBJECTIVE In this paper, we explore a long short-term memory (LSTM) architecture and predict measures of interbreath intervals, respiratory rate, and the inspiration-expiration ratio from a photoplethysmogram signal. This serves as a proof-of-concept study of the applicability of a machine learning architecture to the derivation of respiratory metrics. METHODS A pulse oximeter was mounted to the left index finger of 9 healthy subjects who breathed at controlled respiratory rates. A respiratory band was used to collect a reference signal as a comparison. RESULTS Over a 40-second window, the LSTM model predicted a respiratory waveform through which breathing metrics could be derived with a bias value and 95% CI. Metrics included inspiration time (–0.16 seconds, –1.64 to 1.31 seconds), expiration time (0.09 seconds, –1.35 to 1.53 seconds), respiratory rate (0.12 breaths per minute, –2.13 to 2.37 breaths per minute), interbreath intervals (–0.07 seconds, –1.75 to 1.61 seconds), and the inspiration-expiration ratio (0.09, –0.66 to 0.84). CONCLUSIONS A trained LSTM model shows acceptable accuracy for deriving breathing metrics and could be useful for long-term breathing monitoring in health. Its utility in respiratory disease (eg, asthma) warrants further investigation.


10.2196/13737 ◽  
2020 ◽  
Vol 8 (7) ◽  
pp. e13737 ◽  
Author(s):  
Joseph Prinable ◽  
Peter Jones ◽  
David Boland ◽  
Cindy Thamrin ◽  
Alistair McEwan

Background There has been a recent increased interest in monitoring health using wearable sensor technologies; however, few have focused on breathing. The ability to monitor breathing metrics may have indications both for general health as well as respiratory conditions such as asthma, where long-term monitoring of lung function has shown promising utility. Objective In this paper, we explore a long short-term memory (LSTM) architecture and predict measures of interbreath intervals, respiratory rate, and the inspiration-expiration ratio from a photoplethysmogram signal. This serves as a proof-of-concept study of the applicability of a machine learning architecture to the derivation of respiratory metrics. Methods A pulse oximeter was mounted to the left index finger of 9 healthy subjects who breathed at controlled respiratory rates. A respiratory band was used to collect a reference signal as a comparison. Results Over a 40-second window, the LSTM model predicted a respiratory waveform through which breathing metrics could be derived with a bias value and 95% CI. Metrics included inspiration time (–0.16 seconds, –1.64 to 1.31 seconds), expiration time (0.09 seconds, –1.35 to 1.53 seconds), respiratory rate (0.12 breaths per minute, –2.13 to 2.37 breaths per minute), interbreath intervals (–0.07 seconds, –1.75 to 1.61 seconds), and the inspiration-expiration ratio (0.09, –0.66 to 0.84). Conclusions A trained LSTM model shows acceptable accuracy for deriving breathing metrics and could be useful for long-term breathing monitoring in health. Its utility in respiratory disease (eg, asthma) warrants further investigation.


2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate analysis and prediction of the time-dependent data. We focus our attention on four different stocks are selected from Yahoo Finance historical database. To build up models and predict the future stock price, we consider three different machine learning techniques including Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in machine learning methods, it can be shown that the prediction accuracy is improved.


2021 ◽  
Vol 186 (Supplement_1) ◽  
pp. 445-451
Author(s):  
Yifei Sun ◽  
Navid Rashedi ◽  
Vikrant Vaze ◽  
Parikshit Shah ◽  
Ryan Halter ◽  
...  

ABSTRACT Introduction Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods Five classification methods including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg 60 minutes in the future. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The prediction of regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.


2020 ◽  
Vol 13 (1) ◽  
pp. 10
Author(s):  
Andrea Sulova ◽  
Jamal Jokar Arsanjani

Recent studies have suggested that due to climate change, the number of wildfires across the globe have been increasing and continue to grow even more. The recent massive wildfires, which hit Australia during the 2019–2020 summer season, raised questions to what extent the risk of wildfires can be linked to various climate, environmental, topographical, and social factors and how to predict fire occurrences to take preventive measures. Hence, the main objective of this study was to develop an automatized and cloud-based workflow for generating a training dataset of fire events at a continental level using freely available remote sensing data with a reasonable computational expense for injecting into machine learning models. As a result, a data-driven model was set up in Google Earth Engine platform, which is publicly accessible and open for further adjustments. The training dataset was applied to different machine learning algorithms, i.e., Random Forest, Naïve Bayes, and Classification and Regression Tree. The findings show that Random Forest outperformed other algorithms and hence it was used further to explore the driving factors using variable importance analysis. The study indicates the probability of fire occurrences across Australia as well as identifies the potential driving factors of Australian wildfires for the 2019–2020 summer season. The methodical approach and achieved results and drawn conclusions can be of great importance to policymakers, environmentalists, and climate change researchers, among others.


2021 ◽  
Vol 11 (10) ◽  
pp. 4443
Author(s):  
Rokas Štrimaitis ◽  
Pavel Stefanovič ◽  
Simona Ramanauskaitė ◽  
Asta Slotkienė

Financial area analysis is not limited to enterprise performance analysis. It is worth analyzing as wide an area as possible to obtain the full impression of a specific enterprise. News website content is a datum source that expresses the public’s opinion on enterprise operations, status, etc. Therefore, it is worth analyzing the news portal article text. Sentiment analysis in English texts and financial area texts exist, and are accurate, the complexity of Lithuanian language is mostly concentrated on sentiment analysis of comment texts, and does not provide high accuracy. Therefore in this paper, the supervised machine learning model was implemented to assign sentiment analysis on financial context news, gathered from Lithuanian language websites. The analysis was made using three commonly used classification algorithms in the field of sentiment analysis. The hyperparameters optimization using the grid search was performed to discover the best parameters of each classifier. All experimental investigations were made using the newly collected datasets from four Lithuanian news websites. The results of the applied machine learning algorithms show that the highest accuracy is obtained using a non-balanced dataset, via the multinomial Naive Bayes algorithm (71.1%). The other algorithm accuracies were slightly lower: a long short-term memory (71%), and a support vector machine (70.4%).


2015 ◽  
Vol 32 (6) ◽  
pp. 821-827 ◽  
Author(s):  
Enrique Audain ◽  
Yassel Ramos ◽  
Henning Hermjakob ◽  
Darren R. Flower ◽  
Yasset Perez-Riverol

Abstract Motivation: In any macromolecular polyprotic system—for example protein, DNA or RNA—the isoelectric point—commonly referred to as the pI—can be defined as the point of singularity in a titration curve, corresponding to the solution pH value at which the net overall surface charge—and thus the electrophoretic mobility—of the ampholyte sums to zero. Different modern analytical biochemistry and proteomics methods depend on the isoelectric point as a principal feature for protein and peptide characterization. Protein separation by isoelectric point is a critical part of 2-D gel electrophoresis, a key precursor of proteomics, where discrete spots can be digested in-gel, and proteins subsequently identified by analytical mass spectrometry. Peptide fractionation according to their pI is also widely used in current proteomics sample preparation procedures previous to the LC-MS/MS analysis. Therefore accurate theoretical prediction of pI would expedite such analysis. While such pI calculation is widely used, it remains largely untested, motivating our efforts to benchmark pI prediction methods. Results: Using data from the database PIP-DB and one publically available dataset as our reference gold standard, we have undertaken the benchmarking of pI calculation methods. We find that methods vary in their accuracy and are highly sensitive to the choice of basis set. The machine-learning algorithms, especially the SVM-based algorithm, showed a superior performance when studying peptide mixtures. In general, learning-based pI prediction methods (such as Cofactor, SVM and Branca) require a large training dataset and their resulting performance will strongly depend of the quality of that data. In contrast with Iterative methods, machine-learning algorithms have the advantage of being able to add new features to improve the accuracy of prediction. Contact: [email protected] Availability and Implementation: The software and data are freely available at https://github.com/ypriverol/pIR. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Myeong Gyu Kim ◽  
Jae Hyun Kim ◽  
Kyungim Kim

BACKGROUND Garlic-related misinformation is prevalent whenever a virus outbreak occurs. Again, with the outbreak of coronavirus disease 2019 (COVID-19), garlic-related misinformation is spreading through social media sites, including Twitter. Machine learning-based approaches can be used to detect misinformation from vast tweets. OBJECTIVE This study aimed to develop machine learning algorithms for detecting misinformation on garlic and COVID-19 in Twitter. METHODS This study used 5,929 original tweets mentioning garlic and COVID-19. Tweets were manually labeled as misinformation, accurate information, and others. We tested the following algorithms: k-nearest neighbors; random forest; support vector machine (SVM) with linear, radial, and polynomial kernels; and neural network. Features for machine learning included user-based features (verified account, user type, number of followers, and follower rate) and text-based features (uniform resource locator, negation, sentiment score, Latent Dirichlet Allocation topic probability, number of retweets, and number of favorites). A model with the highest accuracy in the training dataset (70% of overall dataset) was tested using a test dataset (30% of overall dataset). Predictive performance was measured using overall accuracy, sensitivity, specificity, and balanced accuracy. RESULTS SVM with the polynomial kernel model showed the highest accuracy of 0.670. The model also showed a balanced accuracy of 0.757, sensitivity of 0.819, and specificity of 0.696 for misinformation. Important features in the misinformation and accurate information classes included topic 4 (common myths), topic 13 (garlic-specific myths), number of followers, topic 11 (misinformation on social media), and follower rate. Topic 3 (cooking recipes) was the most important feature in the others class. CONCLUSIONS Our SVM model showed good performance in detecting misinformation. The results of our study will help detect misinformation related to garlic and COVID-19. It could also be applied to prevent misinformation related to dietary supplements in the event of a future outbreak of a disease other than COVID-19.


2021 ◽  
Author(s):  
Lianteng Song ◽  
◽  
Zhonghua Liu ◽  
Chaoliu Li ◽  
Congqian Ning ◽  
...  

Geomechanical properties are essential for safe drilling, successful completion, and exploration of both conven-tional and unconventional reservoirs, e.g. deep shale gas and shale oil. Typically, these properties could be calcu-lated from sonic logs. However, in shale reservoirs, it is time-consuming and challenging to obtain reliable log-ging data due to borehole complexity and lacking of in-formation, which often results in log deficiency and high recovery cost of incomplete datasets. In this work, we propose the bidirectional long short-term memory (BiL-STM) which is a supervised neural network algorithm that has been widely used in sequential data-based pre-diction to estimate geomechanical parameters. The pre-diction from log data can be conducted from two differ-ent aspects. 1) Single-Well prediction, the log data from a single well is divided into training data and testing data for cross validation; 2) Cross-Well prediction, a group of wells from the same geographical region are divided into training set and testing set for cross validation, as well. The logs used in this work were collected from 11 wells from Jimusaer Shale, which includes gamma ray, bulk density, resistivity, and etc. We employed 5 vari-ous machine learning algorithms for comparison, among which BiLSTM showed the best performance with an R-squared of more than 90% and an RMSE of less than 10. The predicted results can be directly used to calcu-late geomechanical properties, of which accuracy is also improved in contrast to conventional methods.


Sign in / Sign up

Export Citation Format

Share Document