Incorporating Glucose Variability into Glucose Forecasting Accuracy Assessment Using the New Glucose Variability Impact Index and the Prediction Consistency Index: An LSTM Case Example

2021 ◽  
pp. 193229682110426
Author(s):  
Clara Mosquera-Lopez ◽  
Peter G. Jacobs

Background: In this work, we developed glucose forecasting algorithms trained and evaluated on a large dataset of free-living people with type 1 diabetes (T1D) using closed-loop (CL) and sensor-augmented pump (SAP) therapies, and we demonstrate how glucose variability impacts accuracy. We introduce the glucose variability impact index (GVII) and the glucose prediction consistency index (GPCI) to assess the accuracy of prediction algorithms. Methods: A long short-term memory (LSTM) neural network was designed to predict glucose up to 60 minutes in the future using continuous glucose measurements and insulin data collected from 175 people with T1D (41,318 days) and evaluated on 75 people (11,333 days) from the Tidepool Big Data Donation Dataset. The LSTM was compared with two naïve forecasting algorithms as well as Ridge linear regression and a random forest using root-mean-square error (RMSE); the Parkes error grid quantified clinical accuracy. Regression analysis was used to derive the GVII and GPCI. Results: The LSTM had the highest accuracy and the best GVII and GPCI. RMSE for CL was 19.8 ± 3.2 and 33.2 ± 5.4 mg/dL for 30- and 60-minute prediction horizons, respectively. RMSE for SAP was 19.6 ± 3.8 and 33.1 ± 7.3 mg/dL for 30- and 60-minute prediction horizons, respectively; 99.6% and 97.6% of predictions were within zones A+B of the Parkes error grid at 30- and 60-minute prediction horizons, respectively. Glucose variability was strongly correlated with RMSE (R ≥ 0.64, P < 0.001); the GVII and GPCI provide a means to compare algorithms across datasets with different glucose variability. Conclusions: The LSTM model was accurate on a large real-world free-living dataset. Glucose variability should be considered when assessing prediction accuracy, using indices such as the GVII and GPCI.
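The accuracy metric used above (RMSE) and the regression-on-variability idea behind the GVII can be sketched in a few lines of Python. This is an illustrative simplification, not the authors' implementation: here the least-squares slope of per-subject RMSE against glucose variability (taken as the coefficient of variation) stands in for the variability-impact concept.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted glucose (mg/dL)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def variability_impact_slope(cv_per_subject, rmse_per_subject):
    """Least-squares slope of per-subject RMSE against glucose variability (CV).

    A lower slope means prediction accuracy degrades less as variability
    rises -- a simplified stand-in for the GVII idea, not the paper's formula.
    """
    n = len(cv_per_subject)
    mx = sum(cv_per_subject) / n
    my = sum(rmse_per_subject) / n
    num = sum((x - mx) * (y - my) for x, y in zip(cv_per_subject, rmse_per_subject))
    den = sum((x - mx) ** 2 for x in cv_per_subject)
    return num / den

# Toy data: per-subject RMSE rising with coefficient of variation.
cv = [0.25, 0.30, 0.35, 0.40]
err = [18.0, 21.0, 24.0, 27.0]
print(round(rmse([100, 120, 140], [105, 115, 150]), 2))   # -> 7.07
print(round(variability_impact_slope(cv, err), 2))        # -> 60.0
```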

2020 ◽  
Author(s):  
Joseph Prinable ◽  
Peter Jones ◽  
David Boland ◽  
Alistair McEwan ◽  
Cindy Thamrin

BACKGROUND The ability to continuously monitor breathing metrics may have implications for general health as well as for respiratory conditions such as asthma. However, few studies have focused on breathing, owing to a lack of available wearable technologies. OBJECTIVE To examine the performance of two machine learning algorithms in extracting breathing metrics from a finger-based pulse oximeter, which is amenable to long-term monitoring. METHODS Pulse oximetry data were collected from 11 healthy subjects and 11 subjects with asthma who breathed at a range of controlled respiratory rates. UNET and long short-term memory (LSTM) algorithms were applied to the data, and the results were compared against breathing metrics derived from respiratory inductance plethysmography measured simultaneously as a reference. RESULTS Both the UNET and LSTM models provided breathing metrics that were strongly correlated with those from the reference signal (all p < 0.001, except for the inspiratory:expiratory ratio). The following relative mean biases (95% confidence intervals) were observed (UNET vs LSTM): inspiration time 1.89 (-52.95, 56.74)% vs 1.30 (-52.15, 54.74)%, expiration time -3.70 (-55.21, 47.80)% vs -4.97 (-56.84, 46.89)%, inspiratory:expiratory ratio -4.65 (-87.18, 77.88)% vs -5.30 (-87.07, 76.47)%, inter-breath interval -2.39 (-32.76, 27.97)% vs -3.16 (-33.69, 27.36)%, and respiratory rate 2.99 (-27.04, 33.02)% vs 3.69 (-27.17, 34.56)%. CONCLUSIONS Both machine learning models showed strong correlation and good comparability with the reference, with low bias though wide variability, for deriving breathing metrics in asthma and healthy cohorts. Future efforts should focus on improving the performance of these models, e.g., by increasing the size of the training dataset at the lower breathing rates. CLINICALTRIAL Sydney Local Health District Human Research Ethics Committee (#LNR\16\HAWKE99 ethics approval).
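The relative mean bias with 95% confidence interval reported above is a Bland-Altman-style statistic expressed as a percentage of the reference. A minimal stdlib sketch (the inspiration-time values are toy data, not the study's):

```python
import statistics

def relative_bias_limits(reference, estimate):
    """Relative mean bias and 95% limits of agreement (Bland-Altman style),
    expressed as a percentage of the reference value."""
    rel = [100.0 * (e - r) / r for r, e in zip(reference, estimate)]
    bias = statistics.mean(rel)
    sd = statistics.stdev(rel)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Toy inspiration-time data (seconds): reference vs model estimate.
ref = [1.5, 1.8, 2.0, 1.6, 1.9]
est = [1.55, 1.75, 2.10, 1.60, 1.85]
b, lo, hi = relative_bias_limits(ref, est)
print(round(b, 2), round(lo, 2), round(hi, 2))
```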


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7853
Author(s):  
Aleksej Logacjov ◽  
Kerstin Bach ◽  
Atle Kongsvold ◽  
Hilde Bremseth Bårdstu ◽  
Paul Jarle Mork

Existing accelerometer-based human activity recognition (HAR) benchmark datasets recorded during free living suffer from non-fixed sensor placement, the use of only one sensor, and unreliable annotations. We make two contributions in this work. First, we present the publicly available Human Activity Recognition Trondheim dataset (HARTH). Twenty-two participants were recorded for 90 to 120 min during their regular working hours using two three-axial accelerometers, attached to the thigh and lower back, and a chest-mounted camera. Experts annotated the data independently using the camera’s video signal and achieved high inter-rater agreement (Fleiss’ Kappa = 0.96); they labeled twelve activities. The second contribution of this paper is the training of seven different baseline machine learning models for HAR on our dataset: a support vector machine, k-nearest neighbors, a random forest, extreme gradient boosting, a convolutional neural network, a bidirectional long short-term memory network, and a convolutional neural network with multi-resolution blocks. The support vector machine achieved the best results, with an F1-score of 0.81 (standard deviation: ±0.18), recall of 0.85 ± 0.13, and precision of 0.79 ± 0.22 in leave-one-subject-out cross-validation. Our high-quality recordings and annotations provide a promising benchmark dataset for researchers to develop innovative machine learning approaches for precise HAR in free living.
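The leave-one-subject-out evaluation scheme used for these baselines can be sketched with a small stdlib helper; the tuple layout and subject IDs below are illustrative, not taken from HARTH.

```python
def leave_one_subject_out(samples):
    """Yield (held_out, train, test) splits where each test fold holds all
    samples of exactly one subject, so no subject appears in both sets.

    `samples` is a list of (subject_id, features, label) tuples.
    """
    subjects = sorted({s for s, _, _ in samples})
    for held_out in subjects:
        train = [x for x in samples if x[0] != held_out]
        test = [x for x in samples if x[0] == held_out]
        yield held_out, train, test

# Toy data: three subjects with one or two windows each.
data = [
    ("s1", [0.1], "walk"), ("s1", [0.9], "sit"),
    ("s2", [0.2], "walk"), ("s3", [0.8], "sit"),
]
folds = list(leave_one_subject_out(data))
for subj, train, test in folds:
    print(subj, len(train), len(test))
```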


Author(s):  
Henri Honka ◽  
Janet Chuang ◽  
David D’Alessio ◽  
Marzieh Salehi

Abstract Context Gastric bypass (GB) increases postprandial glucose excursion, which in turn can predispose to the late complication of hypoglycemia. Diagnosis remains challenging and requires documentation of symptoms associated with low glucose and relief of symptoms when glucose is normalized (Whipple’s triad). Objective To compare the yield of the mixed meal test (MMT) and a continuous glucose monitoring system (CGMS) in detecting hypoglycemia after GB. Setting The study was conducted at the General Clinical Research Unit, Cincinnati Children’s Hospital (Cincinnati, OH, United States). Methods Glucose profiles were evaluated in 15 patients with documented recurrent clinical hypoglycemia after GB, 8 matched asymptomatic GB subjects, and 9 healthy weight-matched non-operated controls, using the MMT in a controlled setting and CGMS under free-living conditions. Results Patients with prior GB had larger glucose variability during both MMT and CGMS than non-surgical controls, regardless of their hypoglycemic status. The sensitivity (71% vs. 47%) and specificity (100% vs. 88%) of the MMT in detecting hypoglycemia were superior to those of CGMS. Conclusions Our findings indicate that fixed carbohydrate ingestion during the MMT is a more reliable test for diagnosing GB-related hypoglycemia than CGMS under free-living conditions.
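The sensitivity and specificity figures above reduce to simple counts over a confusion table. A minimal sketch; the toy cohort sizes are chosen only to mirror the MMT figures and are not taken from the study:

```python
def sensitivity_specificity(truth, detected):
    """Sensitivity and specificity of a screening test against
    clinically confirmed hypoglycemia status (boolean lists)."""
    tp = sum(1 for t, d in zip(truth, detected) if t and d)
    fn = sum(1 for t, d in zip(truth, detected) if t and not d)
    tn = sum(1 for t, d in zip(truth, detected) if not t and not d)
    fp = sum(1 for t, d in zip(truth, detected) if not t and d)
    return tp / (tp + fn), tn / (tn + fp)

# Toy cohort: 7 confirmed hypoglycemia cases, 8 without; the test flags 5 of 7
# cases and none of the non-cases.
truth    = [True] * 7 + [False] * 8
detected = [True] * 5 + [False] * 2 + [False] * 8
sens, spec = sensitivity_specificity(truth, detected)
print(round(100 * sens), round(100 * spec))   # -> 71 100
```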


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7195
Author(s):  
Yashi Nan ◽  
Nigel H. Lovell ◽  
Stephen J. Redmond ◽  
Kejia Wang ◽  
Kim Delbaere ◽  
...  

Activity recognition can provide useful information about an older individual’s activity level and encourage older people to become more active to live longer in good health. This study aimed to develop an activity recognition algorithm for smartphone accelerometry data of older people. Deep learning algorithms, including convolutional neural network (CNN) and long short-term memory (LSTM), were evaluated in this study. Smartphone accelerometry data of free-living activities, performed by 53 older people (83.8 ± 3.8 years; 38 male) under standardized circumstances, were classified into lying, sitting, standing, transition, walking, walking upstairs, and walking downstairs. A 1D CNN, a multichannel CNN, a CNN-LSTM, and a multichannel CNN-LSTM model were tested. The models were compared on accuracy and computational efficiency. Results show that the multichannel CNN-LSTM model achieved the best classification results, with an 81.1% accuracy and an acceptable model and time complexity. Specifically, the accuracy was 67.0% for lying, 70.7% for sitting, 88.4% for standing, 78.2% for transitions, 88.7% for walking, 65.7% for walking downstairs, and 68.7% for walking upstairs. The findings indicated that the multichannel CNN-LSTM model was feasible for smartphone-based activity recognition in older people.
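The per-activity accuracies reported above are per-class recalls: for each activity, the fraction of its windows that were classified correctly. A minimal stdlib sketch with toy labels (not the study's data):

```python
from collections import defaultdict

def per_class_accuracy(y_true, y_pred):
    """Per-activity recall: the fraction of each activity's windows
    that were classified correctly."""
    total, correct = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return {c: correct[c] / total[c] for c in total}

# Toy window labels vs classifier output.
truth = ["lying", "lying", "sitting", "walking", "walking", "walking"]
pred  = ["lying", "sitting", "sitting", "walking", "walking", "standing"]
acc = per_class_accuracy(truth, pred)
print({c: round(v, 2) for c, v in acc.items()})
```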


2020 ◽  
Vol 15 (1) ◽  
pp. 76-81 ◽  
Author(s):  
Michael Stedman ◽  
Rustam Rea ◽  
Christopher J. Duff ◽  
Mark Livingston ◽  
Gabriela Moreno ◽  
...  

Background: The National Health Service spends £170 million on blood glucose monitoring (BGM) strips each year, and there are pressures to use cheaper, less accurate strips. Technology is also being used to increase test frequency, with less focus on accuracy. Previous modeling/real-world data analysis highlighted that actual blood glucose variability can be more than twice the blood glucose meter reported variability (BGMV). We applied those results to the Parkes error grid to highlight the potential clinical impact. Method: BGMV is defined as the percent deviation from reference that contains 95% of results. Four categories were modeled: laboratory (<5%), high-accuracy strips (<10%), ISO 2013 (<15%), and ISO 2003 (<20%), the last of which includes some strips still in use. The Parkes error grid, with its associated risk categories including “alter clinical decision” and “affect clinical outcomes,” was used, with the profile of frequency of expected results fitted into each BGM accuracy category. Results: Applied to single readings, almost all strip accuracy ranges derived in a controlled setting fell within the clinically accurate/no effect on outcomes zones. However, when the possible blood glucose distribution was modeled in more detail, 30.6% of longer-term results for strips with current ISO accuracy fell into the “alter clinical action” category; for previous ISO strips this rose to 44.1%, and for the latest higher-accuracy strips it fell to 12.8%. Conclusion: There is a minimum standard of accuracy needed to ensure that clinical outcomes are not put at risk. This study highlights the potential for amplification of imprecision with less accurate BGM strips.
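The accuracy categories above can be explored by simulation: if 95% of a strip's readings fall within ±p% of the true value, a normally distributed relative error with sigma = p/1.96 reproduces that bound. The normality assumption is this sketch's, not necessarily the authors' model:

```python
import random

def simulate_meter_readings(true_glucose, accuracy_pct, n=10_000, seed=0):
    """Simulate meter readings whose 95% error bound is +/- accuracy_pct of
    the true value, assuming normally distributed relative error
    (so sigma = accuracy_pct / 1.96)."""
    rng = random.Random(seed)
    sigma = accuracy_pct / 1.96
    return [true_glucose * (1 + rng.gauss(0, sigma) / 100) for _ in range(n)]

# An ISO 2013 strip (<15%) reading a true glucose of 100 mg/dL: roughly 95%
# of simulated readings should land within +/-15 mg/dL.
readings = simulate_meter_readings(100.0, 15.0)
within = sum(abs(r - 100.0) <= 15.0 for r in readings) / len(readings)
print(within > 0.94)   # -> True
```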


2020 ◽  
Vol 2020 ◽  
pp. 1-7 ◽  
Author(s):  
Hui Li ◽  
Jinjin Hua ◽  
Jinqiu Li ◽  
Geng Li

This paper analyzes the development of data mining and of fifth-generation (5G) technology for the Internet of Things (IoT), and uses a deep learning method for stock forecasting. To address problems such as low accuracy and high training complexity caused by complicated data in stock forecasting, we propose a forecasting method based on feature selection (FS) and the Long Short-Term Memory (LSTM) algorithm to predict the closing price of stocks. Considering its future potential application, this paper takes four stocks from the Shenzhen Component Index as examples and constructs the feature set for prediction from 17 technical indicators commonly used in the stock market. The optimal feature set is selected via FS to reduce the dimension of the data and the training complexity, and the LSTM algorithm is used to forecast the closing price. The empirical results show that, compared with the plain LSTM model, the FS-LSTM combination model improves prediction accuracy and reduces the error between the real and forecast values in stock price prediction.
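A filter-style feature selection step of the kind described, ranking candidate technical indicators by correlation with the closing price and keeping the strongest, can be sketched as follows. The ranking criterion and indicator names are illustrative; the paper does not specify this exact FS procedure.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    dx = sum((a - mx) ** 2 for a in x) ** 0.5
    dy = sum((b - my) ** 2 for b in y) ** 0.5
    return num / (dx * dy)

def select_features(features, target, k):
    """Rank candidate indicators by |correlation| with the closing price
    and keep the top k -- a simple filter-style FS step."""
    ranked = sorted(features.items(), key=lambda kv: -abs(pearson(kv[1], target)))
    return [name for name, _ in ranked[:k]]

close = [10.0, 10.5, 10.2, 11.0, 11.4]
indicators = {
    "ma5":   [9.9, 10.3, 10.4, 10.8, 11.2],   # tracks the price closely
    "noise": [1.0, -2.0, 3.0, -1.0, 0.5],     # unrelated series
}
print(select_features(indicators, close, 1))   # -> ['ma5']
```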


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 215
Author(s):  
Quanzeng Wang ◽  
Yangling Zhou ◽  
Pejman Ghassemi ◽  
David McBride ◽  
Jon P. Casamento ◽  
...  

Infrared thermographs (IRTs) implemented according to standardized best practices have shown strong potential for detecting elevated body temperatures (EBT), which may be useful in clinical settings and during infectious disease epidemics. However, optimal IRT calibration methods have not been established, and the clinical performance of these devices relative to the more common non-contact infrared thermometers (NCITs) remains unclear. In addition to confirming the findings of our preliminary analysis of clinical study results, the primary intent of this study was to compare methods for IRT calibration and identify best practices for assessing the performance of IRTs intended to detect EBT. A key secondary aim was to compare IRT clinical accuracy to that of NCITs. We performed a clinical thermographic imaging study of more than 1000 subjects, acquiring temperature data from several facial locations that, along with reference oral temperatures, were used to calibrate two IRT systems based on seven different regression methods. Oral temperatures imputed from facial data were used to evaluate IRT clinical accuracy based on metrics such as clinical bias (Δcb), repeatability, root-mean-square difference, and sensitivity/specificity. We proposed several calibration approaches designed to account for the non-uniform data density across the temperature range; a constant-offset approach tended to show better ability to detect EBT. As in our prior study, inner-canthi or full-face maximum temperatures provided the highest clinical accuracy. With an optimal calibration approach, these methods achieved a Δcb within ±0.03 °C with a standard deviation (σΔcb) of less than 0.3 °C, and sensitivity/specificity between 84% and 94%. Results of forehead-center measurements with NCITs or IRTs indicated reduced performance. An analysis of the complete clinical data set confirms the essential findings of our preliminary evaluation, with minor differences. Our findings provide novel insights into methods and metrics for the clinical accuracy assessment of IRTs. Furthermore, our results indicate that calibration approaches providing the highest clinical accuracy in the 37–38.5 °C range may be most effective for measuring EBT. While device performance depends on many factors, IRTs can provide superior performance to NCITs.
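The constant-offset calibration idea can be illustrated with a minimal sketch: fit a single offset so calibrated readings match the reference oral temperatures on average, then report the residual clinical bias and its spread. The temperatures below are toy values, not study data, and the paper's actual calibration procedures are more elaborate.

```python
import statistics

def constant_offset_calibration(measured, reference):
    """Fit a single offset so calibrated = measured + offset matches the
    reference oral temperatures on average; return the offset, the residual
    clinical bias, and its standard deviation."""
    offset = statistics.mean(r - m for r, m in zip(reference, measured))
    calibrated = [m + offset for m in measured]
    diffs = [c - r for c, r in zip(calibrated, reference)]
    return offset, statistics.mean(diffs), statistics.stdev(diffs)

# Toy facial-IRT temperatures vs reference oral temperatures (degC).
measured  = [36.0, 36.2, 36.5, 37.4, 38.1]
reference = [36.8, 36.9, 37.2, 38.0, 38.9]
offset, bias, sd = constant_offset_calibration(measured, reference)
print(round(offset, 2), abs(bias) < 1e-9, round(sd, 3))
```

By construction the residual bias is zero; what remains is the subject-to-subject spread (the σΔcb analogue).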


1953 ◽  
Vol 4 (2) ◽  
pp. 204 ◽  
Author(s):  
FHW Morley

A study of published data on the fold scores of certain breed crosses, backcrosses, and filial generations suggests that causes of variation in skinfold score act geometrically. A logarithmic transformation increased the accuracy of prediction of F2 and backcross scores. Data from selection experiments on Merinos at Trangie were analysed using both arithmetic and logarithmic scales. Heritability of breech fold score was estimated as 0.45 on the arithmetic scale and 0.55 on the logarithmic scale. The mean and variance within groups of Merinos with different means were strongly correlated on the arithmetic scale, but this correlation was removed by the logarithmic transformation, the resulting variances being approximately constant. Freedom from folds showed strong potence on both arithmetic and logarithmic scales. The theoretical implications of potence and geometric action appear to be confirmed by the available data.
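The variance-stabilizing effect of the logarithmic transformation described above can be seen in a toy example: when one group's scores are a constant multiple of another's (variation acting geometrically), the raw-scale variances differ by the square of that multiple, while the log-scale variances coincide.

```python
import math
import statistics

# Two toy groups whose scores act geometrically: the high-mean group's
# values are exactly 4x the low-mean group's, so its spread scales too.
low  = [1.0, 1.2, 1.5, 1.8, 2.0]
high = [4.0, 4.8, 6.0, 7.2, 8.0]

raw_ratio = statistics.pvariance(high) / statistics.pvariance(low)
log_ratio = (statistics.pvariance([math.log(x) for x in high])
             / statistics.pvariance([math.log(x) for x in low]))
print(round(raw_ratio, 1), round(log_ratio, 6))   # -> 16.0 1.0
```

On the log scale the multiplicative factor becomes an additive shift, which leaves the variance unchanged; this is the mechanism by which the transformation removes the mean-variance correlation.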


1997 ◽  
Vol 84 (1) ◽  
pp. 99-105 ◽  
Author(s):  
Erhan Nalçaci ◽  
Metehan Çiçek ◽  
Canan Kalaycioglu ◽  
Sema Yavuzer

The effect of sex on the phenomenon of pseudoneglect was assessed in 60 male and 61 female right-handed subjects using a modified form of Corsi's block-tapping test. A significant right-lateralized pseudoneglect was found for both sexes, and the level of pseudoneglect was strongly correlated with neglect in the right hemispace. Men were significantly more accurate in the left hemispace than women, whereas no difference was seen between the sexes in the right hemispace. Although we found some indirect evidence from which to infer that men's brains may be functionally more lateralized than women's for this spatial task, there was no significant difference between the sexes in correct responses for the left hemispace, i.e., right pseudoneglect. Therefore, the results suggest that the pseudoneglect phenomenon can be partly explained by a functional asymmetric feature of the brain, and that other factors probably play a role in producing the similar patterns of asymmetric perception of space in males and females.

