398 Agreement and reliability of a new respiratory event and arousal detection algorithm against multiple human scorers

SLEEP ◽  
2021 ◽  
Vol 44 (Supplement_2) ◽  
pp. A158-A158
Author(s):  
Ulysses Magalang ◽  
Brendan Keenan ◽  
Bethany Staley ◽  
Marco Ross ◽  
Peter Anderer ◽  
...  

Abstract Introduction Scoring algorithms have the potential to increase polysomnography (PSG) scoring efficiency while also ensuring consistency and reproducibility. We sought to validate an updated event detection algorithm (Somnolyzer; Philips, Monroeville PA USA) against manual scoring, by analyzing a dataset we have previously used to report scoring variability across nine center-members of the Sleep Apnea Global Interdisciplinary Consortium (SAGIC). Methods Fifteen PSGs collected at a single sleep clinic were scored independently by technologists at nine SAGIC centers located in six countries, and auto-scored with the algorithm. Arousals, apneas, and hypopneas were identified according to the American Academy of Sleep Medicine recommended criteria. We calculated the intraclass correlation coefficient (ICC) and performed a Bland-Altman analysis comparing the average manual- and auto-scored apnea-hypopnea index (AHI), arousal index (ArI), apneas, obstructive apneas, central apneas, mixed apneas, and hypopneas. We hypothesized that the values from auto-scoring would show good agreement and reliability when compared to the average across manual scorers. Results Participants contributing to the original dataset had a mean (SD) age of 47 (12) years, AHI of 24.7 (18.2) events/hour, and 80% were male. The ICCs (95% confidence interval) between average manual- and auto-scoring were almost perfect (ICC=0.80–1.00) for AHI [0.989 (0.968, 0.996)], ArI [0.897 (0.729, 0.964)], hypopneas [0.992 (0.978, 0.997)], total apneas [0.973 (0.924, 0.991)], and obstructive apneas [0.919 (0.781, 0.972)], and moderately reliable (ICC=0.40–0.60) for central [0.537 (0.069, 0.815)] and mixed [0.502 (0.021, 0.798)] apneas.
Similarly, Bland-Altman analyses supported good agreement for event detection between techniques, with a mean difference (limits of agreement) of only 1.45 (-3.22, 6.12) events/hour for AHI, total apneas 5.2 (-23.9, 34.3), obstructive apneas 1.8 (-45.9, 49.5), central apneas 1.8 (-9.7, 13.4), mixed apneas 1.6 (-14.8, 17.9), and hypopneas 4.3 (-12.4, 20.9). Conclusion Results support almost perfect reliability between auto-scoring and manual scoring of AHI, ArI, hypopneas, total apneas, and obstructive apneas, as well as moderate reliability for central and mixed apneas. There was good agreement between methods, with small mean differences; wider limits of agreement for specific types of apneas did not affect accuracy of the overall AHI. Thus, the auto-scoring algorithm appears reliable for event detection. Support (if any) Philips
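
The Bland-Altman quantities reported above (mean difference and limits of agreement) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the conventional limits of mean difference ± 1.96 × SD of the paired differences, assumes the difference is taken as auto minus manual, and uses hypothetical AHI values rather than study data. The function name `bland_altman` is also an assumption.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired data.

    Differences are computed as method_a - method_b; the limits of
    agreement are bias +/- z * SD of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical AHI values in events/hour (NOT data from the study).
auto_ahi = [10.0, 22.0, 35.0, 8.0, 41.0]
manual_ahi = [9.0, 20.0, 36.0, 7.0, 38.0]

bias, lower, upper = bland_altman(auto_ahi, manual_ahi)
print(f"mean difference {bias:.2f}, limits of agreement ({lower:.2f}, {upper:.2f})")
```

A small mean difference with wide limits, as reported for obstructive apneas, would indicate little systematic bias but substantial disagreement on individual recordings.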

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Domenico Schiano-Lomoriello ◽  
Kenneth J. Hoffer ◽  
Irene Abicca ◽  
Giacomo Savini

Abstract We assess repeatability of automatic measurements of a new anterior segment optical coherence tomographer and biometer (ANTERION) and their agreement with those provided by an anterior segment-optical coherence tomography device combined with Placido-disk corneal topography (MS-39) and a validated optical biometer (IOLMaster 500). A consecutive series of patients underwent three measurements with ANTERION and one with MS-39. A subgroup of patients underwent biometry also with IOLMaster 500. Repeatability was assessed by means of within-subject standard deviation, coefficient of variation (COV), and intraclass correlation coefficient (ICC). Agreement was investigated with the 95% limits of agreement. Paired t-test and Wilcoxon matched-pairs test were performed to compare the measurements of the different devices. Repeatability of ANTERION measurements was high, with ICC > 0.98 for all parameters except astigmatism (0.963); all parameters apart from those related to astigmatism revealed a COV < 1%. Repeatability of astigmatism improved when only eyes whose keratometric astigmatism was higher than 1.0 D were investigated. Most measurements by ANTERION and MS-39 showed good agreement. No significant differences were found between measurements by ANTERION and IOLMaster, except for corneal diameter. ANTERION revealed high repeatability of automatic measurements and good agreement with both MS-39 and IOLMaster for most parameters.


SLEEP ◽  
2021 ◽  
Vol 44 (Supplement_2) ◽  
pp. A101-A101
Author(s):  
Ulysses Magalang ◽  
Brendan Keenan ◽  
Bethany Staley ◽  
Peter Anderer ◽  
Marco Ross ◽  
...  

Abstract Introduction Scoring algorithms have the potential to increase polysomnography (PSG) scoring efficiency while also ensuring consistency and reproducibility. We sought to validate an updated sleep staging algorithm (Somnolyzer; Philips, Monroeville PA USA) against manual sleep staging, by analyzing a dataset we have previously used to report sleep staging variability across nine center-members of the Sleep Apnea Global Interdisciplinary Consortium (SAGIC). Methods Fifteen PSGs collected at a single sleep clinic were scored independently by technologists at nine SAGIC centers located in six countries, and auto-scored with the algorithm. Each 30-second epoch was staged manually according to American Academy of Sleep Medicine criteria. We calculated the intraclass correlation coefficient (ICC) and performed a Bland-Altman analysis comparing the average manual- and auto-scored total sleep time (TST) and time in each sleep stage (N1, N2, N3, rapid eye movement [REM]). We hypothesized that the values from auto-scoring would show good agreement and reliability when compared to the average across manual scorers. Results The participants contributing to the original dataset had a mean (SD) age of 47 (12) years and 80% were male. Auto-scoring showed substantial (ICC=0.60-0.80) or almost perfect (ICC=0.80-1.00) reliability compared to manual-scoring average, with ICCs (95% confidence interval) of 0.976 (0.931, 0.992) for TST, 0.681 (0.291, 0.879) for time in N1, 0.685 (0.299, 0.881) for time in N2, 0.922 (0.791, 0.973) for time in N3, and 0.930 (0.811, 0.976) for time in REM. Similarly, Bland-Altman analyses showed good agreement between methods, with a mean difference (limits of agreement) of only 1.2 (-19.7, 22.0) minutes for TST, 13.0 (-18.2, 44.1) minutes for N1, -13.8 (-65.7, 38.1) minutes for N2, -0.33 (-26.1, 25.5) minutes for N3, and -1.2 (-25.9, 23.5) minutes for REM. 
Conclusion Results support high reliability and good agreement between the auto-scoring algorithm and average human scoring for measurements of sleep durations. Auto-scoring slightly overestimated N1 and underestimated N2, but results for TST, N3 and REM were nearly identical on average. Thus, the auto-scoring algorithm is acceptable for sleep staging when compared against human scorers. Support (if any) Philips.


2018 ◽  
Vol 21 (6) ◽  
pp. 1036-1042 ◽  
Author(s):  
Xuejun Yin ◽  
Bruce Neal ◽  
Maoyi Tian ◽  
Zhifang Li ◽  
Kristina Petersen ◽  
...  

Abstract Objective Measurement of mean population Na and K intakes typically uses laboratory-based assays, which can add significant logistical burden and costs. A valid field-based measurement method would be a significant advance. In the current study, we used 24 h urine samples to compare estimates of Na, K and Na:K ratio based upon assays done using the field-based Horiba twin meter v. laboratory-based methods. Design The performance of the Horiba twin meter was determined by comparing field-based estimates of mean Na and K against those obtained using laboratory-based methods. The reported 95 % limits of agreement of Bland–Altman plots were calculated based on a regression approach for non-uniform differences. Setting The 24 h urine samples were collected as part of an ongoing study being done in rural China. Subjects One hundred and sixty-six complete 24 h urine samples qualified for estimating 24 h urinary Na and K excretion. Results Mean Na and K excretion were estimated as 170·4 and 37·4 mmol/d, respectively, using the meter-based assays; and 193·4 and 43·8 mmol/d, respectively, using the laboratory-based assays. There was excellent relative reliability (intraclass correlation coefficient) for both Na (0·986) and K (0·986). Bland–Altman plots showed moderate-to-good agreement between the two methods. Conclusions Na and K intake estimations were moderately underestimated using assays based upon the Horiba twin meter. Compared with standard laboratory-based methods, the portable device was more practical and convenient.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
René F. Castien ◽  
Michel W. Coppieters ◽  
Tom S. C. Durge ◽  
Gwendolyne G. M. Scholten-Peeters

Abstract Background Pressure pain thresholds (PPTs) are commonly assessed to quantify mechanical sensitivity in various conditions, including migraine. Digital and analogue algometers are used, but the concurrent validity between these algometers is unknown. Therefore, we assessed the concurrent validity between a digital and analogue algometer to determine PPTs in healthy participants and people with migraine. Methods Twenty-six healthy participants and twenty-nine people with migraine participated in the study. PPTs were measured interictally and bilaterally at the cephalic region (temporal muscle, C1 paraspinal muscles, and trapezius muscle) and extra-cephalic region (extensor carpi radialis muscle and tibialis anterior muscle). PPTs were first determined with a digital algometer, followed by an analogue algometer. Intraclass correlation coefficients (ICC3.1) and limits of agreement were calculated to quantify concurrent validity. Results The concurrent validity between algometers in both groups was moderate to excellent (ICC3.1 ranged from 0.82 to 0.99, with 95%CI: 0.65 to 0.99). Although PPTs measured with the analogue algometer were higher at most locations in both groups (p < 0.05), the mean differences between both devices were less than 18.3 kPa. The variation in methods, such as a hand-held switch (digital algometer) versus verbal commands (analogue algometer) to indicate when the threshold was reached, may explain these differences in scores. The limits of agreement varied per location and between healthy participants and people with migraine. Conclusion The concurrent validity between the digital and analogue algometer is excellent in healthy participants and moderate in people with migraine. Both types of algometer are well-suited for research and clinical practice but are not exchangeable within a study or patient follow-up.
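
For readers unfamiliar with the statistic, the sketch below computes an intraclass correlation from a two-way ANOVA decomposition. It assumes that "ICC3.1" denotes the Shrout-Fleiss ICC(3,1) form (two-way mixed effects, consistency, single measurement), which the abstract does not spell out; the function name `icc_3_1` and the pressure pain threshold values are hypothetical and not taken from the study.

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    scores: list of rows, one row per participant, one column per device.
    """
    n = len(scores)      # participants
    k = len(scores[0])   # devices (or raters)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into participant, device, and error.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical PPTs in kPa: [digital, analogue] per participant.
ppt = [[200, 215], [310, 300], [250, 270], [400, 410]]
icc_noisy = icc_3_1(ppt)

# A constant offset between devices leaves a consistency ICC at 1.0.
icc_offset = icc_3_1([[100, 110], [200, 210], [300, 310]])
print(round(icc_noisy, 3), icc_offset)
```

Because this form measures consistency rather than absolute agreement, a systematic offset between devices does not lower it, which is one reason a high ICC can coexist with a significant mean difference between devices.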


2020 ◽  
Vol 33 (7) ◽  
pp. 845-852
Author(s):  
Theresa Herttrich ◽  
Johann Daxer ◽  
Andreas Hiemisch ◽  
Jens Kluge ◽  
Andreas Merkenschlager ◽  
...  

Abstract Background Accumulating evidence suggests a relationship between sleep alterations and overweight/obesity in children. Our aim was to investigate the association of sleep measures other than obstructive sleep apnea or sleep duration with overweight/obesity and metabolic function in children. Methods We conducted a prospective cohort study in school-aged children (aged 5 to 8 years, prepubertal, and 12 to 15 years, pubertal) with overweight/obesity and normal-weight children. All children underwent a standardized in-laboratory polysomnography followed by a fasting blood assessment for glucose and metabolic testing. Subjective sleep measures were investigated by a 7-day sleep diary and questionnaire. We analyzed prepubertal and pubertal groups separately using logistic regression and partial correlation analyses. Results A total of 151 participants were analyzed. Overweight/obese children had significantly higher odds for arousal index (prepubertal children: 1.28, confidence interval (CI): 1.06, 1.67; pubertal children: 1.65, CI: 1.19, 2.29) than normal-weight children, independent of age and gender. In prepubertal children, arousal index was positively associated with C-peptide (r=0.30, p=0.01), whereas minimum O2 saturation was negatively associated with triglycerides (r=−0.34, p=0.005), adjusting for age and sex. However, associations were attenuated by further adjustment for body mass index standard deviation scores (BMI-SDS). In pubertal children, higher levels of apnea-hypopnea index and pCO2 predicted increased lipoprotein (a) levels (r=0.35, p=0.03 and r=0.40, p=0.01, respectively), independent of age, sex, and BMI-SDS. A negative association was found between pCO2 and high-density lipoprotein (HDL)-cholesterol (r=−0.40, p=0.01). Conclusions Overall, we report that sleep quality as measured by arousal index may be compromised by overweight and obesity in children and warrants attention in future intervention programs.


Author(s):  
Daniel Rojas-Valverde ◽  
José Pino-Ortega ◽  
Rafael Timón ◽  
Randall Gutiérrez-Vargas ◽  
Braulio Sánchez-Ureña ◽  
...  

The extensive use of wearable sensors in sport medicine, exercise medicine, and health has increased the interest in their study. It is therefore necessary to test these technologies’ efficiency, effectiveness, agreement, and reliability in different settings. Consequently, the purpose of this article was to analyze the magnetic, angular rate, and gravity (MARG) sensor’s test-retest agreement and reliability when assessing multiple body segments’ external loads during off-road running. A total of 18 off-road runners (38.78 ± 10.38 years, 73.24 ± 12.6 kg, 172.17 ± 9.48 cm) ran two laps (1st and 2nd Lap) of a 12 km circuit wearing six MARG sensors. The sensors were attached to six different body segments: left (MPLeft) and right (MPRight) malleolus peroneus, left (VLLeft) and right (VLRight) vastus lateralis, lumbar (L1-L3), and thorax (T2-T4) using a special neoprene suit. After a principal component analysis (PCA) was performed, the total data set variance of all body segments was represented by 44.08%–70.64% for the 1st PCA factor considering two variables, Player LoadRT and Impacts, on L1-L3, respectively. These two variables were chosen among three total accelerometry-based external load indicators (ABELIs) to perform the agreement and reliability tests due to their relevance based on PCAs for each body segment. There were no significant differences between laps in the Player LoadRT or Impacts (p > 0.05, trivial). The intraclass correlation and linear correlation showed substantial to almost perfect test-retest consistency in both Player LoadRT and Impacts. Bias and t-test assessments showed good agreement between laps. It can be concluded that MARG sensors offer significant test-retest reliability and good agreement when assessing off-road kinematics in the six different body segments.


2021 ◽  
Author(s):  
A Wallin ◽  
M Kierkegaard ◽  
E Franzén ◽  
S Johansson

Abstract Objective The mini-BESTest is a balance measure for assessment of the underlying physiological systems for balance control in adults. Evaluations of test–retest reliability of the mini-BESTest in larger samples of people with multiple sclerosis (MS) are lacking. The purpose of this study was to investigate test–retest reliability of the mini-BESTest total and section sum scores and individual items in people with mild to moderate overall MS disability. Methods This study used a test–retest design in a movement laboratory setting. Fifty-four people with mild to moderate overall MS disability according to the Expanded Disability Status Scale (EDSS) were included, with 28 in the mild subgroup (EDSS 2.0–3.5) and 26 in the moderate subgroup (EDSS 4.0–5.5). Test–retest reliability of the mini-BESTest was evaluated by repeated measurements taken 1 week apart. Reliability and measurement error were analyzed. Results Test–retest reliability for the total scores was considered good to excellent, with intraclass correlation coefficients of .88 for the whole sample, .83 for the mild MS subgroup, and .80 for the moderate MS subgroup. Measurement errors were small, with standard error of measurement and minimal detectable change of 1.3 and 3.5, respectively, in mild MS, and 1.7 and 4.7, respectively, in moderate MS. The limits of agreement were −3.4 and 4.6. Test–retest reliability for the section scores was fair to good or excellent; weighted kappa values ranged from .62 to .83. All items but 1 showed fair to good or excellent test–retest reliability, and percentage agreement ranged from 61% to 100%. Conclusions The mini-BESTest demonstrated good to excellent test–retest reliability and small measurement errors and is recommended for use in people with mild to moderate MS. Impact Knowledge of limits of agreement and minimal detectable change contributes to interpretability of the mini-BESTest total score.
The findings of this study enhance the clinical usefulness of the test for evaluation of balance control and for designing individually customized balance training with high precision and accuracy in people with MS.
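
The relation between the standard error of measurement (SEM) and the minimal detectable change (MDC) reported above can be written out as below. The abstract does not state which formulas were used, so these are the conventional definitions, given here as an assumption, with SD the between-subject standard deviation and a 95% confidence level for the MDC.

```latex
\mathrm{SEM} = \mathrm{SD}\,\sqrt{1 - \mathrm{ICC}}, \qquad
\mathrm{MDC}_{95} = 1.96 \times \sqrt{2} \times \mathrm{SEM} \approx 2.77 \times \mathrm{SEM}
```

Under this assumption, the moderate MS values are consistent: 2.77 × 1.7 ≈ 4.7. For mild MS, 2.77 × 1.3 ≈ 3.6 against the reported 3.5, a gap that could arise if the SEM was rounded to one decimal before reporting.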


Diagnostics ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 905
Author(s):  
Ahmed Elwali ◽  
Zahra Moussavi

Background: The apnea/hypopnea index (AHI) is the primary outcome of a polysomnography (PSG) assessment for determining obstructive sleep apnea (OSA) severity. However, other OSA severity parameters (e.g., total arousal index, mean oxygen saturation (SpO2%)) are crucial for a full diagnosis of OSA and deciding on a treatment option. PSG assessments and home sleep tests measure these parameters, but there is no screening tool to estimate or predict the OSA severity parameters other than the AHI. In this study, we investigated whether a combination of breathing sounds recorded during wakefulness and anthropometric features could be predictive of PSG parameters. Methods: Anthropometric information and five tracheal breathing sound cycles were recorded during wakefulness from 145 individuals referred to an overnight PSG study. The dataset was divided into training, validation, and blind testing datasets. Spectral and bispectral features of the sounds were evaluated to run correlation and classification analyses with the PSG parameters collected from the PSG sleep reports. Results: Many sound and anthropometric features had significant correlations (up to 0.56) with PSG parameters. Using combinations of sound and anthropometric features in a bilinear model for each PSG parameter resulted in correlation coefficients up to 0.84. Using the evaluated models for classification with a two-class random-forest classifier resulted in a blind testing classification accuracy up to 88.8% for predicting the key PSG parameters such as arousal index. Conclusions: These results add new value to the current OSA screening tools and provide a new promising possibility for predicting PSG parameters using only a few seconds of breathing sounds recorded during wakefulness without conducting an overnight PSG study.

