Reliability and Validity of Commercially Available Wearable Devices for Measuring Steps, Energy Expenditure, and Heart Rate: Systematic Review (Preprint)

2020 ◽  
Author(s):  
Daniel Fuller ◽  
Emily Colwell ◽  
Jonathan Low ◽  
Kassia Orychock ◽  
Melissa Ann Tobin ◽  
...  

BACKGROUND Consumer-wearable activity trackers are small electronic devices that record fitness and health-related measures. OBJECTIVE The purpose of this systematic review was to examine the validity and reliability of commercial wearables in measuring step count, heart rate, and energy expenditure. METHODS We identified devices to be included in the review. Database searches were conducted in PubMed, Embase, and SPORTDiscus, and only articles published in the English language up to May 2019 were considered. Studies were excluded if they did not identify the device used or if they did not examine the validity or reliability of the device. Studies involving the general population and all special populations were included. We operationalized validity as criterion validity (as compared with other measures) and construct validity (degree to which the device is measuring what it claims). Reliability measures focused on intradevice and interdevice reliability. RESULTS We included 158 publications examining nine different commercial wearable device brands. Fitbit was by far the most studied brand. In laboratory-based settings, Fitbit, Apple Watch, and Samsung appeared to measure steps accurately. Heart rate measurement was more variable, with Apple Watch and Garmin being the most accurate and Fitbit tending toward underestimation. For energy expenditure, no brand was accurate. We also examined validity between devices within a specific brand. CONCLUSIONS Commercial wearable devices are accurate for measuring steps and heart rate in laboratory-based settings, but this varies by the manufacturer and device type. Devices are constantly being upgraded and redesigned to new models, suggesting the need for more current reviews and research.

10.2196/18694 ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. e18694 ◽  


2020 ◽  
Vol 3 (2) ◽  
pp. 170-185 ◽  
Author(s):  
Kelly R. Evenson ◽  
Camden L. Spade

Purpose: A systematic review to summarize the validity and reliability of steps, distance, energy expenditure, speed, elevation, heart rate, and sleep assessed by Garmin activity trackers. Methods: Searches included studies published through December 31, 2018. Correlation coefficients (CC) were assessed as low (<0.60), moderate (0.60 to <0.75), good (0.75 to <0.90), or excellent (≥0.90). Mean absolute percentage errors (MAPE) were assessed as acceptable at <5% in controlled conditions and <10% for free-living conditions. Results: Overall, 32 studies of adults documented validity. Four of these studies also documented reliability. The sample size ranged from 1–95 for validity and 4–31 for reliability testing. Step inter- and intra-reliability was good-to-excellent and speed intra-reliability was excellent. No other features were explored for reliability. Step validity, across 16 studies, generally indicated good-to-excellent CC and acceptable MAPE. Distance validity, tested in three studies, generally indicated poor CC and MAPE that exceeded acceptable limits, with both over and underestimation. Energy expenditure validity, across 12 studies, generally indicated wide variability in CC and MAPE that exceeded acceptable limits. Heart rate validity in five studies had low-to-excellent CC and all MAPE exceeded acceptable limits. Speed, elevation, and sleep validity were assessed in only one or two studies each; for sleep, the criterion relied on self-report rather than polysomnography. Conclusion: This systematic review of Garmin activity trackers among adults indicated higher validity of steps; few studies on speed, elevation, and sleep; and lower validity for distance, energy expenditure, and heart rate. Intra- and inter-device feature reliability needs further testing.
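The cut-points stated in this abstract can be written down as a small lookup. The following is a minimal sketch only: the function names `classify_cc` and `mape_acceptable` are hypothetical, and it assumes correlation coefficients are non-negative and MAPE is given in percent.

```python
def classify_cc(cc: float) -> str:
    """Label a correlation coefficient using the cut-points quoted in the abstract.

    Assumes a non-negative coefficient (hypothetical helper, not from the review).
    """
    if cc >= 0.90:
        return "excellent"
    if cc >= 0.75:
        return "good"
    if cc >= 0.60:
        return "moderate"
    return "low"


def mape_acceptable(mape_percent: float, free_living: bool) -> bool:
    """Apply the abstract's limits: <5% in controlled conditions, <10% free-living."""
    limit = 10.0 if free_living else 5.0
    return mape_percent < limit


if __name__ == "__main__":
    print(classify_cc(0.82))                       # falls in the 0.75 to <0.90 band
    print(mape_acceptable(7.0, free_living=True))  # below the free-living limit
```

Both thresholds are strict inequalities, so a MAPE of exactly 5% in controlled conditions is treated as exceeding the limit.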


2021 ◽  
Author(s):  
Federico Germini ◽  
Noella Noronha ◽  
Victoria Borg Debono ◽  
Binu Abraham Phillip ◽  
Drashti Pete ◽  
...  

BACKGROUND Numerous wrist-wearable devices to measure physical activity are currently available, but little is known about how they compare in terms of acceptability and accuracy. OBJECTIVE We performed a systematic review of the literature to assess the acceptability (defined as the level to which a device is tolerated and used by the user) and accuracy of wrist-wearable activity trackers. METHODS We searched MEDLINE, EMBASE, the Cochrane Central Register of Controlled Trials (CENTRAL), and SPORTDiscus for studies measuring physical activity in the general population using wrist-wearable activity trackers. We screened articles for inclusion and, for included studies, reported data on the studies’ setting and population, outcome measured, and risk of bias. RESULTS A total of 65 articles were included in our review. Acceptability was most frequently measured through data availability and wearing time. Data availability was ≥ 75% for Fitbit Charge HR, Fitbit Flex 2, and Garmin Vivofit. The wearing time was 89% for both GENEActiv and Nike Fuelband. Accuracy was assessed for 14 different outcomes, which can be classified into the following categories: count of specific activities (including step counts), time spent being active, intensity of physical activity (including energy expenditure), heart rate, distance, and speed. Substantial clinical heterogeneity precluded a meta-analysis of the results. The outcomes assessed most frequently were step counts, heart rate, and energy expenditure. For step counts, Fitbit Charge (or Charge HR) had a mean absolute percentage error (MAPE) < 25% across 20 studies. For heart rate, Apple Watch had a MAPE < 10% in 2 studies. For energy expenditure, the MAPE was > 30% for all brands, showing poor accuracy across devices. CONCLUSIONS Fitbit Charge and Charge HR were consistently shown to have good accuracy for step counts, and Apple Watch for measuring heart rate. None of the tested devices proved to be accurate in measuring energy expenditure. Efforts should be made to reduce the heterogeneity between studies.


Author(s):  
Abdullah Bandar Alansare ◽  
Lauren C. Bates ◽  
Lee Stoner ◽  
Christopher E. Kline ◽  
Elizabeth Nagle ◽  
...  

Purpose: To evaluate whether sedentary time (ST) is associated with heart rate (HR) and heart rate variability (HRV) in adults. Methods: We systematically searched PubMed and Google Scholar through June 2020. Inclusion criteria were observational design, humans, adults, English language, ST as the exposure, resting HR/HRV as the outcome, and (meta-analysis only) availability of the quantitative association with variability. After qualitative synthesis, meta-analysis used inverse variance heterogeneity models to estimate pooled associations. Results: Thirteen and eight articles met the criteria for the systematic review and meta-analysis, respectively. All studies were cross-sectional, and few used gold standard ST or HRV assessment methodology. The qualitative synthesis suggested no associations between ST and HR/HRV. The meta-analysis found a significant association between ST and HR (β = 0.24 bpm per hour ST; CI: 0.10, 0.37) that was stronger in males (β = 0.36 bpm per hour ST; CI: 0.19, 0.53). Pooled associations between ST and HRV indices were non-significant (p > 0.05). Substantial heterogeneity was detected. Conclusions: The limited available evidence suggests an unfavorable but not clinically meaningful association between ST and HR, but no association with HRV. Future longitudinal studies assessing ST with thigh-based monitoring and HRV with electrocardiogram are needed.
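To illustrate the general idea of pooling study-level associations by inverse variance, here is a minimal sketch. It implements plain inverse-variance weighting and is not the inverse variance heterogeneity model the authors used, which additionally adjusts the variance for between-study heterogeneity. The function name `pool_inverse_variance` and the input values are hypothetical.

```python
from math import sqrt


def pool_inverse_variance(betas, standard_errors):
    """Pool effect estimates with weights 1/SE^2 (simplified illustration only).

    Returns the pooled estimate, its standard error, and a 95% confidence interval.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("betas and standard_errors must be non-empty and equal length")
    weights = [1.0 / (se ** 2) for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci


if __name__ == "__main__":
    # Hypothetical per-study slopes (bpm per hour of sedentary time) and standard errors.
    print(pool_inverse_variance([0.2, 0.4], [0.1, 0.1]))
```

More precise studies (smaller standard errors) receive larger weights, so they pull the pooled estimate toward their own value.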


Author(s):  
Junqing Xie ◽  
Dong Wen ◽  
Lizhong Liang ◽  
Yuxi Jia ◽  
Li Gao ◽  
...  

BACKGROUND Wearable devices have attracted much attention from the market in recent years for their fitness monitoring and other health-related metrics; however, the accuracy of fitness tracking results still plays a major role in health promotion. OBJECTIVE The aim of this study was to evaluate the accuracy of a range of recent wearable devices in measuring fitness-related indicators under various seminatural activities. METHODS A total of 44 healthy subjects were recruited, and each subject was asked to simultaneously wear 6 devices (Apple Watch 2, Samsung Gear S3, Jawbone Up3, Fitbit Surge, Huawei Talk Band B3, and Xiaomi Mi Band 2) and 2 smartphone apps (Dongdong and Ledongli) to measure five major health indicators (heart rate, number of steps, distance, energy consumption, and sleep duration) under various activity states (resting, walking, running, cycling, and sleeping). The measurements were then compared with the gold standard (manual measurements of the heart rate, number of steps, distance, and sleep, and energy consumption through oxygen consumption) to determine their respective mean absolute percentage errors (MAPEs). RESULTS Wearable devices had a rather high measurement accuracy with respect to heart rate, number of steps, distance, and sleep duration, with a MAPE of approximately 0.10, whereas poor measurement accuracy was observed for energy consumption (calories), indicated by a MAPE of up to 0.44. The measurements varied for the same indicator measured by different fitness trackers. The variation in measurement of the number of steps was the highest (Apple Watch 2: 0.42; Dongdong: 0.01), whereas it was the lowest for heart rate (Samsung Gear S3: 0.34; Xiaomi Mi Band 2: 0.12). Measurements differed insignificantly for the same indicator measured under different states of activity; the MAPE of distance and energy measurements were in the range of 0.08 to 0.17 and 0.41 to 0.48, respectively. Overall, the Samsung Gear S3 performed the best for the measurement of heart rate under the resting state (MAPE of 0.04), whereas Dongdong performed the best for the measurement of the number of steps under the walking state (MAPE of 0.01). Fitbit Surge performed the best for distance measurement under the cycling state (MAPE of 0.04), and Huawei Talk Band B3 performed the best for energy consumption measurement under the walking state (MAPE of 0.17). CONCLUSIONS At present, mainstream devices are able to reliably measure heart rate, number of steps, distance, and sleep duration, which can be used as effective health evaluation indicators, but the measurement accuracy of energy consumption is still inadequate. Fitness trackers of different brands vary with regard to measurement of indicators and are all affected by the activity state, which indicates that manufacturers of fitness trackers need to improve their algorithms for different activity states.
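The mean absolute percentage error used throughout this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `mape` and the sample readings are hypothetical, and the result is returned as a fraction (0.10 corresponds to 10%), matching how the abstract reports it.

```python
def mape(device_values, criterion_values):
    """Mean absolute percentage error of device readings against a criterion measure.

    Returned as a fraction, e.g. 0.10 means 10%. Criterion values must be non-zero.
    """
    if len(device_values) != len(criterion_values) or not device_values:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(c == 0 for c in criterion_values):
        raise ValueError("criterion values must be non-zero")
    errors = [abs(d - c) / abs(c) for d, c in zip(device_values, criterion_values)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical step counts: tracker readings vs manually counted steps.
    tracker = [95, 110, 100]
    counted = [100, 100, 100]
    print(round(mape(tracker, counted), 4))
```

Because the error is taken as an absolute value before averaging, over- and underestimation do not cancel out, which is why MAPE is commonly paired with a signed bias measure.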


2019 ◽  
Vol 11 (5) ◽  
pp. 409-415 ◽  
Author(s):  
Fábio Carlos Lucas de Oliveira ◽  
Anny Fredette ◽  
Sherezada Ochoa Echeverría ◽  
Charles Sebiyo Batcho ◽  
Jean-Sébastien Roy

Context: Two-dimensional (2D) video-based analysis is often used by clinicians to examine the foot strike pattern (FSP) and step rate in runners. Reliability and validity of 2D video-based analysis have been questioned. Objective: To synthesize the psychometric properties of 2D video-based analysis for assessing runners’ FSP and step rate while running. Data Sources: Medline/PubMed, Science Direct, Embase, EBSCOHost/CINAHL, and Scielo were searched from their inception to August 2018. Study Selection: Studies were included if (1) they were published in English, French, Portuguese or Spanish; (2) they reported at least 1 psychometric property (validity and/or reliability) of 2D video-based analysis to assess running kinematics; and (3) they assessed FSP or step rate during running. Study Design: Systematic review. Level of Evidence: Level 2. Data Extraction: Studies were screened for methodological (MacDermid checklist) and psychometric quality (COSMIN checklist) by 2 independent raters. Results: Eight studies, with a total of 702 participants, were included. Seven studies evaluated the reliability of 2D video to assess FSP and found very good to excellent reliability (0.41 ≤ κ ≤ 1.00). Two studies reported excellent reliability for the calculation of step rate (0.75 ≤ intraclass correlation coefficient [ICC] ≤ 1.00). One study demonstrated excellent concurrent validity between 2D and 3D (gold standard) motion capture systems to determine FSP (Gwet agreement coefficient [AC] > 0.90; ICC > 0.90), and another study found excellent concurrent validity between 2D video and another device to calculate step rate (0.84 ≤ ICC ≤ 0.95). Conclusion: Strong evidence suggests that 2D video-based analysis is a reliable method for assessing FSP and quantifying step rate, regardless of the experience of the assessor. Limited evidence exists on the validity of 2D video-based analysis in determining FSP and calculating step rate during running.
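The κ statistic reported for foot strike pattern classification measures agreement between raters beyond what chance would produce. A minimal sketch of Cohen's kappa for two raters is shown below; the function name `cohen_kappa` and the example labels are hypothetical and do not come from the reviewed studies.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical foot strike ratings of four runners by two assessors.
    a = ["rearfoot", "rearfoot", "forefoot", "forefoot"]
    b = ["rearfoot", "rearfoot", "forefoot", "rearfoot"]
    print(cohen_kappa(a, b))
```

In this example the raters agree on 3 of 4 runners, but half of that agreement is expected by chance, so kappa is lower than the raw agreement rate.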


2014 ◽  
Vol 11 (3) ◽  
pp. 654-664 ◽  
Author(s):  
Andreas Wolff Hansen ◽  
Inger Dahl-Petersen ◽  
Jørn Wulff Helge ◽  
Søren Brage ◽  
Morten Grønbæk ◽  
...  

Background: The International Physical Activity Questionnaire (IPAQ) is commonly used in surveys, but its reliability and validity have not been established in the Danish population. Methods: Among participants in the Danish Health Examination survey 2007–2008, 142 healthy participants (45% men) wore a unit that combined accelerometry and heart rate monitoring (Acc+HR) for 7 consecutive days and then completed the IPAQ. Background data were obtained from the survey. Physical activity energy expenditure (PAEE) and time in moderate, vigorous, and sedentary intensity levels were derived from the IPAQ and compared with estimates from Acc+HR using Spearman’s correlation coefficients and Bland-Altman plots. Repeatability of the IPAQ was also assessed. Results: PAEE from the 2 methods was significantly positively correlated (0.29 and 0.49; P = 0.02 and P < 0.001; for women and men, respectively). Men significantly overestimated PAEE by IPAQ (56.2 vs 45.3 kJ/kg/day for IPAQ vs Acc+HR; P < 0.01), while the difference was nonsignificant for women (40.8 vs 44.4 kJ/kg/day). Bland-Altman plots showed that the IPAQ overestimated PAEE, moderate, and vigorous activity without systematic error. Reliability of the IPAQ was moderate to high for all domains and intensities (total PAEE intraclass correlation coefficient = 0.58). Conclusions: This Danish Internet-based version of the long IPAQ had modest validity and reliability when assessing PAEE at population level.
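The numbers behind a Bland-Altman plot are the mean difference between two methods (the bias) and the 95% limits of agreement around it. A minimal sketch follows; the function name `bland_altman` and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of (a - b); the limits of agreement are bias +/- 1.96 SD
    of the paired differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


if __name__ == "__main__":
    # Hypothetical PAEE estimates (kJ/kg/day): questionnaire vs combined sensor.
    questionnaire = [10, 12, 14, 16]
    sensor = [9, 10, 13, 14]
    print(bland_altman(questionnaire, sensor))
```

A positive bias here means the first method reads higher on average; whether the differences grow with the magnitude of the measurement has to be judged from the plot itself.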


2009 ◽  
Vol 23 (2) ◽  
pp. 107-132 ◽  
Author(s):  
Rosemary J. Herbert ◽  
Anita J. Gagnon ◽  
Janet E. Rennick ◽  
Jennifer L. O’Loughlin

The objective of this systematic review was to identify questionnaires that measure health-related empowerment in adults or families and that demonstrated the best evidence of reliability and validity. A search of nine databases identified 8,269 abstracts that referred to empowerment. Full article review was completed for abstracts that met the inclusion criteria or that could not be excluded with certainty (n = 124). Fifty distinct, modified, or translated questionnaires measuring empowerment were identified in 74 articles. Each was rated in terms of reliability and validity. One questionnaire had good evidence of reliability and validity, four had moderate evidence, and 45 had limited or no evidence. Limited or no evidence for reliability and validity for many questionnaires could relate in part to lack of consensus on the theoretical definition of, and indicators for measuring, empowerment. We recommend that researchers use the questionnaire rated as having good evidence and that data on reliability and validity continue to be reported for other questionnaires.


2018 ◽  
Author(s):  
Alina Trifan ◽  
Maryse Oliveira ◽  
José Luís Oliveira

BACKGROUND Technological advancements, together with the decrease in both price and size of a large variety of sensors, have expanded the role and capabilities of regular mobile phones, turning them into powerful yet ubiquitous monitoring systems. At present, smartphones have the potential to continuously collect information about the users, monitor their activities and behaviors in real time, and provide them with feedback and recommendations. OBJECTIVE This systematic review aimed to identify recent scientific studies that explored the passive use of smartphones for generating health- and well-being–related outcomes. In addition, it explores users’ engagement and possible challenges in using such self-monitoring systems. METHODS A systematic review was conducted, following Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines, to identify recent publications that explore the use of smartphones as ubiquitous health monitoring systems. We ran reproducible search queries on PubMed, IEEE Xplore, ACM Digital Library, and Scopus online databases and aimed to find answers to the following questions: (1) What is the study focus of the selected papers? (2) What smartphone sensing technologies and data are used to gather health-related input? (3) How are the developed systems validated? and (4) What are the limitations and challenges when using such sensing systems? RESULTS Our bibliographic research returned 7404 unique publications. Of these, 118 met the predefined inclusion criteria, which considered publication dates from 2014 onward, English language, and relevance for the topic of this review. The selected papers highlight that smartphones are already being used in multiple health-related scenarios. Of those, physical activity (29.6%; 35/118) and mental health (27.9%; 33/118) are 2 of the most studied applications. Accelerometers (57.7%; 67/118) and global positioning systems (GPS; 40.6%; 48/118) are 2 of the most used sensors in smartphones for collecting data from which the health status or well-being of their users can be inferred. CONCLUSIONS One relevant outcome of this systematic review is that although smartphones present many advantages for the passive monitoring of users’ health and well-being, there is a lack of correlation between smartphone-generated outcomes and clinical knowledge. Moreover, user engagement and motivation are not always modeled as prerequisites, which directly affects user adherence and full validation of such systems.


2020 ◽  
Vol 8 (4) ◽  
pp. 120
Author(s):  
Reem Naaman ◽  
Azza A. El-Housseiny ◽  
Najlaa Alamoudi ◽  
Narmin Helal ◽  
Rahaf Sahhaf

This study aims to translate into Arabic a previously published English language questionnaire that assessed pain and discomfort after the extraction of primary teeth in children, and to evaluate its validity and reliability. All participating children (n = 120), aged 9 to 12 years, completed the 33-item Arabic version of the questionnaire after the extraction procedure had taken place. The questionnaire included three parts that were completed at three different times, namely, immediately, the first evening, and one week after the extraction procedure. Internal consistency, content validity, and criterion validity were assessed, and a factor analysis was performed. The results showed good internal consistency (Cronbach’s alpha = 0.83), acceptable criterion validity with a significantly strong correlation with the Visual Analog Scale (VAS), and satisfactory content validity (average content validity index [CVI] = 0.90). The final factor model comprised four factors with an eigenvalue greater than 1, explaining 70% of the common variance. The identified factors were labeled as follows: Factor 1: analgesic consumption; Factor 2: expression of discomfort from the extraction site; Factor 3: perception of masticatory capability; and Factor 4: pain/discomfort from the dental extraction procedure. Based on the results, a shorter form of the questionnaire had satisfactory psychometric characteristics and can be used with children within the selected age group.
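Internal consistency of the kind reported here is usually summarized with Cronbach's alpha, computed from the item variances and the variance of the total score. The sketch below is a minimal illustration; the function name `cronbach_alpha` and the response matrix are hypothetical, and it does not reproduce the study's analysis.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("each respondent needs the same number (>= 2) of items")
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical scores from three children on a three-item scale.
    scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(cronbach_alpha(scores))
```

In this toy data every item moves in lockstep across respondents, so alpha reaches its maximum; real questionnaire data such as the 0.83 reported above fall below that.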

