Reproducible Analysis Pipeline for Data Streams: Open-Source Software to Process Data Collected With Mobile Devices

2021 ◽  
Vol 3 ◽  
Author(s):  
Julio Vega ◽  
Meng Li ◽  
Kwesi Aguillera ◽  
Nikunj Goel ◽  
Echhit Joshi ◽  
...  

Smartphones and wearable devices are widely used in behavioral and clinical research to collect longitudinal data that, along with ground truth data, are used to create models of human behavior. Mobile sensing researchers often program data processing and analysis code from scratch even though many research teams collect data from similar mobile sensors, platforms, and devices. This leads to significant inefficiency because researchers cannot replicate and build on others' work, to inconsistent quality of code and results, and to a lack of transparency when code is not shared alongside publications. We provide an overview of Reproducible Analysis Pipeline for Data Streams (RAPIDS), a reproducible pipeline to standardize the preprocessing, feature extraction, analysis, visualization, and reporting of data streams coming from mobile sensors. RAPIDS consists of a set of R and Python scripts that are executed on top of reproducible virtual environments, orchestrated by a workflow management system, and organized following a consistent file structure for data science projects. We share open-source, documented, extensible, and tested code to preprocess, extract, and visualize behavioral features from data collected with any Android or iOS smartphone sensing app as well as Fitbit and Empatica wearable devices. RAPIDS allows researchers to process mobile sensor data in a rigorous and reproducible way. This saves time and effort during the data analysis phase of a project and facilitates sharing analysis workflows alongside publications.

2020 ◽  
Author(s):  
Julio Vega ◽  
Meng Li ◽  
Kwesi Aguillera ◽  
Nikunj Goel ◽  
Echhit Joshi ◽  
...  

BACKGROUND Smartphones and wearable devices are widely used in behavioral and clinical research to collect longitudinal data that, along with ground truth data, are used to create models of human behavior. Mobile sensing researchers often program analysis code from scratch even though many research teams collect data from similar mobile sensors, platforms, and devices. As a result, the quality of code varies and code is often not shared alongside publications. When it is shared, it might not be stored in a version control system, and most of the time there is no guarantee that the development environment can be replicated. This makes it difficult for other scientists to read, reuse, audit, and reproduce a publication’s code and its results.
OBJECTIVE We present RAPIDS, a reproducible pipeline to standardize the preprocessing, feature extraction, analysis, visualization, and reporting of data streams coming from mobile sensors.
METHODS RAPIDS consists of a set of R and Python scripts that are executed on top of reproducible virtual environments, orchestrated by Snakemake, and organized following the cookiecutter data science project structure. Its development has been and will be informed by public discussions with the mobile sensing research community.
RESULTS We share open-source, documented, extensible, and tested code to preprocess and extract behavioral features from data collected with the AWARE Framework on Android and iOS smartphones as well as Fitbit devices. We also provide a file structure and development environment that other researchers can follow to publish their own models, visualizations, and reports.
CONCLUSIONS RAPIDS allows researchers to process mobile sensor data in a rigorous and reproducible way. This saves time and effort during the data analysis phase of a project and makes it easier to share an analysis workflow alongside publications.


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 2944
Author(s):  
Benjamin James Ralph ◽  
Marcel Sorger ◽  
Benjamin Schödinger ◽  
Hans-Jörg Schmölzer ◽  
Karin Hartl ◽  
...  

Smart factories are an integral element of the manufacturing infrastructure in the context of the fourth industrial revolution. Nevertheless, the academic environment frequently lacks adequate training facilities for future engineering experts. For this reason, this paper describes the development and implementation of two different layer architectures for the metal processing environment. The first architecture is based on low-cost but resilient devices, allowing interested parties to work with mostly open-source interfaces and standard back-end programming environments. Additionally, one proprietary and two open-source graphical user interfaces (GUIs) were developed. These interfaces can be adapted on both the front end and the back end, ensuring a holistic comprehension of their capabilities and limits. As a result, a six-layer architecture, from digitization to an interactive project management tool, was designed and implemented in the practical workflow at the academic institution. To take the complexity of thermo-mechanical processing in the metal processing field into account, an alternative layer, connected with the thermo-mechanical treatment simulator Gleeble 3800, was designed. This framework is capable of transferring sensor data at high frequency, enabling data collection for the numerical simulation of complex material behavior under high-temperature processing. Finally, the possibility of connecting both systems by using open-source software packages is demonstrated.


Author(s):  
Wei Sun ◽  
Ethan Stoop ◽  
Scott S. Washburn

Florida’s interstate rest areas are heavily utilized by commercial trucks for overnight parking. Many of these rest areas regularly experience 100% utilization of available commercial truck parking spaces during the evening and early-morning hours. Being able to communicate the availability of commercial truck parking spaces to drivers before they arrive at a rest area would reduce unnecessary stops at full rest areas as well as driver anxiety. To do this, it is critical to implement a vehicle detection technology that correctly reflects the parking status of the rest area. The objective of this project was to evaluate three different wireless in-pavement vehicle detection technologies as applied to commercial truck parking at interstate rest areas. This paper mainly focuses on the following aspects: (a) accuracy of the vehicle detection in parking spaces, (b) installation, setup, and maintenance of the vehicle detection technology, and (c) truck parking trends at the rest area study site. The final project report includes a more detailed summary of the evaluation. The research team recorded video of the rest areas as the ground-truth data and developed a software tool to compare the video data with the parking sensor data. Two accuracy tests (event accuracy and occupancy accuracy) were conducted to evaluate each sensor’s ability to reflect the status of each parking space correctly. Overall, it was found that all three technologies performed well, with accuracy rates of 95% or better for both tests. This result suggests that, for implementation, pricing and/or maintenance issues may be more significant factors in the choice of technology.
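
The abstract does not define the two accuracy tests, so the following is only a minimal Python sketch of how such measures might be computed against video ground truth. It assumes that occupancy accuracy is the share of sampling instants at which the sensor state matches the ground truth, and that event accuracy is the share of ground-truth arrival or departure events matched by a sensor event within a time tolerance. The function names, the greedy matching, and the `tolerance` parameter are hypothetical and are not taken from the project's software tool.

```python
from typing import List, Sequence


def occupancy_accuracy(truth: Sequence[bool], sensor: Sequence[bool]) -> float:
    """Share of sampling instants at which the sensor's occupied/vacant
    state agrees with the video ground truth (assumed definition)."""
    if len(truth) != len(sensor) or not truth:
        raise ValueError("need two non-empty series of equal length")
    return sum(t == s for t, s in zip(truth, sensor)) / len(truth)


def event_accuracy(
    truth_events: List[float], sensor_events: List[float], tolerance: float
) -> float:
    """Share of ground-truth events (times in seconds) matched by a distinct
    sensor event within `tolerance` seconds (assumed definition)."""
    if not truth_events:
        raise ValueError("need at least one ground-truth event")
    unmatched = sorted(sensor_events)
    hits = 0
    for t in sorted(truth_events):
        # Greedy matching: take the earliest unused sensor event close enough.
        match = next((s for s in unmatched if abs(s - t) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    return hits / len(truth_events)
```

Under these assumptions, a space whose state is sampled once per minute would be scored by passing the two state series to `occupancy_accuracy`, while arrival and departure timestamps would go to `event_accuracy` with a tolerance chosen to absorb sensor latency.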


2018 ◽  
Vol 14 (11) ◽  
pp. 155014771881130 ◽  
Author(s):  
Jaanus Kaugerand ◽  
Johannes Ehala ◽  
Leo Mõtus ◽  
Jürgo-Sören Preden

This article introduces a time-selective strategy for enhancing the temporal consistency of input data for multi-sensor data fusion during in-network data processing in ad hoc wireless sensor networks. Detecting and handling complex time-variable (real-time) situations requires methodical consideration of temporal aspects, especially in ad hoc wireless sensor networks with distributed asynchronous and autonomous nodes. Such aspects include, for example, assigning processing intervals of network nodes, defining validity and simultaneity requirements for data items, and determining the size of memory required for buffering the data streams produced by ad hoc nodes. The data streams produced periodically and sometimes intermittently by sensor nodes arrive at the fusion nodes with variable delays, which results in a sporadic temporal order of inputs. Using data from individual nodes in the order of arrival (i.e., freshest data first) does not, in all cases, yield optimal results in terms of temporal consistency and fusion accuracy. We propose a time-selective data fusion strategy, which combines temporal alignment, temporal constraints, and a method for computing the delay of sensor readings, to allow the fusion node to select temporally compatible data from the received streams. A real-world experiment (moving vehicles in an urban environment) validating the strategy demonstrates a significant improvement in the accuracy of fusion results.
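
The selection step described in this abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: it assumes each buffered reading already carries an estimated sampling time on the fusion node's clock, and it interprets the validity constraint as a maximum age at fusion time and the simultaneity constraint as a maximum spread between the selected readings. The names `Reading`, `select_compatible`, `validity`, and `simultaneity` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Reading:
    node: str          # id of the producing sensor node
    sampled_at: float  # estimated sampling time on the fusion node's clock (s)
    value: float


def select_compatible(
    buffers: Dict[str, List[Reading]],
    fusion_time: float,
    validity: float,
    simultaneity: float,
) -> Optional[Dict[str, Reading]]:
    """Pick one reading per node so that every pick is at most `validity`
    seconds old at `fusion_time` and all picks lie within `simultaneity`
    seconds of each other. Returns None if no such set exists."""
    # Discard readings that are too old (or from the future) at fusion time.
    valid = {
        node: [r for r in rs if 0.0 <= fusion_time - r.sampled_at <= validity]
        for node, rs in buffers.items()
    }
    if not valid or any(not rs for rs in valid.values()):
        return None
    # Try every sampling time as the start of a simultaneity window,
    # newest first, so the most recent compatible set wins.
    starts = sorted({r.sampled_at for rs in valid.values() for r in rs}, reverse=True)
    for start in starts:
        chosen: Dict[str, Reading] = {}
        for node, rs in valid.items():
            in_window = [r for r in rs if start <= r.sampled_at <= start + simultaneity]
            if not in_window:
                break  # this node has nothing in the window; try an older one
            chosen[node] = max(in_window, key=lambda r: r.sampled_at)
        else:
            return chosen
    return None
```

With one node buffering readings sampled at 9.0 s and 10.0 s and a delayed node whose newest reading was sampled at 8.9 s, a call with `fusion_time=10.5`, `validity=2.0`, and `simultaneity=0.2` pairs the 9.0 s reading with the 8.9 s reading rather than using the freshest 10.0 s reading, which is the behavior the abstract contrasts with freshest-data-first fusion.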


2014 ◽  
pp. 291-321 ◽  
Author(s):  
Stephen Voida ◽  
Donald J. Patterson ◽  
Shwetak N. Patel