Building FAIR functionality: Annotating event-related imaging data using Hierarchical Event Descriptors (HED)

2020 ◽  
Author(s):  
Kay Robbins ◽  
Dung Truong ◽  
Alexander Jones ◽  
Ian Callanan ◽  
Scott Makeig

In fields such as human electrophysiology, high-precision time series data are often acquired in complex, event-rich environments so that complex dynamic data features can be interpreted in the context of session events. However, a substantial gap exists between the level of event description required by current digital research archive standards and the level of annotation required for successful meta-analysis or mega-analysis of event-related data across studies, systems, and laboratories. Manifold challenges, most prominently ontological clarity and extensibility, tool availability, and ease of use, must be addressed to allow and promote sharing of data with an effective level of descriptive detail for labeled events. Motivating data authors to perform the work needed to adequately annotate their data is a key challenge. This paper describes the near decade-long development of the Hierarchical Event Descriptor (HED) system for addressing these issues. We discuss the evolution of HED, the lessons we have learned, the current status of the HED vocabulary and tools, some generally applicable design principles for annotation framework development, and a roadmap for future development. We believe that without consistent, sufficiently detailed, and field-relevant annotation of the nature of each recorded event, the potential value of data sharing and large-scale analysis in the behavioral and brain imaging sciences will not be realized.

2021 ◽  
Author(s):  
Kay Robbins ◽  
Dung Truong ◽  
Alexander Jones ◽  
Ian Callanan ◽  
Scott Makeig

Human electrophysiological and related time series data are often acquired in complex, event-rich environments. However, the resulting recorded brain or other dynamics are often interpreted in relation to more sparsely recorded or subsequently noted events. Currently, a substantial gap exists between the level of event description required by current digital data archiving standards and the level of annotation required for successful analysis of event-related data across studies, environments, and laboratories. Manifold challenges must be addressed, most prominently ontological clarity, vocabulary extensibility, annotation tool availability, and overall usability, to allow and promote sharing of data with an effective level of descriptive detail for labeled events. Motivating data authors to perform the work needed to adequately annotate their data is a key challenge. This paper describes new developments in the Hierarchical Event Descriptor (HED) system for addressing these issues. We recap the evolution of HED and its acceptance by the Brain Imaging Data Structure (BIDS) movement, describe the recent release of HED-3G, a third-generation HED tools and design framework, and discuss directions for future development. Given consistent, sufficiently detailed, tool-enabled, field-relevant annotation of the nature of recorded events, prospects are bright for large-scale analysis and modeling of aggregated time series data, both in the behavioral and brain imaging sciences and beyond.
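As a concrete illustration, a HED annotation is a comma-separated set of hierarchical tags, with parentheses grouping tags that describe the same entity. The lines below are in the style of the HED-3G vocabulary; the exact terms are illustrative rather than quoted from the schema:

```
Sensory-event, Visual-presentation, (Red, Square)
Agent-action, (Press, Mouse-button)
```

The first line might describe the onset of a red square on screen; the second, a participant's button press.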


2021 ◽  
Author(s):  
Kay A. Robbins ◽  
Dung Truong ◽  
Stefan Appelhoff ◽  
Arnaud Delorme ◽  
Scott Makeig

Because of the central role that event-related data analysis plays in EEG and MEG (MEEG) experiments, choices about which events to report and how fully to annotate their natures can significantly influence the reliability, reproducibility, and value of MEEG datasets for further analysis. Current, more powerful annotation strategies combine robust event description with details of experiment design and metadata in a human-readable as well as machine-actionable form, making event annotation relevant to the full range of neuroimaging and other time series data. This paper dissects the event design and annotation process using as a case study the well-known multi-subject, multimodal dataset of Wakeman and Henson (openneuro.org, ds000117), shared by its authors using Brain Imaging Data Structure (BIDS) formatting (bids.neuroimaging.io). We propose a set of best practices and guidelines for event handling in MEEG research, examine the impact of various design decisions, and provide a working template for organizing events in MEEG and other neuroimaging data. We demonstrate how annotations using the new third-generation formulation of the Hierarchical Event Descriptors (HED-3G) framework and tools (hedtags.org) can document events occurring during neuroimaging experiments and their interrelationships, providing machine-actionable annotation that enables automated within- and across-study comparison and analysis, and we point to a more complete BIDS-formatted, HED-3G-annotated edition of the MEEG portion of the Wakeman and Henson dataset (OpenNeuro ds003645).
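For orientation, BIDS records events in a tab-separated events.tsv file, one row per event, and a HED column can carry the annotation for each row. The fragment below is a hypothetical sketch with invented values, not an excerpt from ds000117 or ds003645:

```
onset	duration	trial_type	HED
3.45	0.50	famous_face	Sensory-event, Visual-presentation, (Face, Famous)
4.12	0.20	button_press	Agent-action, Participant-response, (Press, Mouse-button)
```

Keeping experiment-design details in such machine-actionable columns, rather than only in free-text task descriptions, is what makes automated across-study comparison possible.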


2016 ◽  
Author(s):  
Gang Wu ◽  
Ron C Anafi ◽  
Michael E Hughes ◽  
Karl Kornacker ◽  
John B Hogenesch

Summary: Detecting periodicity in large-scale data remains a challenge. Different algorithms offer strengths and weaknesses in statistical power, sensitivity to outliers, ease of use, and sampling requirements. While efforts have been made to identify best-of-breed algorithms, relatively little research has gone into integrating these methods into a generalizable framework. Here we present MetaCycle, an R package that incorporates ARSER, JTK_CYCLE, and Lomb-Scargle to conveniently evaluate periodicity in time-series data.

Availability and implementation: The MetaCycle package is available on the CRAN repository (https://cran.r-project.org/web/packages/MetaCycle/index.html) and GitHub (https://github.com/gangwug/MetaCycle).

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
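MetaCycle itself is an R package, so the snippet below is not its API. Purely as a language-neutral illustration of the kind of spectral periodicity detection such methods perform, here is a minimal Python sketch using a plain FFT rather than ARSER, JTK_CYCLE, or Lomb-Scargle:

```python
import numpy as np

def dominant_period(y, dt):
    """Return the period of the strongest non-DC frequency component
    of an evenly sampled series y with sampling interval dt."""
    spectrum = np.abs(np.fft.rfft(y - np.mean(y)))
    freqs = np.fft.rfftfreq(len(y), d=dt)   # cycles per unit time
    k = np.argmax(spectrum[1:]) + 1         # skip the DC bin
    return 1.0 / freqs[k]

# Synthetic circadian-like series: a 24 h rhythm sampled every 2 h for 48 h,
# plus a weaker non-circadian component
t = np.arange(0, 48, 2)
y = np.cos(2 * np.pi * t / 24) + 0.1 * np.sin(2 * np.pi * t / 11)
print(round(dominant_period(y, dt=2.0)))  # → 24
```

Real rhythm-detection tools add significance testing, phase and amplitude estimation, and tolerance for uneven sampling, which is exactly why combining specialized algorithms is attractive.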


1980 ◽  
Vol 45 (2) ◽  
pp. 246-267 ◽  
Author(s):  
Robert L. Hamblin ◽  
Brian L. Pitcher

Several lines of archaeological evidence are presented in this paper to suggest the existence of class warfare among the Classic Maya and of issues that historically have been associated with class conflict. This evidence indicates that class warfare may have halted the rule of the monument-producing, or Classic, elites and precipitated the depopulation of the lowland area. The theory is evaluated quantitatively by testing for time-related mathematical patterns that have been found to characterize large-scale conflicts in historical societies. The evaluation uses time series data on the duration of rule by Classic elites, as inferred from the production of monuments with Long Count dates at a sample of 82 ceremonial centers. The analyses confirm that the Maya data do exhibit the temporal and geographical patterns predicted by the class conflict explanation of the Classic Maya collapse. Alternative predictions from other theories are considered but are generally not supported by these data.


2021 ◽  
Author(s):  
Sadnan Al Manir ◽  
Justin Niestroy ◽  
Maxwell Adam Levinson ◽  
Timothy Clark

Introduction: Transparency of computation is a requirement for assessing the validity of computed results and research claims based upon them, and it is essential for access to, assessment of, and reuse of computational components. These components may be subject to methodological or other challenges over time. While reference to archived software and/or data is increasingly common in publications, a single machine-interpretable, integrative representation of how results were derived, one that supports defeasible reasoning, has been absent.

Methods: We developed the Evidence Graph Ontology, EVI, in OWL 2, with a set of inference rules, to provide deep representations of supporting and challenging evidence for computations, services, software, data, and results, across arbitrarily deep networks of computations, in connected or fully distinct processes. EVI integrates FAIR practices on data and software with important concepts from provenance models and argumentation theory. It extends PROV for additional expressiveness, with support for defeasible reasoning. EVI treats any computational result or component of evidence as a defeasible assertion, supported by a directed acyclic graph (DAG) of the computations, software, data, and agents that produced it.

Results: We have successfully deployed EVI for very-large-scale predictive analytics on clinical time-series data. Every result may reference its own evidence graph as metadata, which can be extended when subsequent computations are executed.

Discussion: Evidence graphs support transparency and defeasible reasoning about results. They are first-class computational objects and reference the datasets and software from which they are derived. They support fully transparent computation, with challenge and support propagation. The EVI approach may be extended to include instruments, animal models, and critical experimental reagents.
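The EVI ontology itself is expressed in OWL 2 with inference rules and cannot be reduced to a few lines. Purely as a toy sketch of the challenge-propagation idea (node names invented for illustration), the following Python fragment treats a result as holding, defeasibly, only while no node in its supporting DAG has been challenged:

```python
# Toy evidence graph: each node lists the evidence nodes supporting it.
supports = {
    "result":      ["computation"],
    "computation": ["software", "input_data"],
    "software":    [],
    "input_data":  [],
}

def is_unchallenged(node, challenged, graph):
    """A node holds (defeasibly) only if neither it nor any of its
    transitive supporting evidence has been challenged."""
    if node in challenged:
        return False
    return all(is_unchallenged(s, challenged, graph) for s in graph[node])

print(is_unchallenged("result", set(), supports))           # → True
print(is_unchallenged("result", {"input_data"}, supports))  # → False
```

A later challenge to any dataset or software version thus automatically withdraws support from every downstream result, which is the behavior a defeasible-reasoning representation is meant to capture.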


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jing Zhao ◽  
Shubo Liu ◽  
Xingxing Xiong ◽  
Zhaohui Cai

Privacy protection is one of the major obstacles to data sharing. Time-series data have the characteristics of autocorrelation, continuity, and large scale. Current research on time-series data publication largely ignores the correlation within time-series data, leaving published data insufficiently protected. In this paper, we study the problem of correlated time-series data publication and propose a sliding-window-based autocorrelated time-series data publication algorithm, called SW-ATS. Instead of using the global sensitivity of traditional differential privacy mechanisms, we propose periodic sensitivity to provide a stronger degree of privacy guarantee. SW-ATS introduces a sliding window mechanism, with the correlation between the noise-added sequence and the original time-series data guaranteed by sequence indistinguishability, to protect the privacy of the latest data. We prove that SW-ATS satisfies ε-differential privacy. Compared with the state-of-the-art algorithm, SW-ATS reduces the mean absolute error (MAE) by about 25%, improving the utility of the data while providing stronger privacy protection.
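SW-ATS's periodic sensitivity and sequence-indistinguishability machinery are specific to the paper and not reproduced here, but the noise-adding step it builds on is the standard Laplace mechanism of ε-differential privacy: each value receives Laplace noise with scale sensitivity/ε. A minimal sketch, with window values and parameters invented for illustration:

```python
import numpy as np

def laplace_perturb(window, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity/epsilon to each value in a
    sliding window -- the core noise-adding step of an epsilon-DP mechanism."""
    scale = sensitivity / epsilon
    return window + rng.laplace(loc=0.0, scale=scale, size=len(window))

rng = np.random.default_rng(0)
window = np.array([12.0, 14.5, 13.2, 15.1, 14.8])   # latest readings
private = laplace_perturb(window, sensitivity=1.0, epsilon=0.5, rng=rng)
print(private)  # perturbed window, safe to publish under the stated epsilon
```

Smaller ε means larger noise and stronger privacy; SW-ATS's contribution is choosing the sensitivity (periodic rather than global) so that less noise suffices for correlated series.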


Sensor Review ◽  
2019 ◽  
Vol 39 (2) ◽  
pp. 208-217 ◽  
Author(s):  
Jinghan Du ◽  
Haiyan Chen ◽  
Weining Zhang

Purpose: In large-scale monitoring systems, sensors deployed in different locations collect massive amounts of useful time-series data, which can support real-time data analytics and related applications. However, because the hardware devices themselves are failure-prone, sensor nodes often stop working, so the collected data are commonly incomplete. The purpose of this study is to predict and recover the missing data in sensor networks.

Design/methodology/approach: Considering the spatio-temporal correlation of large-scale sensor data, this paper proposes a data recovery model for sensor networks based on a deep learning method, the deep belief network (DBN). Specifically, when one sensor fails, its own historical time-series data are collected together with real-time data from the surrounding sensor nodes that the proposed similarity filter identifies as most similar to the failed node. Then, a high-level feature representation of these spatio-temporally correlated data is extracted by the DBN. Moreover, a reconstruction-error-based algorithm is proposed to determine the structure of the DBN model. Finally, the missing data are predicted from these features by a single-layer neural network.

Findings: This paper collects a noise data set from an airport monitoring system for experiments. Various comparative experiments show that the proposed algorithms are effective. The proposed data recovery model is compared with several classical models, and the experimental results show that the deep learning-based model achieves not only better prediction accuracy but also better training time and model robustness.

Originality/value: A deep learning method is investigated for the data recovery task and proves effective compared with previous methods. This may provide practical experience in applying deep learning methods.
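The paper's DBN and similarity filter are not reproduced here. As a toy sketch of the neighbor-selection idea only (function and data invented for illustration), one might rank surrounding sensors by correlation between their recent readings and the failed sensor's history, then feed the top-k series to the recovery model:

```python
import numpy as np

def top_similar_sensors(failed_history, neighbor_histories, k=2):
    """Rank neighbor sensors by absolute Pearson correlation with the
    failed sensor's historical series; return the k most similar indices."""
    sims = [abs(np.corrcoef(failed_history, h)[0, 1]) for h in neighbor_histories]
    return list(np.argsort(sims)[::-1][:k])

failed = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
neighbors = [
    np.array([2.0, 4.1, 5.9, 8.2, 10.0]),  # strongly correlated
    np.array([5.0, 1.0, 4.0, 2.0, 3.0]),   # weakly correlated
    np.array([1.1, 2.2, 2.9, 4.2, 5.1]),   # strongly correlated
]
print(top_similar_sensors(failed, neighbors, k=2))  # indices of the two most similar
```

The actual system replaces this simple filter's downstream step with DBN feature extraction, which can exploit non-linear spatio-temporal structure that plain correlation misses.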


Algorithms ◽  
2020 ◽  
Vol 13 (4) ◽  
pp. 95 ◽  
Author(s):  
Johannes Stübinger ◽  
Katharina Adler

This paper develops the generalized causality algorithm and applies it to a multitude of data from the fields of economics and finance. Specifically, our parameter-free algorithm efficiently determines the optimal non-linear mapping and identifies varying lead–lag effects between two given time series. This procedure allows an elastic adjustment of the time axis to find similar but phase-shifted sequences; structural breaks in their relationship are also captured. A large-scale simulation study validates its outperformance in the vast majority of parameter constellations in terms of efficiency, robustness, and feasibility. Finally, the presented methodology is applied to real data from the areas of macroeconomics, finance, and metals. The highest similarity is shown by the pairs gross domestic product and consumer price index (macroeconomics), the S&P 500 index and the Deutscher Aktienindex (finance), and gold and silver (metals). In addition, the algorithm makes full use of its flexibility, identifying both various structural breaks and regime patterns over time, which are (partly) well documented in the literature.
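The "elastic adjustment of the time axis" is in the spirit of dynamic time warping (DTW); the abstract does not state that the authors' parameter-free procedure is exactly DTW, so the following is a generic DTW sketch rather than the generalized causality algorithm itself. It aligns two series by the minimum-cost monotone warping path, tolerating lead–lag shifts:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance: minimum-cost elastic
    alignment of two series via dynamic programming."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

lead = [0.0, 1.0, 2.0, 1.0, 0.0]
lag  = [0.0, 0.0, 1.0, 2.0, 1.0]   # same shape, shifted one step later
print(dtw_distance(lead, lag))     # → 1.0 (far below the unwarped pointwise cost of 4.0)
```

A plain pointwise comparison would penalize the one-step lag heavily, whereas the elastic alignment absorbs it, which is why warping-style methods suit lead–lag detection between, say, gold and silver prices.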

