A brief introduction to the analysis of time-series data from biologging studies

2021 ◽  
Vol 376 (1831) ◽  
pp. 20200227
Author(s):  
Xavier A. Harrison

Recent advances in tagging and biologging technology have yielded unprecedented insights into wild animal physiology. However, time-series data from such wild tracking studies present numerous analytical challenges owing to their unique nature, often exhibiting strong autocorrelation within and among samples, low sample sizes and complicated random-effect structures. Gleaning robust quantitative estimates from these physiological data, and, therefore, accurate insights into the life histories of the animals they pertain to, requires careful and thoughtful application of existing statistical tools. Using a combination of both simulated and real datasets, I highlight the key pitfalls associated with analysing physiological data from wild monitoring studies, and investigate issues of optimal study design, statistical power, and model precision and accuracy. I also recommend best-practice approaches for dealing with their inherent limitations. This work will provide a concise, accessible roadmap for researchers looking to maximize the yield of information from complex and hard-won biologging datasets. This article is part of the theme issue ‘Measuring physiology in free-living animals (Part II)’.
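
The pitfall the abstract alludes to can be made concrete with a small simulation. The sketch below (Python, all parameter values hypothetical) generates autocorrelated physiological traces for several tagged individuals and fits a random-intercept mixed model with statsmodels; note that the model still ignores the residual autocorrelation, which is exactly the kind of issue the paper warns will make standard errors optimistic.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

def ar1_series(n, phi=0.8, sigma=1.0):
    """Simulate an AR(1) series; a strong phi mimics within-individual autocorrelation."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0, sigma)
    return x

# Simulated heart-rate-like traces for 10 tagged individuals (invented data).
records = []
for ind in range(10):
    baseline = rng.normal(60, 5)        # individual-level random intercept
    hr = baseline + ar1_series(100)     # autocorrelated residuals around it
    records += [{"id": ind, "t": t, "hr": h} for t, h in enumerate(hr)]
df = pd.DataFrame(records)

# Random-intercept model: handles among-individual variation, but the
# within-individual autocorrelation is still unmodelled.
model = smf.mixedlm("hr ~ t", df, groups=df["id"]).fit()
print(model.summary())
```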

2017 ◽  
Author(s):  
Easton R White

Long-term time series are necessary to better understand population dynamics, assess species' conservation status, and make management decisions. However, population data are often expensive to collect, requiring considerable time and resources. When is a population time series long enough to address a question of interest? We determine the minimum time series length required to detect significant increases or decreases in population abundance. To address this question, we use simulation methods and examine 878 populations of vertebrate species. Here we show that 15-20 years of continuous monitoring are required to achieve a high level of statistical power. For both the simulations and the time series data, the minimum time required depends on trend strength, population variability, and temporal autocorrelation. These results point to the importance of sampling populations over long periods of time. We argue that statistical power needs to be considered in monitoring program design and evaluation. Time series shorter than 15-20 years are likely underpowered and potentially misleading.
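
A minimal sketch of the kind of power analysis the abstract describes, assuming illustrative parameter values (a 3% annual decline, lognormal process noise, mild temporal autocorrelation). For each monitoring duration it estimates the fraction of simulated populations whose log-linear trend is detected at alpha = 0.05; this is a stand-in, not the authors' exact simulation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def detection_power(n_years, trend=-0.03, sigma=0.2, phi=0.3,
                    n_sim=2000, alpha=0.05):
    """Fraction of simulated series whose log-linear trend is significant."""
    hits = 0
    for _ in range(n_sim):
        eps = np.zeros(n_years)
        for t in range(1, n_years):            # AR(1) process noise
            eps[t] = phi * eps[t - 1] + rng.normal(0, sigma)
        log_n = np.log(1000) + trend * np.arange(n_years) + eps
        res = stats.linregress(np.arange(n_years), log_n)
        hits += res.pvalue < alpha
    return hits / n_sim

# Power rises steeply with series length; short series are underpowered.
for years in (5, 10, 15, 20, 25):
    print(years, "years:", round(detection_power(years), 2))
```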


2003 ◽  
Vol 4 (1) ◽  
pp. 59-74
Author(s):  
Telisa Aulia Falianty

Econometric models have played an increasingly important role in empirical analysis in economics. This paper provides an overview of some advanced econometric methods that are increasingly used in empirical studies. A panel dataset combines features of both time series and cross-section data. Because of the increasing availability of panel data in the economic sciences, panel data regression models are being used by more and more researchers. Related to the panel data model, methods discussed here include fixed effects and random effects. A new approach to panel data developed by Im, Shin, and Pesaran (2002) for testing unit roots in heterogeneous panels is also included in this overview. When working with time series data, there are many problems to handle, chief among them unit root testing, cointegration among non-stationary variables, and autoregressive conditional heteroscedasticity. Given these problems, the author also reviews the ADF and Phillips-Perron tests. An approach to cointegration analysis developed by Pesaran (1999), as well as the ARCH and GARCH models, is also discussed. Bayesian econometrics, which is less well known than classical econometrics, is included in this overview. The genetic algorithm, a relatively new method in econometrics, has been increasingly employed to model the behavior of economic agents in macroeconomic models. The genetic algorithm is based on the process of Darwin’s theory of evolution: by starting with a set of potential solutions and changing them over several iterations, the genetic algorithm aims to converge on the most ‘fit’ solutions.
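
As a hedged illustration of the unit-root testing discussed above, the sketch below applies the augmented Dickey-Fuller test from statsmodels to two simulated series: a random walk (which has a unit root and should not be rejected as non-stationary) and a stationary AR(1) process. The data and parameters are invented for the demo.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
n = 500
shocks = rng.normal(size=n)

random_walk = np.cumsum(shocks)      # unit root: ADF should fail to reject
stationary = np.empty(n)             # AR(1) with |phi| < 1: ADF should reject
stationary[0] = 0.0
for t in range(1, n):
    stationary[t] = 0.5 * stationary[t - 1] + shocks[t]

for name, series in [("random walk", random_walk), ("AR(1)", stationary)]:
    stat, pvalue, *_ = adfuller(series)
    print(f"{name}: ADF stat = {stat:.2f}, p = {pvalue:.3f}")
```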


2019 ◽  
Author(s):  
Catherine Inibhunu ◽  
Carolyn McGregor

BACKGROUND High-frequency data collected from monitors and sensors that measure patients’ vital status in neonatal intensive care units (NICUs) have the potential to provide valuable insights, which can be crucial when making critical decisions for the care of premature and ill term infants. However, this exercise is not trivial when faced with huge volumes of data captured every second at the bedside or home. The ability to collect, analyze and understand any hidden relationships in the data that may be vital for clinical decision making is a central challenge. OBJECTIVE The main goal of this research is to develop a method to detect and represent relationships that may exist in temporal abstractions (TA) and temporal patterns (TP) derived from time-oriented data. The premise of this research is that, in clinical care, the discovery of unknown relationships among physiological time-oriented data can lead to detection of the onset of conditions, aid in classifying abnormal or normal behaviors, or derive patterns of an altered trajectory towards a problematic future state for a patient. That is, there is great potential to use this approach to uncover previously unknown pathophysiologies present in high-speed physiological data. METHODS This research introduces a TPR process and an associated TPRMine algorithm, which adopts a stepwise approach to temporal pattern discovery by first applying a scaled mathematical formulation of the time series data. This is achieved by modelling the problem space as a finite state machine representation in which, for a given timeframe, a time series data segment transitions from one state to another based on probabilistic weights, and then quantifying the many paths a time series may take. RESULTS The TPRMine algorithm has been designed, implemented and applied to patient physiological data streams captured from the McMaster Children’s Hospital NICU. The algorithm has been applied to understand the number of states a patient in a NICU bed can transition to in a given time period and to demonstrate the formulation of hypothesis tests. In addition, a quantification of these states is completed, leading to the creation of a vital score. With this, it is possible to understand the percentage of time a patient remains at a high or low vital score. CONCLUSIONS The developed method allows understanding of the number of states a patient may transition to in any given time period. Adding clinical context to the identified states facilitates state quantification, allowing formulation of thresholds that lead to generated patient scores. This approach can be used to identify patients at risk of a clinical condition before the disease progresses. Additionally, the developed method facilitates identification of frequent patterns that could be associated with the generated thresholds.
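
The following is not the authors' TPRMine implementation, just a minimal sketch of the core idea as the abstract describes it: discretize a vital-sign stream into states and estimate the probability of transitioning between them. All names, thresholds and data are hypothetical.

```python
import numpy as np

def transition_matrix(series, bins):
    """Discretize a vital-sign stream into states and estimate the
    probability of moving from each state to the next."""
    states = np.digitize(series, bins)       # map readings to state labels
    n_states = len(bins) + 1
    counts = np.zeros((n_states, n_states))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical heart-rate stream; bins split it into low/normal/high states.
rng = np.random.default_rng(7)
hr = 140 + np.cumsum(rng.normal(0, 1.5, size=600))   # drifting NICU-like trace
P = transition_matrix(hr, bins=[130, 160])
print(np.round(P, 2))   # row i: probabilities of the next state given state i
```

The time a patient spends in each state (here, each row's long-run occupancy) is what a clinician-informed "vital score" could then be built on.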


2020 ◽  
Author(s):  
Sebastian Seibold ◽  
Torsten Hothorn ◽  
Martin M. Gossner ◽  
Nadja K. Simons ◽  
Nico Blüthgen ◽  
...  

Reports of major losses in biodiversity have stimulated an increasing interest in temporal population changes, particularly in insects, which had received little attention in the past. Existing long-term datasets are often limited to a small number of study sites, few points in time, a narrow range of land-use intensities and only some taxonomic groups, or they lack standardized sampling. While new multi-site monitoring programs have been initiated, most of them still cover rather short time periods. Daskalova et al. 2020 [1] argue that temporal trends of insect populations derived from short time series are biased towards extreme trends, while their own analysis of an assembly of shorter- and longer-term time series does not support an overall insect decline. With respect to the results of Seibold et al. [2], based on a 10-year multi-site time series, they claim that the analysis suffers from not accounting for temporal pseudoreplication. In this note, we explain why the criticism of missing statistical rigour in the analysis of Seibold et al. [2] is not warranted. Models that include ‘year’ as a random effect, as suggested by Daskalova et al. 2020, fail to detect non-linear trends and assume that consecutive years are independent samples, which is questionable for insect time-series data. We agree with Daskalova et al. 2020 that the assembly and analysis of larger datasets is urgently needed, but it will take time until such datasets are available. Thus, short-term datasets like ours are highly valuable; they should be extended and analysed continually to provide a more detailed understanding of how insect populations are changing under the influence of global change, and to trigger immediate conservation actions.
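
The statistical point, that treating 'year' as a random effect absorbs a directional trend into year-level deviations, can be illustrated with simulated data. This is a hedged sketch using statsmodels, not the analysis of either paper; all numbers are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
rows = []
for site in range(20):
    site_eff = rng.normal(0, 0.3)
    for y in range(10):
        # steady 5%-per-year decline in log biomass, plus site and noise terms
        rows.append({"site": site, "year": y,
                     "log_biomass": 5 - 0.05 * y + site_eff + rng.normal(0, 0.2)})
df = pd.DataFrame(rows)

# Continuous year: the decline is recovered as a slope estimate.
trend = smf.mixedlm("log_biomass ~ year", df, groups=df["site"]).fit()
print("slope:", round(trend.params["year"], 3))   # approx -0.05

# Year as a random effect (variance component): the trend is absorbed into
# year-level deviations, leaving only a variance with no sign or direction.
vc = smf.mixedlm("log_biomass ~ 1", df, groups=df["site"],
                 vc_formula={"year": "0 + C(year)"}).fit()
print("year variance component:", vc.vcomp)
```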


10.2196/27098 ◽  
2021 ◽  
Vol 23 (9) ◽  
pp. e27098
Author(s):  
Yi-Shiuan Liu ◽  
Chih-Yu Yang ◽  
Ping-Fang Chiu ◽  
Hui-Chu Lin ◽  
Chung-Chuan Lo ◽  
...  

Background Hemodialysis (HD) therapy is an indispensable tool used in critical care management. Patients undergoing HD are at risk for intradialytic adverse events, ranging from muscle cramps to cardiac arrest. So far, there is no effective HD device–integrated algorithm to help medical staff respond to these adverse events earlier during HD. Objective We aimed to develop machine learning algorithms to predict intradialytic adverse events in an unbiased manner. Methods Three-month dialysis and physiological time-series data were collected from all patients who underwent maintenance HD therapy at a tertiary care referral center. Dialysis data were collected automatically by HD devices, and physiological data were recorded by medical staff. Intradialytic adverse events were documented by medical staff according to patient complaints. Features extracted from the time series data sets by linear and differential analyses were used for machine learning to predict adverse events during HD. Results Time series dialysis data were collected during the 4-hour HD sessions of 108 patients who underwent maintenance HD therapy. There were a total of 4221 HD sessions, 406 of which involved at least one intradialytic adverse event. Models were built by classification algorithms and evaluated by four-fold cross-validation. The developed algorithm predicted overall intradialytic adverse events, with an area under the curve (AUC) of 0.83, sensitivity of 0.53, and specificity of 0.96. The algorithm also predicted muscle cramps, with an AUC of 0.85, and blood pressure elevation, with an AUC of 0.93. In addition, the model built on ultrafiltration-unrelated features predicted all types of adverse events with an AUC of 0.81, indicating that ultrafiltration-unrelated factors also contribute to the onset of adverse events. Conclusions Our results demonstrate that algorithms combining linear and differential analyses with two-class classification machine learning can predict intradialytic adverse events in quasi-real time with high AUCs. Such a methodology, implemented with local cloud computation and real-time optimization using personalized HD data, could warn clinicians in advance so they can take timely action.
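
The evaluation described is standard two-class classification with four-fold cross-validated AUC. The sketch below reproduces that pipeline shape in scikit-learn on synthetic data matching the reported class balance (406 adverse-event sessions out of 4221); the abstract does not name the classifier, so gradient boosting stands in as an assumption, and the per-session features are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)

# Hypothetical per-session features, e.g. mean blood pressure, its slope
# (a "differential" feature), ultrafiltration rate, heart-rate variability.
n_sessions = 4221
X = rng.normal(size=(n_sessions, 4))

# Imbalanced labels mimicking 406 adverse-event sessions out of 4221.
y = np.zeros(n_sessions, dtype=int)
event_idx = rng.choice(n_sessions, size=406, replace=False)
y[event_idx] = 1
X[event_idx] += 0.8    # inject signal so the demo is non-trivial

clf = GradientBoostingClassifier(random_state=0)
aucs = cross_val_score(clf, X, y, cv=4, scoring="roc_auc")  # four-fold CV
print("AUC per fold:", np.round(aucs, 2), "mean:", round(aucs.mean(), 2))
```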


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yiming Tian ◽  
Takuya Maekawa ◽  
Joseph Korpela ◽  
Daichi Amagata ◽  
Takahiro Hara ◽  
...  

Abstract Background Recent advances in sensing technologies have enabled us to attach small loggers to animals in their natural habitat, allowing measurement of the animals’ behavior, along with associated environmental and physiological data, to unravel the adaptive significance of the behavior. However, because animal-borne loggers can now record multi-dimensional (here defined as multimodal) time series information from a variety of sensors, it is becoming increasingly difficult to identify biologically important patterns hidden in the high-dimensional long-term data. In particular, it is important to identify co-occurrences of several behavioral modes recorded by different sensors in order to understand an internal hidden state of an animal, because the observed behavioral modes reflect that hidden state. This study proposed a method for automatically detecting co-occurrences of behavioral modes that differ between two groups (e.g., males vs. females) from multimodal time-series sensor data. The proposed method first extracts behavioral modes from time-series data (e.g., resting and cruising modes in GPS trajectories, or relaxed and stressed modes in heart rates) and then identifies pairs of behavioral modes that frequently co-occur (e.g., co-occurrence of the cruising mode and the relaxed mode). Finally, behavioral modes that differ between the two groups in terms of the frequency of co-occurrence are identified. Results We demonstrated the effectiveness of our method on animal-locomotion data collected from male and female Streaked Shearwaters by showing co-occurrences of locomotion modes and diving behavior recorded by GPS and water-depth sensors. For example, we found that the behavioral mode of high-speed locomotion and that of multiple dives into the sea were highly correlated in male seabirds. In addition, compared to the naive method, the proposed method reduced the computation costs by about 99.9%. Conclusion Because our method can automatically mine meaningful behavioral modes from multimodal time-series data, it can potentially be applied to analyzing co-occurrences of locomotion modes and behavioral modes from various environmental and physiological data.
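
A minimal sketch of the co-occurrence idea (not the authors' algorithm, which also handles mode extraction and makes the large-scale search efficient): given two aligned per-window label streams, one of locomotion modes derived from GPS and one of diving behavior derived from depth, count how often each pair of modes co-occurs; the group comparison then reduces to comparing these frequencies between males and females. All labels are hypothetical.

```python
from collections import Counter

def cooccurrence_counts(modes_a, modes_b):
    """Count co-occurring (mode_a, mode_b) pairs in two aligned label streams."""
    return Counter(zip(modes_a, modes_b))

# Hypothetical aligned streams for one bird: locomotion mode per time window
# (from GPS speed) and diving behavior per window (from the depth sensor).
locomotion = ["rest", "cruise", "fast", "fast", "cruise", "fast", "rest"]
diving     = ["none", "none",   "dive", "dive", "none",   "dive", "none"]

counts = cooccurrence_counts(locomotion, diving)
print(counts[("fast", "dive")])   # e.g. high-speed locomotion + diving windows

# For a group comparison, compute these counts per individual, normalize to
# frequencies, then compare male vs. female frequencies for each mode pair.
```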


2018 ◽  
Vol 10 (10) ◽  
pp. 105
Author(s):  
Schalk Burger ◽  
Searle Silverman ◽  
Gary van Vuuren

The problem of missing data is prevalent in financial time series, particularly in data such as foreign exchange rates and interest rate indices. Reasons for missing data include the closure of financial markets over weekends and holidays, and the fact that index data sometimes do not change between consecutive dates, resulting in stale data (also considered missing data). Most statistical software packages function best when applied to complete datasets. Listwise deletion, a commonly used approach to dealing with missing data, is straightforward to use and implement, but it can exclude large portions of the original dataset (Allison, 2002). Where data are missing at random, or if the deleted data are insignificant (measured by statistical power), listwise deletion may add value. Techniques to handle missing data were suggested and implemented, and these techniques were assessed to ascertain which provided the most accurate reconstructed datasets compared with the complete dataset.
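
A hedged sketch of the contrast the paper draws, using pandas on an invented exchange-rate series: listwise deletion versus two simple reconstruction techniques (forward fill and time-based interpolation). Which reconstruction performs best on real data is precisely what the paper assesses.

```python
import numpy as np
import pandas as pd

dates = pd.bdate_range("2024-01-01", periods=10)   # business days only
fx = pd.Series([1.10, np.nan, 1.12, 1.12, np.nan, 1.15, 1.16, np.nan, 1.18, 1.19],
               index=dates, name="EURUSD")

# Listwise deletion: simple, but discards whole observations.
deleted = fx.dropna()
print(len(fx), "->", len(deleted), "rows after listwise deletion")

# Two common reconstructions of the missing quotes.
filled = fx.ffill()                     # carry the last quote forward
interpolated = fx.interpolate("time")   # linear in calendar time

print(pd.DataFrame({"raw": fx, "ffill": filled, "interp": interpolated}))
```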


Author(s):  
Easton R White

Long-term time series are necessary to better understand population dynamics, assess species' conservation status, and make management decisions. However, population data are often expensive to collect, requiring considerable time and resources. When is a population time series long enough to address a question of interest? I determine the minimum time series length required to detect significant increases or decreases in population abundance. To address this question, I use simulation methods and examine 822 populations of vertebrate species. Here I show that on average 15.9 years of continuous monitoring are required to achieve a high level of statistical power. However, there is a wide distribution around this average, casting doubt on simple rules of thumb. For both the simulations and the time series data, the minimum time required depends on trend strength, population variability, and temporal autocorrelation. However, no life-history traits (e.g. generation length) were predictive of the minimum time required. These results point to the importance of sampling populations over long periods of time. I argue that statistical power needs to be considered in monitoring program design and evaluation. Short time series are likely underpowered and potentially misleading.

