Extending HydroShare to enable hydrologic time series data as social media

The Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI) hydrologic information system (HIS) is a widely used service-oriented system for time series data management. While this system is intended to empower the hydrologic sciences community with better data storage and distribution, it lacks support for the kind of ‘Web 2.0’ collaboration and social-networking capabilities being used in other fields. This paper presents the design, development, and testing of a software extension of CUAHSI's newest product, HydroShare. The extension integrates the existing CUAHSI HIS into HydroShare's social hydrology architecture. With this extension, HydroShare provides integrated HIS time series with efficient archiving, discovery, and retrieval of the data; extensive creator and science metadata; scientific discussion and collaboration around the data; and other basic social media features. HydroShare provides functionality for online social interaction and collaboration, while the existing HIS provides the distributed data management and web services framework. The extension is expected to enable scientists to access and share both national- and laboratory-scale hydrologic time series datasets in a standards-based web services architecture combined with social media functionality developed specifically for the hydrologic sciences.

Hydrostats: A Python Package for Characterizing Errors between Observed and Predicted Time Series

Hydrology ◽

10.3390/hydrology5040066 ◽

2018 ◽

Vol 5 (4) ◽

pp. 66 ◽

Cited By ~ 7

Author(s):

Wade Roberts ◽

Gustavious P. Williams ◽

Elise Jackson ◽

E. James Nelson ◽

Daniel P. Ames

Keyword(s):

Time Series ◽

Data Storage ◽

Time Series Data ◽

Series Data ◽

Hydrologic Models ◽

Storage And Retrieval ◽

Hydrologic Time Series ◽

Error Metrics ◽

Skill Scores ◽

Python Package

Hydrologists use a number of tools to compare model results to observed flows. These include tools to pre-process the data, data frames to store and access data, visualization and plotting routines, error metrics for single realizations, and ensemble metrics for stochastic realizations to calibrate and evaluate hydrologic models. We present hydrostats, an open-source Python package that helps characterize predicted and observed hydrologic time series data. It has three main capabilities: data storage and retrieval based on the Python Data Analysis Library (pandas), visualization and plotting routines using Matplotlib, and a metrics library that currently contains routines to compute over 70 different error metrics and routines for ensemble forecast skill scores. The hydrostats data storage and retrieval functions allow hydrologists to easily compare all, or portions of, a time series. For example, they make it easy to compare observed and modeled data only during April over a 30-year period. The package includes literature references, explanations, examples, and source code. In this note, we introduce the hydrostats package, provide short examples of the various capabilities, and provide some background on programming issues and practices. The hydrostats package provides a range of tools to make characterizing and analyzing model data easy and efficient. The electronic supplement provides working hydrostats examples.
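The April-only comparison mentioned in the abstract can be illustrated with a minimal sketch. It deliberately uses only the Python standard library rather than hydrostats or pandas, so none of the function names below (`select_month`, `rmse`, `nash_sutcliffe`) come from the package; they are hypothetical helpers, and the flow values are synthetic. Root-mean-square error and Nash-Sutcliffe efficiency stand in here for the much larger set of metrics the abstract describes.

```python
import math
from datetime import date, timedelta


def select_month(records, month):
    """Keep only the (day, observed, simulated) records that fall in `month`."""
    return [r for r in records if r[0].month == month]


def rmse(records):
    """Root-mean-square error between simulated and observed values."""
    return math.sqrt(sum((sim - obs) ** 2 for _, obs, sim in records) / len(records))


def nash_sutcliffe(records):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs for _, obs, _ in records) / len(records)
    residual = sum((sim - obs) ** 2 for _, obs, sim in records)
    variance = sum((obs - mean_obs) ** 2 for _, obs, _ in records)
    return 1.0 - residual / variance


# Synthetic daily flows for 1990-2019 (made-up values, for illustration only).
records = []
day = date(1990, 1, 1)
while day <= date(2019, 12, 31):
    observed = 50.0 + 30.0 * math.sin(2 * math.pi * day.timetuple().tm_yday / 365.25)
    simulated = observed * 1.1  # a hypothetical model that overpredicts by 10 %
    records.append((day, observed, simulated))
    day += timedelta(days=1)

# Compare observed and simulated flows for April only, across all 30 years.
april = select_month(records, 4)
print(len(april), rmse(april), nash_sutcliffe(april))
```

Filtering first and computing metrics second keeps the metric functions independent of how the subset was chosen, so the same functions work for a single month, a season, or the full record.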

Integration of US Army Corps of Engineers' Time-Series Data Management System with Continuous SWMM Modeling.

Advances in Modeling the Management of Stormwater Impacts ◽

10.1201/9781003208945-2 ◽

2021 ◽

pp. 23-44

Author(s):

Yiwen Wang ◽

William James

Keyword(s):

Time Series ◽

Data Management ◽

Management System ◽

Time Series Data ◽

Data Management System ◽

Series Data ◽

Army Corps ◽

Army Corps Of Engineers ◽

Corps Of Engineers ◽

Us Army

A time series data management framework

International Conference on Information Technology: Coding and Computing (ITCC'05) - Volume II ◽

10.1109/itcc.2005.45 ◽

2005 ◽

Author(s):

A. Matus-Castillejos ◽

R. Jentzsch

Keyword(s):

Time Series ◽

Data Management ◽

Time Series Data ◽

Series Data ◽

Management Framework

Use of hydrologic time series data for identification of recharge mechanism in a fractured bedrock aquifer system

Journal of Hydrology ◽

10.1016/s0022-1694(00)00158-x ◽

2000 ◽

Vol 229 (3-4) ◽

pp. 190-201 ◽

Cited By ~ 144

Author(s):

Jin-Yong Lee ◽

Kang-Kun Lee

Keyword(s):

Time Series ◽

Time Series Data ◽

Series Data ◽

Aquifer System ◽

Fractured Bedrock ◽

Hydrologic Time Series ◽

Fractured Bedrock Aquifer ◽

Bedrock Aquifer

Exploiting SeaDataCloud Temperature and Salinity time series data collections and comparing with Copernicus - a novel approach with SOURCE tool

10.5194/egusphere-egu2020-17071 ◽

2020 ◽

Author(s):

Paolo Oliveri ◽

Simona Simoncelli ◽

Pierluigi Di Pietro ◽

Sara Durante

Keyword(s):

Time Series ◽

Best Practices ◽

Data Management ◽

Time Series Data ◽

Management Plan ◽

Global Ocean ◽

Series Data ◽

Data Repository ◽

Log File ◽

Data Collections

One of the main challenges for the present and future of ocean observations is to find best practices for data management: infrastructures like Copernicus and SeaDataCloud already take responsibility for assembling, archiving, updating, and publishing data. Here we present the strengths and weaknesses of the SeaDataCloud Temperature and Salinity time series data collections, and in particular a tool able to recognize the different devices and platforms and to merge them with processed Copernicus platforms.

While Copernicus has the main target of quickly acquiring and publishing data, SeaDataNet aims to publish data with the best quality available. These two data repositories should be considered together, since the originator can ingest the data in both infrastructures, in only one, or partially in both. This sometimes results in data being only partially available in Copernicus or SeaDataCloud, with great impact on the researcher who wants to access as much data as possible. The data reprocessing should not be loaded onto researchers' shoulders, since only users skilled in every data management plan know how to merge the data.

The SeaDataCloud time series data collection is a soon-to-be-published Global Ocean dataset that will represent a reference for ocean researchers, released in the binary, user-friendly Ocean Data View format. The database management plan was originally designed for profiles but has been adapted for time series, resolving several issues such as the uniqueness of the identifiers (ID).

Here we present an extension of the SOURCE (Sea Observations Utility for Reprocessing, Calibration and Evaluation) Python package, able to enhance the data quality with redundant sophisticated methods and to simplify their usage. SOURCE increases quality control (Q/C) performance on observations using statistical quality check procedures that follow the ocean best practices guidelines, through the following steps:
1. Find and aggregate all broken time series using likeness in ID parameter strings;
2. Find and organize in a dictionary all different metadata variables;
3. Correct time series time to match simpler measure units;
4. Filter devices that are outside of a selected horizontal rectangle;
5. Give some information on the original Q/C scheme by the SeaDataCloud infrastructure;
6. Give information tables on platforms and on the merged ID string duplicates, together with an errors log file (missing time, depth, data, wrong Q/C variables, etc.).

In particular, the duplicates table and the log file may be helpful to SeaDataCloud partners in order to update the data collection and finally make it available to users.

The reconstructed SeaDataCloud time series data, divided by parameter and stored in a more flexible dataset, can be ingested in the main part of the software. This makes it possible to compare the data with Copernicus time series, find the same platform using horizontal and vertical surroundings (without looking at the ID), find and clean up duplicated data, and merge the two databases to extend the data coverage.

This allows researchers to have the widest and best-quality data possible for the final user release and to use these data to calibrate and validate models, in order to form a picture of the sea conditions of a whole area.
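Two of the steps listed in this abstract, filtering devices outside a horizontal rectangle and aggregating broken series by ID likeness, can be sketched as follows. This is not code from the SOURCE package: the platform records, the bounding box, the 0.9 similarity threshold, and the helper names `inside_box` and `group_similar_ids` are all assumptions made for illustration, and the string comparison uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever likeness measure SOURCE actually applies.

```python
from difflib import SequenceMatcher

# Hypothetical platform records: (ID string, longitude, latitude).
platforms = [
    ("MOORING_ADRIATIC_01", 13.5, 44.2),
    ("MOORING_ADRIATIC_01a", 13.5, 44.2),
    ("BUOY_TYRRHENIAN_7", 11.0, 40.5),
    ("GLIDER_ATLANTIC_3", -20.0, 35.0),
]


def inside_box(record, west, east, south, north):
    """True when the platform lies inside the longitude/latitude rectangle."""
    _, lon, lat = record
    return west <= lon <= east and south <= lat <= north


def group_similar_ids(ids, threshold=0.9):
    """Greedily group IDs whose similarity to a group's first member meets the threshold."""
    groups = []
    for identifier in ids:
        for group in groups:
            if SequenceMatcher(None, group[0], identifier).ratio() >= threshold:
                group.append(identifier)
                break
        else:
            # No existing group was similar enough, so start a new one.
            groups.append([identifier])
    return groups


# Keep only platforms inside an assumed rectangle, then group near-identical IDs.
kept = [p for p in platforms if inside_box(p, west=0.0, east=30.0, south=30.0, north=46.0)]
groups = group_similar_ids([p[0] for p in kept])
print(groups)
```

A greedy grouping like this depends on input order and on the chosen threshold, so in practice the resulting groups would need the kind of duplicates table and log file the abstract describes for manual review.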

Changes in public response associated with various COVID-19 restrictions in Ontario, Canada: an observational study using social media time series data (Preprint)

JMIR Public Health and Surveillance ◽

10.2196/28716 ◽

2021 ◽

Author(s):

Antony Chum ◽

Andrew Nielsen ◽

Zachary Bellows ◽

Eddie Farrell ◽

Pierre-Nicolas Durette ◽

...

Keyword(s):

Social Media ◽

Time Series ◽

Observational Study ◽

Time Series Data ◽

Series Data ◽

Public Response

HYDROLOGIC TIME SERIES DATA MODELING USING MULTIPLICATIVE ARIMA

Advances in Geosciences ◽

10.1142/9789812836144_0008 ◽

2010 ◽

pp. 81-94

Author(s):

CHAKKRAPONG TAEWICHIT ◽

SUWATANA CHITTALADAKORN

Keyword(s):

Time Series ◽

Time Series Data ◽

Data Modeling ◽

Series Data ◽

Hydrologic Time Series

Tweets, tents, and events: The interplay between street protests and social media

10.31219/osf.io/nrfbd ◽

2017 ◽

Author(s):

Marco T. Bastos ◽

Dan Mercea ◽

Arthur Charpentier

Keyword(s):

Social Media ◽

Time Series ◽

Granger Causality ◽

Time Series Data ◽

Series Data ◽

Political Unrest ◽

Protest Activity

Recent protests have fuelled deliberations about the extent to which social media ignites popular uprisings. In this paper we use time-series data of Twitter, Facebook, and onsite protests to assess the Granger-causality between social media streams and onsite developments at the Indignados, Occupy, and Brazilian Vinegar protests. After applying a Gaussianization procedure to the data, we found that contentious communication on Twitter and Facebook forecasted onsite protest during the Indignados and Occupy protests, with bidirectional Granger-causality between online and onsite protest in the Occupy series. Conversely, the Vinegar demonstrations presented Granger-causality between Facebook and Twitter communication, and separately between protestors and injuries/arrests onsite. We conclude that the effective forecasting of protest activity likely varies across different instances of political unrest.
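The idea behind a Granger-causality test can be shown with a minimal single-lag sketch. It is not the authors' procedure: it omits the Gaussianization step, uses one lag only, and runs on synthetic series in which the hypothetical `online` signal is constructed to lead the hypothetical `onsite` signal. The test compares a restricted regression (the effect on its own past) with an unrestricted one (adding the past of the candidate cause) and reports the F statistic for the improvement.

```python
import random


def solve(a, b):
    """Solve the small linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(rows, target):
    """Residual sum of squares of an ordinary least-squares fit of target on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, target))


def granger_f(cause, effect):
    """F statistic for 'cause Granger-causes effect' using a single lag."""
    target = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    dof = len(target) - 3  # observations minus parameters in the unrestricted model
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic series: onsite activity follows online activity with a one-step delay.
rng = random.Random(0)
online = [rng.gauss(0, 1) for _ in range(200)]
onsite = [0.0]
for t in range(1, 200):
    onsite.append(0.5 * onsite[-1] + 0.8 * online[t - 1] + rng.gauss(0, 0.3))

f_online_to_onsite = granger_f(online, onsite)
f_onsite_to_online = granger_f(onsite, online)
print(f_online_to_onsite, f_onsite_to_online)
```

Running the test in both directions, as above, is what distinguishes one-way forecasting from the bidirectional relationship the abstract reports for the Occupy series. A real analysis would also choose the lag order and compare the statistic against an F distribution.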

An installation guide to the PC-based time-series data-management and plotting program BOB

Open-File Report ◽

10.3133/ofr90634a ◽

1990 ◽

Cited By ~ 1

Author(s):

Thomas L. Murray

Keyword(s):

Time Series ◽

Data Management ◽

Time Series Data ◽

Series Data

Exploring Relationships between Tweet Numbers and Over-The-Counter Drug Sales for Allergic Rhinitis: Retrospective Analysis (Preprint)

10.2196/preprints.33941 ◽

2021 ◽

Author(s):

Shoko Wakamiya ◽

Osamu Morimoto ◽

Katsuhiro Omichi ◽

Hideyuki Hara ◽

Ichiro Kawase ◽

...

Keyword(s):

Social Media ◽

Time Series ◽

Allergic Rhinitis ◽

Time Series Data ◽

Series Data ◽

Sales Volume ◽

Social Media Data ◽

Drug Sales ◽

Number Of Patients ◽

Media Data

BACKGROUND Health-related social media data are increasingly being used in disease surveillance studies. In particular, surveillance of infectious diseases such as influenza has demonstrated high correlations between the number of social media posts mentioning the disease and the number of patients who went to the hospital and were diagnosed with the disease. However, the prevalence of some diseases, such as allergic rhinitis, cannot be estimated based on the number of patients alone. Specifically, patients with allergic rhinitis self-medicate by taking over-the-counter (OTC) medications without going to the hospital. Although allergic rhinitis is not a life-threatening disease, it is a major social problem because it reduces patients’ quality of life, making it essential to understand its prevalence and the motives for self-medication behavior. OBJECTIVE To help understand the prevalence of allergic rhinitis and the motives for self-care treatment using social media data, this study investigated the relationship between the number of social media posts mentioning the main symptoms of allergic rhinitis and the sales volume of OTC rhinitis medications in Japan. METHODS We collected tweets over four years from 2017 to 2020 that included keywords corresponding to the main nasal symptoms of allergic rhinitis: “sneezing,” “runny nose,” and “stuffy nose.” We also obtained the sales volume of OTC drugs, including oral medications and nasal sprays, for the same period. We then calculated the Pearson correlation coefficient between time series data on the number of tweets per week and time series data on the sales volume of OTC drugs per week. RESULTS The results showed a much higher correlation (0.8432) between the time series data on the number of tweets mentioning “stuffy nose” and the time series data on the sales volume of nasal sprays than for the other two symptoms. There was also a high correlation (0.9317) between the seasonal components of these time series data. 
CONCLUSIONS We investigated the relationships between social media data and behavioral patterns, such as OTC drug sales volume. Exploring these relationships would be useful as a marketing indicator to predict sales volume using social media data. In future, in-depth investigations are required to cover other diseases and countries.
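The correlation step described in the METHODS section can be sketched in a few lines. The weekly counts below are made up for illustration and are not data from the study; `pearson` is a hypothetical helper that implements the standard Pearson correlation coefficient.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up weekly counts, for illustration only (not data from the study).
tweets_per_week = [120, 150, 310, 480, 450, 260, 140, 110]
sales_per_week = [900, 1000, 1800, 2600, 2500, 1700, 1100, 950]

r = pearson(tweets_per_week, sales_per_week)
print(round(r, 4))
```

Both series must be aligned on the same weeks before the coefficient is computed. A strong seasonal pattern shared by the two series can by itself produce a high correlation, which may be why the abstract reports the correlation of the seasonal components separately.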
