PhilDB: the time series database with built-in change logging

PeerJ Computer Science ◽

10.7717/peerj-cs.52 ◽

2016 ◽

Vol 2 ◽

pp. e52 ◽

Cited By ~ 1

Author(s):

Andrew MacDonald

Keyword(s):

Time Series ◽

Big Data ◽

Open Source ◽

High Performance ◽

Time Series Data ◽

Handling Time ◽

Series Data ◽

Meta Data ◽

Static Data ◽

Data Tracking

PhilDB is an open-source time series database that supports storage of time series datasets that are dynamic; that is, it records updates to existing values in a log as they occur. PhilDB eases loading of data for the user by utilising an intelligent data write method. It preserves existing values during updates and abstracts the update complexity required to achieve logging of data value changes. It implements fast reads to make it practical to select data for analysis. Recent open-source systems have been developed to indefinitely store long-period high-resolution time series data without change logging. Unfortunately, such systems generally require a large initial installation investment before use because they are designed to operate over a cluster of servers to achieve high-performance writing of static data in real time. In essence, they have a ‘big data’ approach to storage and access. Other open-source projects for handling time series data that avoid the ‘big data’ approach are also relatively new and are complex or incomplete. None of these systems gracefully handles revision of existing data while tracking values that change. Unlike ‘big data’ solutions, PhilDB has been designed for single machine deployment on commodity hardware, reducing the barrier to deployment. PhilDB takes a unique approach to meta-data tracking: optional attribute attachment. This facilitates scaling the complexities of storing a wide variety of data. That is, it allows time series data to be loaded as time series instances with minimal initial meta-data, yet additional attributes can be created and attached to differentiate the time series instances when a wider variety of data is needed. PhilDB was written in Python, leveraging existing libraries. While some existing systems come close to meeting the needs PhilDB addresses, none cover all the needs at once. PhilDB was written to fill this gap in existing solutions.
This paper explores existing time series database solutions, discusses the motivation for PhilDB, describes the architecture and philosophy of the PhilDB software, and performs an evaluation between InfluxDB, PhilDB, and SciDB.
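
The change-logging idea described in this abstract (existing values are preserved during updates, and each change to a value is recorded in a log) can be illustrated with a small in-memory sketch. This is only a toy model under stated assumptions: the class `LoggedSeries` and its methods `write` and `read` are hypothetical names invented for illustration and are not PhilDB's API or storage format.

```python
from datetime import datetime, timezone


class LoggedSeries:
    """Toy in-memory series that logs every new or changed value.

    Hypothetical illustration only; this is not PhilDB's API or storage format.
    """

    def __init__(self):
        self.values = {}  # timestamp -> current value
        self.log = []     # (timestamp, old_value, new_value, logged_at)

    def write(self, records):
        """Write (timestamp, value) pairs; return the number of log entries added."""
        added = 0
        for ts, new in records:
            old = self.values.get(ts)
            if ts in self.values and old == new:
                continue  # unchanged value: nothing to store or log
            # Record the previous value (None for a first write) before replacing it.
            self.log.append((ts, old, new, datetime.now(timezone.utc)))
            self.values[ts] = new
            added += 1
        return added

    def read(self):
        """Return the current values sorted by timestamp."""
        return sorted(self.values.items())


series = LoggedSeries()
first = series.write([("2016-01-01", 1.0), ("2016-01-02", 2.0)])
# Re-load overlapping data: one unchanged value, one revision, one new value.
second = series.write([("2016-01-01", 1.0), ("2016-01-02", 2.5), ("2016-01-03", 3.0)])
```

In this sketch the caller simply re-submits data, and the write method decides what is new, what is revised, and what is unchanged, so the earlier value of a revised point stays recoverable from the log.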

PhilDB: The time series database with built-in change logging

10.7287/peerj.preprints.1488 ◽

2016 ◽

Author(s):

Andrew MacDonald

Keyword(s):

Time Series ◽

Big Data ◽

Open Source ◽

High Performance ◽

Time Series Data ◽

Handling Time ◽

Series Data ◽

Meta Data ◽

Data Tracking ◽

Simple Evaluation

PhilDB is an open-source time series database that supports storage of time series datasets that are dynamic; that is, it records updates to existing values in a log as they occur. PhilDB eases loading of data for the user by utilising an intelligent data write method. It preserves existing values during updates and abstracts the update complexity required to achieve logging of data value changes. It implements fast reads to make it practical to select data for analysis. Recent open-source systems have been developed to indefinitely store long-period high-resolution time series data without change logging. Unfortunately, such systems generally require a large initial installation investment before use because they are designed to operate over a cluster of servers to achieve high-performance writing of static data in real time. In essence, they have a 'big data' approach to storage and access. Other open-source projects for handling time series data that avoid the 'big data' approach are also relatively new and are complex or incomplete. None of these systems gracefully handles revision of existing data while tracking values that changed. Unlike 'big data' solutions, PhilDB has been designed for single machine deployment on commodity hardware, reducing the barrier to deployment. PhilDB takes a unique approach to meta-data tracking: optional attribute attachment. This facilitates scaling the complexities of storing a wide variety of data. That is, it allows time series data to be loaded as time series instances with minimal initial meta-data, yet additional attributes can be created and attached to differentiate the time series instances when a wider variety of data is needed. PhilDB was written in Python, leveraging existing libraries. While some existing systems come close to meeting the needs PhilDB addresses, none cover all the needs at once. PhilDB was written to fill this gap in existing solutions.
This paper explores existing time series database solutions, discusses the motivation for PhilDB, describes the architecture and philosophy of the PhilDB software, and performs a simple evaluation between InfluxDB, PhilDB, and SciDB.

PhilDB: The time series database with built-in change logging

10.7287/peerj.preprints.1488v2 ◽

2016 ◽

Author(s):

Andrew MacDonald

Keyword(s):

Time Series ◽

Big Data ◽

Open Source ◽

High Performance ◽

Time Series Data ◽

Handling Time ◽

Series Data ◽

Meta Data ◽

Data Tracking ◽

Simple Evaluation

PhilDB is an open-source time series database that supports storage of time series datasets that are dynamic; that is, it records updates to existing values in a log as they occur. PhilDB eases loading of data for the user by utilising an intelligent data write method. It preserves existing values during updates and abstracts the update complexity required to achieve logging of data value changes. It implements fast reads to make it practical to select data for analysis. Recent open-source systems have been developed to indefinitely store long-period high-resolution time series data without change logging. Unfortunately, such systems generally require a large initial installation investment before use because they are designed to operate over a cluster of servers to achieve high-performance writing of static data in real time. In essence, they have a 'big data' approach to storage and access. Other open-source projects for handling time series data that avoid the 'big data' approach are also relatively new and are complex or incomplete. None of these systems gracefully handles revision of existing data while tracking values that changed. Unlike 'big data' solutions, PhilDB has been designed for single machine deployment on commodity hardware, reducing the barrier to deployment. PhilDB takes a unique approach to meta-data tracking: optional attribute attachment. This facilitates scaling the complexities of storing a wide variety of data. That is, it allows time series data to be loaded as time series instances with minimal initial meta-data, yet additional attributes can be created and attached to differentiate the time series instances when a wider variety of data is needed. PhilDB was written in Python, leveraging existing libraries. While some existing systems come close to meeting the needs PhilDB addresses, none cover all the needs at once. PhilDB was written to fill this gap in existing solutions.
This paper explores existing time series database solutions, discusses the motivation for PhilDB, describes the architecture and philosophy of the PhilDB software, and performs a simple evaluation between InfluxDB, PhilDB, and SciDB.

Feature-Based Online Representation Algorithm for Streaming Time Series Similarity Search

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800142050010x ◽

2019 ◽

Vol 34 (05) ◽

pp. 2050010 ◽

Cited By ~ 1

Author(s):

Peng Zhan ◽

Changchang Sun ◽

Yupeng Hu ◽

Wei Luo ◽

Jiecai Zheng ◽

...

Keyword(s):

Time Series ◽

Big Data ◽

Similarity Search ◽

High Speed ◽

High Performance ◽

Time Series Data ◽

Rapid Development ◽

Feature Representation ◽

Series Data ◽

Data Points

With the rapid development of information technology, we have entered the era of big data. A time series is a sequence of data points associated with numerical values and successive timestamps. Time series not only have the traditional big data features but can also be generated continuously at high speed. Therefore, it is very time- and resource-consuming to directly apply traditional time series similarity search methods to raw time series data. In this paper, we propose a novel online segmenting algorithm for streaming time series, which has a relatively high performance on feature representation and similarity search. Extensive experimental results on different typical time series datasets have demonstrated the superiority of our method.

A Topology Based Spatio-Temporal Map Algebra for Big Data Analysis

Data ◽

10.3390/data4020086 ◽

2019 ◽

Vol 4 (2) ◽

pp. 86 ◽

Cited By ~ 5

Author(s):

Sören Gebbert ◽

Thomas Leppelt ◽

Edzer Pebesma

Keyword(s):

Time Series ◽

Big Data ◽

Open Source ◽

Time Series Data ◽

Series Data ◽

Algebraic Expression ◽

Observation Data ◽

Map Algebra ◽

Algebra Approach ◽

Spatio Temporal

Continental and global datasets based on earth observations or computational models challenge the existing map algebra approaches. The available datasets differ in their spatio-temporal extents and their spatio-temporal granularity, which makes it difficult to process them as time series data in map algebra expressions. To address this issue, we introduce a new map algebra approach that is topology based. This topology-based map algebra uses spatio-temporal topological operators (STTOP and STTCOP) to specify spatio-temporal operations between topologically related map layers of different time series data. We have implemented several topology-based map algebra tools in the open-source geoinformation system GRASS GIS and its open-source cloud processing engine actinia. We demonstrate the application of our topology-based map algebra by solving real-world big data problems using a single algebraic expression. This included the massively parallel computation of the NDVI from a series of 100 Sentinel-2A scenes organized as earth observation data cubes. The processing was performed and benchmarked on a many-core computer setup and in a distributed container environment. The design of our topology-based map algebra allows us to deploy it as a standardized service in the EU Horizon 2020 project openEO.

Data Preprocessing Techniques for Handling Time Series data for Environmental Science Studies

International Journal of Engineering Trends and Technology ◽

10.14445/22315381/ijett-v69i5p227 ◽

2021 ◽

Vol 69 (5) ◽

pp. 196-207

Author(s):

Ebin Antony ◽

Sreekanth N S ◽

Sunil Kumar R K ◽

Nishanth T

Keyword(s):

Time Series ◽

Environmental Science ◽

Science Studies ◽

Time Series Data ◽

Handling Time ◽

Data Preprocessing ◽

Series Data

Open source telecommunications framework for time-series data from smart grid sensors

2015 IEEE 15th International Conference on Environment and Electrical Engineering (EEEIC) ◽

10.1109/eeeic.2015.7165517 ◽

2015 ◽

Author(s):

John C. Hastings ◽

David M. Laverty ◽

D. John Morrow

Keyword(s):

Time Series ◽

Smart Grid ◽

Open Source ◽

Time Series Data ◽

Series Data

Fundamentals of Handling Time Series Data with R

10.1007/978-981-16-0711-0_3 ◽

2021 ◽

pp. 23-27

Author(s):

Junichiro Hagiwara

Keyword(s):

Time Series ◽

Time Series Data ◽

Handling Time ◽

Series Data

OSARIS, the “Open Source SAR Investigation System” for Automatized Parallel InSAR Processing of Sentinel-1 Time Series Data With Special Emphasis on Cryosphere Applications

Frontiers in Earth Science ◽

10.3389/feart.2019.00172 ◽

2019 ◽

Vol 7 ◽

Cited By ~ 1

Author(s):

David Loibl ◽

Bodo Bookhagen ◽

Sébastien Valade ◽

Christoph Schneider

Keyword(s):

Time Series ◽

Open Source ◽

Time Series Data ◽

Series Data

AstroCatR: a mechanism and tool for efficient time series reconstruction of large-scale astronomical catalogues

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa1413 ◽

2020 ◽

Vol 496 (1) ◽

pp. 629-637

Author(s):

Ce Yu ◽

Kun Li ◽

Shanjiang Tang ◽

Chao Sun ◽

Bin Ma ◽

...

Keyword(s):

Time Series ◽

High Performance ◽

Large Scale ◽

Extrasolar Planets ◽

Time Series Data ◽

Series Data ◽

Data Sets ◽

Observation Data ◽

Data Volume ◽

And Performance

Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernovae in time domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming extremely difficult or infeasible for continuously analysing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series data sets. To support the high-performance parallel processing of large-scale data sets, AstroCatR uses the extract-transform-load (ETL) pre-processing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed into other formats as needed. In addition, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from the three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3× faster than methods using relational database management systems at matching massive catalogues.
