NagareDB: A Resource-Efficient Document-Oriented Time-Series Database

Carlos Garcia Calatrava; Yolanda Becerra Fontal; Fernando M. Cucchietti; Carla Diví Cuesta

doi:10.3390/data6080091

Data ◽

10.3390/data6080091 ◽

2021 ◽

Vol 6 (8) ◽

pp. 91

Author(s):

Carlos Garcia Calatrava ◽

Yolanda Becerra Fontal ◽

Fernando M. Cucchietti ◽

Carla Diví Cuesta

Keyword(s):

Machine Learning ◽

Time Series ◽

Learning Curve ◽

Technological Advance ◽

Time Dependent ◽

Machine Learning Techniques ◽

Detection Time ◽

Learning Techniques ◽

Nosql Database ◽

Device Location

Recent technological advances have led to a broad proliferation of Monitoring Infrastructures, which typically keep track of specific assets over time, ranging from factory machinery and device locations to people. Gathering these data has become crucial for a wide range of applications, such as exploration dashboards and Machine Learning techniques like Anomaly Detection. Time-Series Databases, designed to handle these data, have grown in popularity, becoming the fastest-growing database type since 2019. As a consequence, keeping track of and mastering these rapidly evolving technologies has become increasingly difficult. This paper introduces the holistic design approach followed for building NagareDB, a Time-Series database built on top of MongoDB, the most popular NoSQL database, which is typically discouraged in the Time-Series scenario. The goal of NagareDB is to ease access to three of the essential resources needed to build time-dependent systems: Hardware, since it is able to work on commodity machines; Software, as it is built on top of an open-source solution; and Expert Personnel, as its foundation database is considered the most popular NoSQL DB, lowering its learning curve. Concretely, NagareDB is able to outperform MongoDB's recommended implementation by up to 4.7 times when retrieving data, while also offering stream ingestion up to 35% faster than InfluxDB, the most popular Time-Series database. Moreover, by relaxing some requirements, NagareDB is able to reduce disk space usage by up to 40%.

Predicting time series of railway speed restrictions with time-dependent machine learning techniques

Expert Systems with Applications ◽

10.1016/j.eswa.2013.04.038 ◽

2013 ◽

Vol 40 (15) ◽

pp. 6033-6040 ◽

Cited By ~ 8

Author(s):

Olga Fink ◽

Enrico Zio ◽

Ulrich Weidmann

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Dependent ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Speed Restrictions

Metaheuristic based MLP-SARIMAX Hybridization: One Hour Ahead Solar Radiation Forecasting

10.21528/cbic2021-9 ◽

2021 ◽

Author(s):

Hugo Abreu Mendes ◽

João Fausto Lorenzato Oliveira ◽

Paulo Salgado Gomes Mattos Neto ◽

Alex Coutinho Pereira ◽

Eduardo Boudoux Jatoba ◽

...

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

Time Series ◽

Solar Radiation ◽

Statistical Models ◽

Clean Energy ◽

Machine Learning Techniques ◽

Exogenous Variables ◽

Learning Techniques ◽

Photovoltaic Plants

Within the context of clean energy generation, solar radiation forecasting is applied to photovoltaic plants to increase maintainability and reliability. Statistical time-series models such as ARIMA and machine learning techniques help to improve the results. Hybrid statistical and ML models are found in all sorts of time series forecasting applications. This work presents a new way to automate SARIMAX modeling by nesting the PSO and ACO optimization algorithms; unlike R's AutoARIMA, it searches for the optimal seasonality parameter and the best combination of the available exogenous variables. This work also presents 2 distinct hybrid models that have MLPs as their main elements, with the architecture optimized by a Genetic Algorithm. A methodology was used to obtain the results, which were then compared to the LSTM, CLSTM, MMFF and NARNN-ARMAX topologies found in recent works. The results obtained for the presented models are promising for use in automatic radiation forecasting systems, since they outperformed the compared models on at least two metrics.
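The abstract names PSO (particle swarm optimization) as one of the search algorithms but gives no implementation details. The following is only a minimal, generic global-best PSO sketch in plain Python, run on a hypothetical two-parameter cost function (`toy_cost`) that stands in for a model-selection criterion. The function names, the inertia and acceleration coefficients, and the continuous search space are all assumptions for illustration, not the authors' method.

```python
import random


def pso_minimize(cost, bounds, n_particles=20, n_iters=100, seed=0):
    """Generic global-best PSO; returns (best_position, best_cost)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    dim = len(bounds)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration weights

    # Random starting positions inside the bounds, zero starting velocity.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best; the swarm shares a global best.
    p_best = [p[:] for p in pos]
    p_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=p_cost.__getitem__)
    g_best, g_cost = p_best[g][:], p_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    w * vel[i][d]
                    + c1 * r1 * (p_best[i][d] - pos[i][d])
                    + c2 * r2 * (g_best[d] - pos[i][d])
                )
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < p_cost[i]:
                p_cost[i], p_best[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_best = c, pos[i][:]
    return g_best, g_cost


def toy_cost(params):
    """Hypothetical stand-in for a forecasting error; minimum at (3, -1)."""
    x, y = params
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


best_params, best_value = pso_minimize(toy_cost, [(-10.0, 10.0), (-10.0, 10.0)])
```

In a setting like the one the abstract describes, the cost function would presumably fit a candidate model and return a validation error, and integer-valued parameters such as a seasonal period would need rounding or a discrete PSO variant.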

Multivariate Time Series Forecasting of Crude Palm Oil Price Using Machine Learning Techniques

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/226/1/012117 ◽

2017 ◽

Vol 226 ◽

pp. 012117 ◽

Cited By ~ 3

Author(s):

Kasturi Kanchymalay ◽

N. Salim ◽

Anupong Sukprasert ◽

Ramesh Krishnan ◽

Ummi Raba’ah Hashim

Keyword(s):

Machine Learning ◽

Time Series ◽

Palm Oil ◽

Multivariate Time Series ◽

Time Series Forecasting ◽

Oil Price ◽

Machine Learning Techniques ◽

Crude Palm Oil ◽

Learning Techniques

Forecasting of Air Passengers using ARIMA Modeling

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1216.09811s19 ◽

2019 ◽

Vol 8 (11S) ◽

pp. 1050-1054

Keyword(s):

Machine Learning ◽

Time Series ◽

Autocorrelation Function ◽

Arima Model ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Forecasting Method ◽

Partial Autocorrelation Function ◽

Time Series Method ◽

Arima Modeling

Air passenger prediction is said to be the centre of gravity of the growth. With people constantly on the move, there is bound to be some dissatisfaction among customers, which could be due to various reasons, ranging from overbooking of flights to ground operations. This dissatisfaction can be controlled up to a limit, in ballpark figures. In the past, this has been done using various machine learning techniques. For this prediction, this project uses ARIMA modeling, which is a time series forecasting method based on machine learning. The stationarity of the data is tested using the Dickey-Fuller test. If the data is stationary, it is fit to the ARIMA model. If the data isn’t stationary, it is made stationary by differencing or by logarithmic transformation. Here, the logarithmic method is used to make the data stationary. Once the data is stationary, the values of p and q, which are required by the time series method, are found using the partial autocorrelation function and the autocorrelation function. These values are then used to fit the ARIMA model and, hence, to predict the results. After fitting various models, ARIMA(2,1,2) proved to be the best fit, having the least RMS and RMSE values.
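The preprocessing steps named in this abstract (logarithmic transformation, differencing, and the autocorrelation function) can be sketched in plain Python as follows. This is only an illustrative sketch: the helper names and the synthetic `passengers` series are assumptions, not the paper's data or code, and the Dickey-Fuller test and the ARIMA fit themselves are left out because they would normally come from a statistics library.

```python
import math


def log_transform(series):
    """Natural log of each strictly positive value; turns growth into a line."""
    return [math.log(x) for x in series]


def difference(series, lag=1):
    """Differencing y[t] - y[t - lag]; removes a linear trend when lag=1."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    total = sum(c * c for c in centered)  # lag-0 sum of squares (the normaliser)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / total
        for k in range(max_lag + 1)
    ]


# Hypothetical series growing 10% per step (not real passenger counts).
passengers = [100.0 * 1.1 ** t for t in range(12)]

# Log transform followed by first differencing leaves a constant series,
# which is the simplest possible stationary case.
stationary = difference(log_transform(passengers))

# Autocorrelation of a small toy series; a slow decay across lags is the
# usual visual hint that a series still carries a trend.
acf = autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```

The order d in ARIMA(p,d,q) corresponds to how many times `difference` would be applied, so the ARIMA(2,1,2) model reported above uses a single round of differencing.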

A study on leading machine learning techniques for high order fuzzy time series forecasting

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2019.103245 ◽

2020 ◽

Vol 87 ◽

pp. 103245 ◽

Cited By ~ 6

Author(s):

Sibarama Panigrahi ◽

H.S. Behera

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Forecasting ◽

High Order ◽

Fuzzy Time Series ◽

Machine Learning Techniques ◽

Learning Techniques

Applications of Artificial Intelligence in the Realm of Business Intelligence

Research Anthology on Artificial Intelligence Applications in Security ◽

10.4018/978-1-7998-7705-9.ch018 ◽

2021 ◽

pp. 358-386

Author(s):

Prakhar Mehrotra

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Time Series ◽

Natural Language Processing ◽

Language Processing ◽

Business Intelligence ◽

Machine Learning Techniques ◽

Current State ◽

Learning Techniques ◽

High Level

The objective of this chapter is to discuss the integration of advancements made in the field of artificial intelligence into existing business intelligence tools. Specifically, it discusses how a business intelligence tool can integrate time series analysis, supervised and unsupervised machine learning techniques, and natural language processing to unlock deeper insights, make predictions, and execute strategic business actions from within the tool itself. This chapter also provides a high-level overview of current state-of-the-art AI techniques and provides examples in the realm of business intelligence. The eventual goal of this chapter is to leave readers thinking about what the future of business intelligence will look like and how enterprises can benefit from integrating AI into it.

Bitcoin Price Prediction Using Time Series Analysis and Machine Learning Techniques

Machine Learning for Predictive Analysis - Lecture Notes in Networks and Systems ◽

10.1007/978-981-15-7106-0_54 ◽

2020 ◽

pp. 551-560

Author(s):

Aman Gupta ◽

Himanshu Nain

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Analysis ◽

Machine Learning Techniques ◽

Price Prediction ◽

Series Analysis ◽

Learning Techniques

Evaluation of Effectiveness of Time-Series Comments by Using Machine Learning Techniques

Journal of Information Processing ◽

10.2197/ipsjjip.23.784 ◽

2015 ◽

Vol 23 (6) ◽

pp. 784-794 ◽

Cited By ~ 5

Author(s):

Shaymaa E. Sorour ◽

Kazumasa Goda ◽

Tsunenori Mine

Keyword(s):

Machine Learning ◽

Time Series ◽

Machine Learning Techniques ◽

Learning Techniques

A Comparison of Machine Learning Techniques for Modeling River Flow Time Series: The Case of Upper Cauvery River Basin

Water Resources Management ◽

10.1007/s11269-014-0705-0 ◽

2014 ◽

Vol 29 (2) ◽

pp. 589-602 ◽

Cited By ~ 18

Author(s):

Shivshanker Singh Patel ◽

Parthasarathy Ramachandran

Keyword(s):

Machine Learning ◽

Time Series ◽

River Basin ◽

River Flow ◽

Flow Time ◽

Machine Learning Techniques ◽

Cauvery River ◽

Learning Techniques ◽

Cauvery River Basin

Estimating Biomechanical Time-Series with Wearable Sensors: A Systematic Review of Machine Learning Techniques

10.20944/preprints201911.0006.v1 ◽

2019 ◽

Author(s):

Reed D. Gurchiek ◽

Nicholas Cheney ◽

Ryan S. McGinnis

Keyword(s):

Machine Learning ◽

Time Series ◽

Wearable Sensors ◽

Sensor Data ◽

Machine Learning Techniques ◽

Superior Performance ◽

Estimation Accuracy ◽

Accurate Estimation ◽

Practical Implementation ◽

Learning Techniques

Wearable sensors have the potential to enable comprehensive patient characterization and optimized clinical intervention. Critical to realizing this vision is accurate estimation of biomechanical time-series in daily life, including joint, segment, and muscle kinetics and kinematics, from wearable sensor data. The use of physical models for estimation of these quantities often requires many wearable devices, making practical implementation more difficult. However, regression techniques may provide a viable alternative by allowing the use of a reduced number of sensors for estimating biomechanical time-series. Herein, we review 46 articles that used regression algorithms to estimate joint, segment, and muscle kinematics and kinetics. We present a high-level comparison of the many different techniques identified and discuss the implications of our findings concerning practical implementation and further improving estimation accuracy. In particular, we found that several studies report that the incorporation of domain knowledge often yielded superior performance. Further, most models were trained on small datasets, in which case nonparametric regression often performed best. No models were open-sourced, and most were subject-specific and not validated on impaired populations. Future research should focus on developing open-source algorithms using complementary physics-based and machine learning techniques that are validated in clinically impaired populations. This approach may further improve estimation performance and reduce barriers to clinical adoption.