A novel water quality data analysis framework based on time-series data mining

2017 ◽  
Vol 196 ◽  
pp. 365-375 ◽  
Author(s):  
Weihui Deng ◽  
Guoyin Wang

2021 ◽  
Vol 2066 (1) ◽  
pp. 012043
Author(s):  
Wei Wang ◽  
Xiaohui Hu ◽  
Mingye Wang ◽  
Yao Du

Abstract: With the rapid development of computer, Internet, and artificial intelligence technology, the amount of global data has exploded. However, the single-machine serial mode of traditional data mining cannot be directly transplanted to a cloud platform. Cloud computing and data mining can be combined effectively only by parallelizing and improving many classic data mining algorithms. Research on, and implementation of, parallel algorithms for time series data mining is therefore of great significance, and it is the subject of this paper. The paper uses literature review, mathematical statistics, logical analysis, and other research methods to study parallel algorithm technology for time series data mining, chiefly exploring time series data mining and visualization technology. It embodies the design ideas of big data analysis tools and demonstrates the power and market value of such tools through the display of the platform. The results show that, on the same data set and in the same experimental environment, the improved parallel collaborative filtering algorithm ACF proposed in this paper runs faster than the parallel algorithm MCF based on the co-occurrence matrix, and the difference in running time grows as the data set gets larger.
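The abstract compares its algorithm against a baseline built on an item co-occurrence matrix. As a rough illustration of that general idea only, the sketch below builds co-occurrence counts on a single machine and uses them to rank unseen items. It is not the paper's ACF or its parallel MCF; the function names and the toy `history` data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_matrix(user_items):
    """Count how often each pair of items appears in the same user's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in user_items.values():
        # Each unordered pair is counted once per user, in both directions.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(user, user_items, counts, top_n=3):
    """Rank items the user has not seen by summed co-occurrence with seen items."""
    seen = set(user_items[user])
    scores = defaultdict(int)
    for item in seen:
        for other, c in counts[item].items():
            if other not in seen:
                scores[other] += c
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Hypothetical toy interaction data (not from the paper).
history = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c", "d"],
    "u4": ["a", "d"],
}
counts = cooccurrence_matrix(history)
suggestions = recommend("u2", history, counts)
```

The pair-counting loop is independent per user, which is the part a parallel version would typically distribute before summing the partial counts.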


1989 ◽  
Vol 40 (3) ◽  
pp. 241 ◽  
Author(s):  
DR Welsh ◽  
DB Stewart

Intervention analysis is a rigorous statistical modelling technique used to measure the effect of a shift in the mean level of a time series, caused by an intervention. A general formulation of an intervention model is applied to water-quality data for two streams in north-eastern Victoria, measuring the effect of drought on the electrical conductivity of one stream, and the effect of bushfires on the flow and turbidity of the other. The nature of the intervention is revealed using exploratory data-analysis techniques, such as smoothing and boxplots, on the time-series data. Intervention analysis is then used to confirm the identified changes and estimate their magnitude. The increased level of electrical conductivity due to drought is determined by three techniques of estimation and the results compared. The best of these techniques is then used to model changes in stream flow and turbidity following bushfires in the catchment.
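To make the idea of a shift in mean level concrete, here is a minimal sketch of the simplest possible step-intervention estimate. It assumes a known intervention time and independent errors, so it ignores the autocorrelation that a full intervention model would account for; the function name and the conductivity values are hypothetical, not data from the study.

```python
from statistics import mean


def step_effect(series, t0):
    """Estimate the mean-level shift at index t0.

    With only an intercept and a 0/1 step indicator as regressors,
    ordinary least squares reduces to a difference of means.
    """
    before, after = series[:t0], series[t0:]
    if not before or not after:
        raise ValueError("t0 must leave observations on both sides")
    baseline = mean(before)
    shift = mean(after) - baseline
    return baseline, shift


# Hypothetical electrical conductivity readings; the step starts at index 4.
ec = [100, 102, 98, 100, 130, 128, 132, 130]
baseline, shift = step_effect(ec, 4)
```

On real water-quality series the residuals are usually correlated in time, so a standard error computed from this estimate would tend to be too optimistic.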


Axioms ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 49
Author(s):  
Anton Romanov ◽  
Valeria Voronina ◽  
Gleb Guskov ◽  
Irina Moshkina ◽  
Nadezhda Yarushkina

The development of the economy and the transition to Industry 4.0 create new challenges for artificial intelligence methods. Such challenges include the processing of large volumes of data, the analysis of various dynamic indicators, the discovery of complex dependencies in the accumulated data, and the forecasting of the state of processes. The main point of this study is the development of a set of analytical and prognostic methods. The methods described in this article are based on fuzzy logic, statistics, and time series data mining, because data extracted from dynamic systems are initially incomplete and have a high degree of uncertainty. The ultimate goal of the study is to improve the quality of data analysis in industrial and economic systems. The advantages of the proposed methods are flexibility and an orientation toward high interpretability of dynamic data. The high level of interpretability and interoperability of dynamic data is achieved by combining time series data mining with knowledge base engineering methods. Merging the rules extracted from the time series with the knowledge base rules allows a forecast to be made when the length and nature of the time series are insufficient. The proposed methods are also based on a summary of the results of process modeling for diagnosing technical systems, forecasting the economic condition of enterprises, and approaches to the technological preparation of production in a multi-productive production program, with type 2 fuzzy sets applied to time series modeling. Intelligent systems based on the proposed methods demonstrate an increase in the quality and stability of their functioning. This article contains a set of experiments that support this claim.


2016 ◽  
Vol 47 (5) ◽  
pp. 1069-1085 ◽  
Author(s):  
Yung-Chia Chiu ◽  
Chih-Wei Chiang ◽  
Tsung-Yu Lee

In this study, the adaptive neuro fuzzy inference system (ANFIS) is proposed to model time series of water quality data. Biochemical oxygen demand data collected over more than 20 years at the upstream catchment of Feitsui Reservoir in Taiwan are selected as the target water quality variable. The classical Box-Jenkins method is applied to select appropriate input variables, and the data are pre-processed by differencing during model development. The time series data obtained by the ANFIS models are compared with those obtained by autoregressive integrated moving average (ARIMA) models and artificial neural networks (ANNs). The results show that the ANFIS model identified at each sampling station is superior to the respective ARIMA and ANN models. The R values at all sampling stations for the training and testing datasets are 0.83–0.98 and 0.81–0.89, respectively, except at Huang-ju-pi-liao station. ANFIS models can provide accurate predictions for complex hydrological processes and can be extended to other areas to improve the understanding of river pollution trends. The procedure of input selection and the pre-processing of input data proposed in this study can stimulate the usage of ANFIS in other related studies.
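The differencing step mentioned in the abstract can be sketched as follows. This is a generic illustration, not the study's code: the function names are hypothetical, and the sample values are made up rather than taken from the Feitsui Reservoir data.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: x[t] - x[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def undifference(first_values, diffs, lag=1):
    """Invert difference(): rebuild the series from its first `lag` values."""
    restored = list(first_values)
    for d in diffs:
        # The value `lag` steps back from the next position is restored[-lag].
        restored.append(restored[-lag] + d)
    return restored


# Hypothetical readings with an upward trend.
readings = [2, 4, 7, 11]
diffs = difference(readings)              # trend removed
rebuilt = undifference(readings[:1], diffs)
```

A model fitted to the differenced series predicts changes, so its forecasts need the inverse step to be expressed in the original units.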

