Selection of representative groundwater monitoring wells – A compromise between site characteristics, data history, stakeholder interests and technological limitations

Author(s):  
Johannes Christoph Haas ◽  
Alice Retter ◽  
Steffen Birk ◽  
Christian Griebler

<p>In this presentation we provide a brief overview of the strategic selection of representative groundwater wells and the lessons learned.</p><p>The interdisciplinary project “Integrative Groundwater Assessment” looks into the effects of extreme hydro-meteorological events on the quantity and the chemical and biological quality of groundwater. The focus is on the Austrian Mur catchment, an area reaching from the river’s alpine source (~2000 m a.s.l.) down to the Slovenian border (~200 m a.s.l.). More than 500 state-operated groundwater observation wells are available along the 400 km of the river’s course, not counting private wells. For state-operated wells, water-level time series are publicly available, which allows one to simply use <em>all</em> the data, i.e., to apply <em>big data</em> approaches [1, 2, 3], albeit with some issues [4].</p><p>For water quality, however, such time series rarely exist, and where they do, they often do not cover all the parameters one needs, calling for targeted sampling campaigns. The availability of hundreds of wells seems like a benefit.
However, identifying wells that are representative and suitable for sampling both chemical and biological parameters is a challenging task.</p><p>Consequently, we went through a multi-step process of planning a sampling campaign that had to fulfill the following requirements:</p><ul><li> <p>Coverage of the entire stream section from alpine to lowland regions</p> </li> <li> <p>Coverage of different land uses in the river valley</p> </li> <li> <p>Realization of well transects from the river through the complete local aquifer</p> </li> <li> <p>Wells allow sampling of groundwater for the analysis of physical-chemical and biological parameters</p> </li> <li> <p>Historical data on groundwater quantity and quality are available</p> </li> </ul><p>Assessing the available metadata and taking into account the very helpful advice of stakeholders already reduced the number of representative wells considerably. In order to obtain a consistent data set, another set of wells had to be dismissed so that the same sampling and monitoring procedures could be applied at every location. Finally, out in the field, wells that were found damaged or out of order led to a further reduction. We thus ended up with only 45 wells suitable for our specific purposes, less than 10% of what seemed available at the beginning.</p><p>However, the data-analysis strategies outlined in [3] and [4] and the application of a novel groundwater ecological assessment scheme (the D-A-C index [5]) showed that even the substantially reduced number of wells provides very good coverage of the various regions in the Mur catchment.
In a further step, the results from two sampling campaigns and the subsequent data analysis will be used to select an even smaller subset of wells where novel multi-parameter spectral data loggers will be installed, enabling us to monitor various quality parameters at very high temporal resolution.</p><p><em>References:</em></p><p><em>[1] https://doi.org/10.5194/egusphere-egu2020-8148</em></p><p><em>[2] https://doi.org/10.1016/j.ejrh.2019.100597</em></p><p><em>[3] https://doi.org/10.1007/s12665-018-7469-4</em></p><p><em>[4] Haas et al. (2020): Tiny steps towards Big Data – Freud und Leid der Arbeit mit großen Grundwasserdatensätzen. Tagungsband 2020. Grundwasser und Flusseinzugsgebiete – Prozesse, Daten und Modelle.</em></p><p><em>[5] https://doi.org/10.1016/j.watres.2019.114902</em></p>

2017 ◽  
Vol 13 (7) ◽  
pp. 155014771772181 ◽  
Author(s):  
Seok-Woo Jang ◽  
Gye-Young Kim

This article proposes an intelligent monitoring system for semiconductor manufacturing equipment, which determines spec-in or spec-out for a wafer in process using Internet of Things–based big data analysis. The proposed system consists of three phases: initialization, learning, and real-time prediction. The initialization sets the weights and the effective steps for all equipment parameters to be monitored. The learning phase performs clustering to assign similar patterns to the same class. The patterns consist of multiple time series produced by semiconductor manufacturing equipment and an after-clean inspection measured by the corresponding tester. We modify the Linde, Buzo, and Gray algorithm for classifying the time-series patterns; the modified algorithm outputs a reference model for every cluster. The prediction phase compares a time series entered in real time with the reference models using statistical dynamic time warping to find the best-matched pattern, and then calculates a predicted after-clean inspection by combining the measured after-clean inspection, the dissimilarity, and the weights. Finally, it determines spec-in or spec-out for the wafer. We will present experimental results that show how the proposed system performs on data acquired from semiconductor etching equipment.
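The matching step described above can be sketched with a plain dynamic-time-warping distance. The following is a minimal illustration with made-up reference models and a simple dynamic-programming DTW, not the authors' weighted statistical DTW:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D series."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical per-cluster reference models and an incoming equipment trace.
references = {"cluster_0": np.sin(np.linspace(0, 2 * np.pi, 50)),
              "cluster_1": np.linspace(0.0, 1.0, 50)}
trace = np.sin(np.linspace(0, 2 * np.pi, 60))  # slightly stretched sine

# Pick the reference model with the smallest warped distance.
best = min(references, key=lambda k: dtw_distance(trace, references[k]))
```

Because DTW aligns series of unequal length, the stretched sine still matches the sine reference rather than the ramp.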


2021 ◽  
Vol 105 ◽  
pp. 348-355
Author(s):  
Hou Xiang Liu ◽  
Sheng Han Zhou ◽  
Bang Chen ◽  
Chao Fan Wei ◽  
Wen Bing Chang ◽  
...  

The paper proposed a practice teaching mode based on analysis of a Didi ride-hailing data set. More and more universities offer big data analysis courses, following the rapid development and wide application of big data analysis technology. The theoretical knowledge of big data analysis is specialized and hard to understand, which may reduce students' interest and motivation to learn; practice teaching therefore plays an important role between theory learning and application. This paper first introduces the theoretical part of the course and the methods it involves, then briefly describes the practice teaching content of the Didi data analysis case. The study selects related evaluation indices to assess the teaching effect through a questionnaire survey and to verify the effectiveness of the teaching method. The results show that 78% of students think that practical teaching can greatly improve their interest in learning, 89% think that practical teaching helps them learn theoretical knowledge, 89% have basically mastered the big data analysis methods introduced in the course, and 90% think that the proposed teaching method can greatly improve their practical ability. The teaching mode is effective: it can improve students' learning outcomes and practical ability in data analysis, and thereby the overall teaching effect.


2020 ◽  
Vol 35 (2) ◽  
pp. 214-222
Author(s):  
Lisa Cenek ◽  
Liubou Klindziuk ◽  
Cindy Lopez ◽  
Eleanor McCartney ◽  
Blanca Martin Burgos ◽  
...  

Circadian rhythms are daily oscillations in physiology and behavior that can be assessed by recording body temperature, locomotor activity, or bioluminescent reporters, among other measures. These different types of data can vary greatly in waveform, noise characteristics, typical sampling rate, and length of recording. We developed 2 Shiny apps for exploration of these data, enabling visualization and analysis of circadian parameters such as period and phase. Methods include the discrete wavelet transform, sine fitting, the Lomb-Scargle periodogram, autocorrelation, and maximum entropy spectral analysis, giving a sense of how well each method works on each type of data. The apps also provide educational overviews and guidance for these methods, supporting the training of those new to this type of analysis. CIRCADA-E (Circadian App for Data Analysis–Experimental Time Series) allows users to explore a large curated experimental data set with mouse body temperature, locomotor activity, and PER2::LUC rhythms recorded from multiple tissues. CIRCADA-S (Circadian App for Data Analysis–Synthetic Time Series) generates and analyzes time series with user-specified parameters, thereby demonstrating how the accuracy of period and phase estimation depends on the type and level of noise, sampling rate, length of recording, and method. We demonstrate the potential uses of the apps through 2 in silico case studies.
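As one example of the listed methods, a Lomb-Scargle periodogram (well suited to unevenly sampled recordings) can recover a circadian period from noisy data. This sketch uses SciPy on synthetic temperature-like values; all numbers are made up:

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(0)
# Unevenly sampled rhythm with a 24 h period over four days.
t = np.sort(rng.uniform(0, 96, 200))                     # sample times (hours)
y = np.cos(2 * np.pi * t / 24) + 0.3 * rng.standard_normal(t.size)

# Scan candidate periods; lombscargle expects angular frequencies.
periods = np.linspace(18, 30, 500)
ang_freqs = 2 * np.pi / periods
power = lombscargle(t, y - y.mean(), ang_freqs)

best_period = periods[np.argmax(power)]                  # should be near 24 h
```

Varying the noise level, sampling rate, and recording length here reproduces the trade-offs the synthetic-series app is designed to demonstrate.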


SPE Journal ◽  
2017 ◽  
Vol 23 (03) ◽  
pp. 719-736 ◽  
Author(s):  
Quan Cai ◽  
Wei Yu ◽  
Hwa Chi Liang ◽  
Jenn-Tai Liang ◽  
Suojin Wang ◽  
...  

Summary: The oil-and-gas industry is entering an era of “big data” because of the huge number of wells drilled with the rapid development of unconventional oil-and-gas reservoirs during the past decade. The massive amount of data generated presents a great opportunity for the industry to use data-analysis tools to help make informed decisions. The main challenge is the lack of the application of effective and efficient data-analysis tools to analyze and extract useful information for the decision-making process from the enormous amount of data available. In developing tight shale reservoirs, it is critical to have an optimal drilling strategy, thereby minimizing the risk of drilling in areas that would result in low-yield wells. The objective of this study is to develop an effective data-analysis tool capable of dealing with big and complicated data sets to identify hot zones in tight shale reservoirs with the potential to yield highly productive wells. The proposed tool is developed on the basis of nonparametric smoothing models, which are superior to the traditional multiple-linear-regression (MLR) models in both the predictive power and the ability to deal with nonlinear, higher-order variable interactions. This data-analysis tool is capable of handling one response variable and multiple predictor variables. To validate our tool, we used two real data sets: one with 249 tight oil horizontal wells from the Middle Bakken and the other with 2,064 shale gas horizontal wells from the Marcellus Shale. Results from the two case studies revealed that our tool not only can achieve much better predictive power than the traditional MLR models on identifying hot zones in the tight shale reservoirs but also can provide guidance on developing the optimal drilling and completion strategies (e.g., well length and depth, amount of proppant and water injected).
By comparing results from the two data sets, we found that our tool can achieve model performance with the big data set (2,064 Marcellus wells) with only four predictor variables that is similar to that with the small data set (249 Bakken wells) with six predictor variables. This implies that, for big data sets, even with a limited number of available predictor variables, our tool can still be very effective in identifying hot zones that would yield highly productive wells. The data sets that we have access to in this study contain very limited completion, geological, and petrophysical information. Results from this study clearly demonstrated that the data-analysis tool is certainly powerful and flexible enough to take advantage of any additional engineering and geology data to allow the operators to gain insights on the impact of these factors on well performance.
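The advantage of nonparametric smoothing over MLR on nonlinear responses can be illustrated with a toy comparison. The data below are synthetic, and a Nadaraya-Watson kernel smoother stands in for the authors' (unspecified) smoothing model:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic productivity with a nonlinear response to one predictor.
x = np.sort(rng.uniform(0, 10, 300))
y = np.sin(x) + 0.2 * rng.standard_normal(x.size)

# Baseline: ordinary linear regression (slope + intercept).
slope, intercept = np.polyfit(x, y, 1)
linear_pred = slope * x + intercept

def kernel_smooth(x_train, y_train, x_eval, bandwidth=0.5):
    """Nadaraya-Watson smoother with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / bandwidth) ** 2)
    return (w @ y_train) / w.sum(axis=1)

smooth_pred = kernel_smooth(x, y, x)

mse_linear = np.mean((y - linear_pred) ** 2)
mse_smooth = np.mean((y - smooth_pred) ** 2)
```

On this nonlinear relationship the smoother's mean squared error is far below the linear fit's, which is the same qualitative gap the study reports against MLR.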


2020 ◽  
Vol 8 (6) ◽  
pp. 3704-3708

Big data analytics is the field of analysing and processing information from data sets too large or complex for conventional data-processing methods. It can be used to predict outcomes from data sets, and it can be very useful in predicting crime and in suggesting how to respond to it. In this system we use a historical crime data set to find patterns, and from those patterns we predict the range of an incident. The range is determined by a decision model, and the prediction is made according to that range. The data are nonlinear and in the form of a time series, so the system uses the Prophet model, which is designed to analyse nonlinear time-series data. The Prophet model decomposes a series into three main components: trend, seasonality, and holidays. This system will help crime units predict possible incidents according to the patterns learned by the algorithm, and it helps deploy the right number of resources to areas flagged as having a high chance of incidents. The system will enhance crime prediction and help the crime department use its resources more efficiently.
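Prophet itself is a third-party library; the additive trend-plus-seasonality idea it formalises can be sketched with a toy decomposition on synthetic daily incident counts (all numbers below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy daily incident counts: linear trend + weekly seasonality + noise.
days = np.arange(365)
weekly = np.array([5, 3, 2, 2, 3, 8, 9])        # hypothetical day-of-week effect
y = 20 + 0.05 * days + weekly[days % 7] + rng.standard_normal(days.size)

# Fit the trend, then estimate seasonality from the detrended residuals --
# the same additive decomposition Prophet performs (minus holidays).
slope, intercept = np.polyfit(days, y, 1)
detrended = y - (slope * days + intercept)
season = np.array([detrended[days % 7 == d].mean() for d in range(7)])

# Forecast the next two weeks by recombining the components.
future = np.arange(365, 379)
forecast = slope * future + intercept + season[future % 7]
```

The recovered weekend peak in the forecast is what would drive the resource-allocation decision described above.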


Author(s):  
Benedikt Gräler ◽  
Andrea Petroselli ◽  
Salvatore Grimaldi ◽  
Bernard De Baets ◽  
Niko Verhoest

Abstract. Many hydrological studies are devoted to the identification of events that are expected to occur on average within a certain time span. While this topic is well established in the univariate case, recent advances focus on a multivariate characterization of events based on copulas. Following a previous study, we show how the definition of the survival Kendall return period fits into the set of multivariate return periods. Moreover, we preliminarily investigate the ability of the different multivariate return period definitions to select maximal events from a time series. Starting from a rich simulated data set, we show how similar the events selected under the different definitions are. It can be deduced from the study, and theoretically underpinned, that the strength of correlation in the sample influences the differences between the selections of maximal events.
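For orientation, the standard copula-based return-period definitions that the survival Kendall variant is compared against can be written as follows (notation assumed here: $\mu$ is the mean interarrival time of events, $C$ the copula of the transformed pair $(u,v)$, and $K_C$ the Kendall distribution function; the survival Kendall version replaces $K_C$ by the Kendall function of the survival copula):

```latex
T_{\mathrm{OR}}  = \frac{\mu}{1 - C(u,v)}, \qquad
T_{\mathrm{AND}} = \frac{\mu}{1 - u - v + C(u,v)}, \qquad
T_{\mathrm{Kendall}} = \frac{\mu}{1 - K_C(t)}
```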


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zheng Liu

Due to the joint progress and interdependence of wireless sensor networks and language technology, Chinese semantic analysis in wireless sensor networks has become more and more important. Although there are many research results on wireless networks and on Chinese semantics, there is little research on the influence and relationship between them. Wireless sensor networks are strongly application-driven, and the key technologies to be solved differ with the application background. In order to reveal the basic laws and development trends of online Chinese semantic expression in the context of wireless sensor networks, this paper adopts big data analysis and semantic modelling methods and constructs a semantic analysis model through PLSA calculations, so that the construction of the model parameter λ fits the research topic, and then studies the accuracy and applicability of that model. Word extraction from 1.05 million words in 1,103 documents from the Baidu Tieba, HowNet, and CiteULike websites was integrated into a data set, which was used to verify the PLSA model. In addition, through the construction of the wireless sensor network, semantic analysis results on the expression of Chinese behavior were obtained. The results show that the accuracy on the data set extracted from the 1,103 documents increases with the number of documents, and that applying the PLSA model to the data set further improves accuracy. Compared with traditional semantic analysis, the model and the big data analysis framework have clear advantages. With the continuous development of Internet big data, the big data methods used to analyse Chinese semantics are also constantly updated, and their efficiency keeps improving.
These updated semantic analysis models and statistical methods are steadily reducing the uncertainty of modern online Chinese. The basic laws and development trends of statistical Chinese semantics also provide new application scenarios for online Chinese behavior, and lay a foundation for subsequent scholars.
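The PLSA step can be sketched as the usual EM recursion on a term-document count matrix. The corpus, topic count, and initialisation below are made up for illustration; the paper's λ construction is not reproduced:

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy term-document count matrix (documents x words); a real run would use
# the word counts extracted from the crawled corpus.
X = rng.integers(1, 5, size=(8, 12)).astype(float)
n_topics = 2

# Random initialisation of P(z|d) and P(w|z).
p_z_d = rng.dirichlet(np.ones(n_topics), size=X.shape[0])   # (docs, topics)
p_w_z = rng.dirichlet(np.ones(X.shape[1]), size=n_topics)   # (topics, words)

for _ in range(50):
    # E-step: responsibilities P(z|d,w), shape (docs, words, topics).
    joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
    resp = joint / joint.sum(axis=2, keepdims=True)
    # M-step: re-estimate both multinomials from expected counts.
    counts = X[:, :, None] * resp
    p_w_z = counts.sum(axis=0).T
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = counts.sum(axis=1)
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
```

After convergence, `p_z_d` gives each document's topic mixture and `p_w_z` the per-topic word distributions.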


Author(s):  
Son Nguyen ◽  
Anthony Park

This chapter compares the performance of multiple Big Data techniques for time series forecasting against traditional time series models on three Big Data sets. The traditional models, Autoregressive Integrated Moving Average (ARIMA) and exponential smoothing, are used as baselines against Big Data machine learning methods. These techniques include regression trees, Support Vector Machines (SVM), Multilayer Perceptrons (MLP), Recurrent Neural Networks (RNN), and long short-term memory neural networks (LSTM). Across the three time series data sets used (unemployment rate, bike rentals, and transportation), this study finds that LSTM neural networks performed the best. In conclusion, this study points out that Big Data machine learning algorithms applied to time series can outperform traditional time series models. The computations in this work are done in Python, one of the most popular open-source platforms for data science and Big Data analysis.
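Of the baseline models named above, simple exponential smoothing is the easiest to sketch. The following toy implementation (synthetic series, not the chapter's data) shows the one-step-ahead recursion that the more elaborate ARIMA and LSTM models are benchmarked against:

```python
import numpy as np

def ses_forecast(series, alpha=0.3):
    """One-step-ahead simple-exponential-smoothing forecasts.

    Returns the in-sample forecasts and the forecast for the next step.
    """
    level = series[0]
    preds = []
    for obs in series:
        preds.append(level)                       # forecast made before seeing obs
        level = alpha * obs + (1 - alpha) * level  # update the smoothed level
    return np.array(preds), level

# Hypothetical monthly rate-like series (random walk around 5).
rng = np.random.default_rng(4)
y = 5 + np.cumsum(0.1 * rng.standard_normal(120))

preds, next_forecast = ses_forecast(y, alpha=0.5)
```

Comparing the errors of such a baseline against an LSTM's out-of-sample errors is the kind of evaluation the chapter performs.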

