Advanced Hybrid Machine Learning Algorithms for Multistep Lake Water Level Forecasting

Author(s):  
Elham Fijani ◽  
Khabat Khosravi ◽  
Rahim Barzegar ◽  
John Quilty ◽  
Jan Adamowski ◽  
...  

Abstract Random Tree (RT) and Iterative Classifier Optimizer (ICO) machine learning (ML) algorithms based on the Alternating Model Tree (AMT) regressor, coupled with Bagging (BA) or Additive Regression (AR) ensemble algorithms, were applied to multistep-ahead (up to three months) forecasting of Lake Superior and Lake Michigan water levels (WL). The partial autocorrelation function (PACF) of each lake's WL time series identified the most important lag times, up to five months in both lakes, as potential model inputs. Each WL time series was partitioned into training (1918 to 1988) and testing (1989 to 2018) periods for model building and evaluation, respectively. The developed algorithms were validated against the testing data using statistical metrics and visual inspection. Although both hybrid ensemble algorithms improved the performance of the individual ML algorithms, the BA algorithm outperformed the AR algorithm. As a novel model in forecasting problems, the ICO algorithm showed great potential for generating robust multistep lake WL forecasts.
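The PACF-based lag screening described in this abstract can be sketched in pure Python via the Durbin-Levinson recursion. The AR(1) demo series below is synthetic and stands in for the monthly WL record, which is not reproduced here; the significance band is the usual approximate 95% interval.

```python
import math
import random

def acf(x, max_lag):
    """Sample autocorrelation function up to max_lag (index 0 is lag 0)."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    out = [1.0]
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / n
        out.append(ck / c0)
    return out

def pacf(x, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(x, max_lag)
    vals = [1.0]
    phi_prev = []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi_curr = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi_curr = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                        for j in range(k - 1)] + [phi_kk]
        vals.append(phi_kk)
        phi_prev = phi_curr
    return vals

def significant_lags(x, max_lag):
    """Lags whose PACF exceeds the approximate 95% band +/- 1.96/sqrt(n)."""
    band = 1.96 / math.sqrt(len(x))
    p = pacf(x, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(p[k]) > band]

# demo: synthetic AR(1) series with coefficient 0.8, a stand-in for monthly WL data
random.seed(1)
series = [0.0]
for _ in range(499):
    series.append(0.8 * series[-1] + random.gauss(0, 1))
lags = significant_lags(series, 5)
```

For a true AR(1) process only lag 1 should survive the band, which is exactly the kind of cut-off the study uses to cap candidate inputs at five months.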

2021 ◽  
Vol 13 (3) ◽  
pp. 67
Author(s):  
Eric Hitimana ◽  
Gaurav Bajpai ◽  
Richard Musabe ◽  
Louis Sibomana ◽  
Jayavel Kayalvizhi

Many countries worldwide face challenges in implementing fire-disaster prevention measures in buildings. The most critical issues are the localization, identification, and detection of room occupants. The Internet of Things (IoT), together with machine learning, has been shown to increase building smartness by providing real-time data acquisition through sensors and actuators for prediction mechanisms. This paper proposes the implementation of an IoT framework to capture indoor environmental parameters as occupancy multivariate time-series data. The Long Short-Term Memory (LSTM) deep learning algorithm is applied to infer the presence of human beings. An experiment was conducted in an office room using the multivariate time series as predictors in a regression forecasting problem. The results demonstrate that the developed system can acquire, process, and store environmental information. The collected information was applied to the LSTM algorithm and compared with other machine learning algorithms: Support Vector Machine, Naïve Bayes Network, and Multilayer Perceptron Feed-Forward Network. The outcomes, based on parametric calibration, demonstrate that the LSTM performs best in the context of the proposed application.
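The occupancy framing the paper describes, multivariate sensor windows predicting the next occupancy state, can be sketched independently of any deep learning library. The CO2/temperature readings and labels below are invented for illustration, not taken from the study's office-room data.

```python
def make_windows(features, labels, window):
    """Frame multivariate sensor readings as fixed-length input windows, with
    the occupancy label that immediately follows each window as the target."""
    X, y = [], []
    for t in range(len(features) - window):
        X.append(features[t:t + window])
        y.append(labels[t + window])
    return X, y

# demo: hypothetical CO2 (ppm) and temperature (C) readings with 0/1 occupancy
readings = [[400 + i, 20.0 + 0.1 * i] for i in range(10)]
occupied = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
X, y = make_windows(readings, occupied, window=3)
```

Each `X[i]` is a `(window, n_features)` block in exactly the shape an LSTM layer consumes, so this framing is the step shared by the LSTM and the baseline classifiers it is compared against.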


Author(s):  
Gudipally Chandrashakar

In this article, we used historical time-series data up to the current day's gold price. In this study of predicting the gold price, we consider a few correlated factors: the silver price, copper price, Standard and Poor's 500 index, dollar-rupee exchange rate, and Dow Jones Industrial Average, with prices for each factor and for gold covering January 2008 to February 2021. Several machine learning algorithms were used to analyze the time-series data: Random Forest Regression, Support Vector Regressor, Linear Regressor, Extra Trees Regressor, and Gradient Boosting Regression. The results show that the Extra Trees Regressor predicts gold prices most accurately.
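One point worth making explicit for any such comparison: time-series data must be split chronologically, not shuffled, or future prices leak into training. A minimal evaluation harness, with a synthetic price series standing in for the real gold data and a persistence baseline that any of the listed regressors should beat:

```python
import math

def chronological_split(values, test_frac=0.2):
    """Split a time series without shuffling, so the test period follows training."""
    cut = int(len(values) * (1 - test_frac))
    return values[:cut], values[cut:]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def persistence_forecast(history, horizon):
    """Naive baseline: repeat the last observed value for the whole horizon."""
    return [history[-1]] * horizon

# demo: synthetic "gold price" with trend and alternating noise
prices = [1500 + 2 * i + (-1) ** i * 5 for i in range(100)]
train, test = chronological_split(prices)
baseline_error = rmse(test, persistence_forecast(train, len(test)))
```

Reporting each candidate model's RMSE against this baseline on the same chronological split is how a claim like "Extra Trees Regressor is most accurate" becomes checkable.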


2021 ◽  
Author(s):  
Dhairya Vyas

In terms of machine learning, most data can be grouped into four categories: numerical data, categorical data, time-series data, and text. Different learning paradigms suit different data properties, namely supervised, unsupervised, and reinforcement learning. Each category has suitable classifiers; we have tested almost all of the common machine learning methods and analyzed them comparatively.


2019 ◽  
Vol 14 ◽  
pp. 155892501988346 ◽  
Author(s):  
Mine Seçkin ◽  
Ahmet Çağdaş Seçkin ◽  
Aysun Coşkun

Although textile production is heavily automation-based, it is viewed as a virgin area with regard to Industry 4.0. When these developments are integrated into the textile sector, efficiency is expected to increase. When data mining and machine learning studies in the textile sector are examined, a lack of data sharing related to production processes in enterprises is evident, owing to commercial concerns and confidentiality. In this study, a method is presented for simulating a production process and performing regression on the resulting time-series data with machine learning. The simulation was prepared for the annual production plan, and the corresponding faults were generated based on information and production data received from a textile glove enterprise. The data set was applied to various machine learning methods within the scope of supervised learning to compare their learning performance. The errors that occur in the production process were created using random parameters in the simulation. To verify the hypothesis that these errors can be forecast, various machine learning algorithms were trained using the data set in the form of time series. The variable representing the number of faulty products could be forecast very successfully, with the random forest algorithm demonstrating the highest success. As these error values gave high accuracy even in a simulation driven by uniformly distributed random parameters, highly accurate forecasts can be expected in real-life applications as well.
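The simulate-then-forecast setup can be sketched as follows. The fault range, month count, and seed are illustrative assumptions, not values from the enterprise data; the lag framing turns the simulated series into supervised pairs for any regressor, including the random forest the study found best.

```python
import random

def simulate_faulty_products(n_months, max_faults=50, seed=7):
    """Uniformly distributed monthly fault counts, mirroring the study's idea of
    driving the simulation with uniform random parameters (values illustrative)."""
    rng = random.Random(seed)
    return [rng.randint(0, max_faults) for _ in range(n_months)]

def lag_features(series, n_lags):
    """Turn a series into (previous n_lags values -> next value) training pairs."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y

# demo: three years of simulated monthly fault counts, framed with 3 lags
faults = simulate_faulty_products(n_months=36)
X, y = lag_features(faults, n_lags=3)
```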


2021 ◽  
Vol 3 ◽  
Author(s):  
Peter Goodin ◽  
Andrew J. Gardner ◽  
Nasim Dokani ◽  
Ben Nizette ◽  
Saeed Ahmadizadeh ◽  
...  

Background: Exposure to thousands of head and body impacts during a career in contact and collision sports may contribute to current or later-life issues related to brain health. Wearable technology enables the measurement of impact exposure, but validation of impact detection is required for accurate exposure monitoring. In this study, we present a method for the automatic identification (classification) of head and body impacts using an instrumented mouthguard, video-verified impacts, and machine learning algorithms.

Methods: Time-series data were collected via the Nexus A9 mouthguard from 60 elite-level men (mean age = 26.33; SD = 3.79) and four women (mean age = 25.50; SD = 5.91), Australian Rules Football players from eight clubs participating in 119 games during the 2020 season. Ground-truth labeling of the captures used in this machine learning study was performed through analysis of game footage by two expert video reviewers using SportCode and Catapult Vision. The visual labeling process occurred independently of the mouthguard time-series data. True positive captures (captures where the reviewer directly observed contact between the mouthguard wearer and another player, the ball, or the ground) were defined as hits. Spectral and convolutional kernel-based features were extracted from the time-series data. The performance of untuned classification algorithms from scikit-learn, in addition to XGBoost, was assessed to select the best-performing baseline method for tuning.

Results: Based on performance, XGBoost was selected as the classifier algorithm for tuning. A total of 13,712 video-verified captures were collected and used to train and validate the classifier. True positive detection ranged from 94.67% in the test set to 100% in the holdout set. True negatives ranged from 95.65% to 96.83% in the test and holdout sets, respectively.

Discussion and conclusion: This study suggests the potential for high-performing impact classification models to be used in Australian Rules Football and highlights the importance of frequencies <150 Hz for the identification of these impacts.
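The closing observation about frequencies below 150 Hz suggests band-limited spectral features. A minimal sketch: a direct O(n^2) DFT summing energy in a frequency band; the 50 Hz test tone and 1 kHz sample rate are assumptions for the demo, not the Nexus A9's actual specifications, and a real pipeline would use an FFT.

```python
import cmath
import math

def band_power(signal, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes over bins whose frequency lies in
    [f_lo, f_hi] Hz. Direct DFT for clarity rather than speed."""
    n = len(signal)
    total = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            coeff = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n))
            total += abs(coeff) ** 2
    return total

# demo: a 50 Hz tone sampled at 1 kHz concentrates its energy below 150 Hz
fs = 1000
sig = [math.sin(2 * math.pi * 50 * t / fs) for t in range(100)]
low = band_power(sig, fs, 0, 150)
high = band_power(sig, fs, 200, 500)
```

A ratio of low-band to high-band energy computed this way is one plausible form for the "spectral features" the abstract mentions feeding into the classifier.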


2021 ◽  
Author(s):  
Mohammed Ali

In this thesis, we focus on time-series data, which is commonly used by domain experts in different fields to explore and understand phenomena or behaviors under consideration, assisting them in making decisions, predicting the future, or solving problems. Utilizing sensor devices is one of the common ways of collecting time-series data. These devices collect large volumes of raw data, including multi-dimensional time-series data, with each value associated with the time-stamp at which it was recorded. However, finding interesting patterns or behaviors in a large amount of data is not simple, due to the nature of the data and other challenges related to its size and scalability, high dimensionality, complexity, representation, and unique structure.

Researchers tend to use time-series chart visualization, which is usually unsuitable because a small screen resolution cannot accommodate the large size of the data. Hence, occlusion and overplotting issues occur, limiting or complicating exploration and analysis tasks. Another challenge concerns the labeling of patterns in large time-series data, which is time-consuming and requires a great deal of expert knowledge.

These issues are addressed in this thesis to improve the exploration, analysis, and presentation of time-series data and to enable users to gain insights into large and multi-dimensional time-series datasets using a combination of dimensionality reduction techniques and interactive visual methods. The provided solutions will help researchers from various domains who deal with large and multi-dimensional time-series data to explore and analyze such data efficiently, with little effort and in record time.

Initially, we explore the area of integration between machine learning algorithms and interactive visualization techniques for exploring and understanding time-series data, specifically looking at clustering and classification of time-series data in visual analytics. The survey is intended as a valuable guide for both new researchers and experts in the emerging field of integrating machine learning algorithms into visual analytics.

Next, we present a novel approach that aims to explore, analyze, and present large temporal datasets through one image. The proposed approach uses a sliding window and dimensionality reduction techniques to depict large time-series data as points in a 2D scatter plot. The approach provides novel solutions to many pattern discovery issues and can deal with both univariate and multivariate time-series data.

Following this, our proposed approach is combined with visualization and interaction techniques in one system called TimeCluster, a visual analytics tool allowing users to visualize, explore, and interact with large time-series data. The system addresses different issues such as anomaly detection, the discovery of frequent patterns, and the labeling of interesting patterns in large time-series data, all in a single system. We deploy our system with different time-series datasets and report real-world case studies of its utility.

Later, the linkage between the 1D view (time-series chart) and the 2D view of the 2D embedding of time-series data, together with parallel interactions such as selection and labeling, is employed to explore and examine the effectiveness of recent developments in machine learning and dimensionality reduction in the context of time-series data exploration. We design a user study to evaluate and validate the effectiveness of the linkage between the 1D and 2D visualizations and their fitness for projecting time-series data, where different dimensionality reduction techniques are examined, evaluated, and compared within our experimental setting.

Lastly, we conclude our findings and outline possible areas for future work.
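The sliding-window-to-2D-scatter idea at the heart of this approach can be sketched as follows. Here (mean, standard deviation) serves as a trivially computable stand-in for the learned dimensionality reduction the thesis actually uses; the point is only that each window becomes one 2D point, so similar subsequences cluster in the scatter plot.

```python
import statistics

def window_embedding_2d(series, window):
    """Slide a fixed-length window over the series and map each subsequence
    to a 2-D point. The (mean, population stdev) projection is a cheap
    stand-in for a PCA/learned embedding."""
    points = []
    for t in range(len(series) - window + 1):
        w = series[t:t + window]
        points.append((statistics.fmean(w), statistics.pstdev(w)))
    return points

# demo: windows from a flat segment and from a ramp land in different regions
series = [0.0] * 20 + [float(i) for i in range(20)]
points = window_embedding_2d(series, window=8)
```

In the scatter, the flat-segment windows pile up at (0, 0) while the ramp windows spread out with growing mean, which is the kind of structure interactive selection and labeling then operate on.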


2020 ◽  
Author(s):  
Atika Qazi ◽  
Khulla Naseer ◽  
Javaria Qazi ◽  
Muhammad Abo

Well-timed forecasts of infectious outbreaks using time-series data can help in the proper planning of public health measures, and if the forecasts are generated by machine learning algorithms, they can be used to direct resources where they are most needed. Here we present a support vector machine (SVM) model using epidemiological data provided by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), the World Health Organization (WHO), and the Centers for Disease Control and Prevention (CDC) to predict upcoming data before official declaration by the WHO. Our study, conducted on the time-series data available from 22 January to 10 March 2020, reveals that COVID-19 was spreading at an alarming rate and progressing towards a pandemic. If machine learning algorithms are used to predict the dynamics of an infectious outbreak, future strategies can help in better management. In addition, exploratory data analysis (EDA) highlights the importance of the quarantine measures taken at the onset of the epidemic by China and world leadership in containing the initial COVID-19 transmission; nevertheless, when quarantine measures were relaxed, a sharp upsurge in COVID-19 transmission was seen. Confirmed COVID-19 cases made up the largest share of our selected dataset from 22 January to 10 March 2020, at 126,344 (64%); recovered cases numbered 68,289 (34%), and the death rate was around 2%. The model presented here is flexible, can include uncertainty about outbreak dynamics, and can be a significant tool for combating future outbreaks.
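The paper's SVM model is not reproduced here, but the quantity that makes early outbreak data "alarming", the exponential growth rate, can be estimated with a simple log-linear least-squares fit. The case counts below are synthetic, not the JHU CSSE data.

```python
import math

def fit_growth_rate(cumulative_cases):
    """Ordinary least squares on log(cases) = a + r * day; returns the daily
    exponential growth rate r. A far simpler stand-in for the paper's SVM."""
    days = list(range(len(cumulative_cases)))
    logs = [math.log(c) for c in cumulative_cases]
    n = len(days)
    dbar = sum(days) / n
    lbar = sum(logs) / n
    r = sum((d - dbar) * (l - lbar) for d, l in zip(days, logs)) / \
        sum((d - dbar) ** 2 for d in days)
    return r

def doubling_time(rate):
    """Days for case counts to double at exponential growth rate `rate`."""
    return math.log(2) / rate

# demo: synthetic counts growing 20% per day over two weeks
cases = [100.0 * math.exp(0.2 * day) for day in range(14)]
rate = fit_growth_rate(cases)
```

A falling fitted rate across successive windows is one simple way to quantify the effect of the quarantine measures the abstract credits with containing initial transmission.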


2019 ◽  
Vol 11 (16) ◽  
pp. 1899 ◽  
Author(s):  
Katsuto Shimizu ◽  
Tetsuji Ota ◽  
Nobuya Mizoue

The accurate and timely detection of forest disturbances can provide valuable information for effective forest management. Combining dense time-series observations from optical and synthetic aperture radar satellites has the potential to improve large-area forest monitoring. For various disturbances, machine learning algorithms might accurately characterize forest changes; however, there is limited knowledge, especially on the use of machine learning algorithms to detect forest disturbances through hybrid approaches that combine different data sources. This study investigated the use of dense Landsat 8 and Sentinel-1 time-series data for detecting disturbances in tropical seasonal forests based on a machine learning algorithm. The random forest algorithm was used to predict the disturbance probability of each Landsat 8 and Sentinel-1 observation using variables derived from a harmonic regression model, which characterized seasonality and disturbance-related changes. The time-series disturbance probabilities of both sensors were then combined to detect forest disturbances in each pixel. The results showed that the combination of Landsat 8 and Sentinel-1 achieved an overall accuracy of 83.6% for disturbance detection, which was higher than disturbance detection using only Landsat 8 (78.3%) or Sentinel-1 (75.5%). Additionally, more timely disturbance detection was achieved by combining Landsat 8 and Sentinel-1. Small-scale disturbances caused by logging led to large omissions of disturbances; however, other disturbances were detected with relatively high accuracy. Although disturbance detection using only Sentinel-1 data had low accuracy in this study, the combination with Landsat 8 data improved detection accuracy, indicating the value of dense Landsat 8 and Sentinel-1 time-series data for timely and accurate disturbance detection.
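The per-pixel fusion step, combining the two sensors' disturbance probability series and then flagging a disturbance date, can be sketched as follows. The averaging rule and the three-consecutive-observation confirmation are illustrative choices, not the paper's exact decision rule.

```python
def fuse(p_optical, p_sar):
    """Average per-date disturbance probabilities from the two sensors."""
    return [(a + b) / 2.0 for a, b in zip(p_optical, p_sar)]

def detect_disturbance(probs, threshold=0.5, consecutive=3):
    """Return the index of the first date starting a run of `consecutive`
    fused probabilities at or above `threshold`, or None if no run exists."""
    run = 0
    for i, p in enumerate(probs):
        run = run + 1 if p >= threshold else 0
        if run == consecutive:
            return i - consecutive + 1
    return None

# demo: a single noisy optical spike is not enough; agreement across dates is
p_opt = [0.1, 0.9, 0.8, 0.9, 0.9]
p_sar = [0.2, 0.3, 0.8, 0.8, 0.7]
onset = detect_disturbance(fuse(p_opt, p_sar))
```

Requiring consecutive confirmations is what trades a little timeliness for fewer false alarms when one sensor is noisy, which is the balance the combined Landsat 8 and Sentinel-1 result reflects.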


The stock market has been one of the primary revenue streams for many people for years. The stock market is often incalculable and uncertain; therefore, predicting its ups and downs is an uphill task even for financial experts, who have been trying to tackle it without much success. It is now more feasible to predict stock markets due to rapid improvements in technology, which have led to better processing speeds and more accurate algorithms. It is also necessary to forswear the misconception that stock market prediction is only meant for people with expertise in finance; hence, an application can be developed to guide the user about the tempo of the stock market and the risk associated with it. Predicting prices in the stock market is a complicated task, and various techniques are used to solve the problem; this paper investigates some of these techniques and compares their accuracy. Forecasting time-series data is an important topic in economics, statistics, finance, and business. Of the many techniques for forecasting time-series data, such as the Autoregressive, Moving Average, and Autoregressive Integrated Moving Average (ARIMA) models, ARIMA has higher accuracy and precision than the other methods. With recent advancements in the computational power of processors and in machine learning and deep learning techniques, new algorithms can be developed to tackle the problem of predicting the stock market. This paper investigates one such machine learning algorithm for forecasting time-series data, Long Short-Term Memory (LSTM), and compares it with traditional algorithms such as ARIMA to determine whether the LSTM is superior to traditional methods for predicting the stock market.
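The autoregressive core that ARIMA builds on can be shown in a few lines. This AR(1) sketch omits the differencing and moving-average terms of a full ARIMA model, and the price series is contrived to follow the recursion exactly so the fit is clean.

```python
def ar1_forecast(series, steps):
    """Fit x_t ~ c + phi * x_{t-1} by least squares, then roll the recursion
    forward `steps` ahead. This is the AR piece of ARIMA, nothing more."""
    x, y = series[:-1], series[1:]
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    phi = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
          sum((a - xbar) ** 2 for a in x)
    c = ybar - phi * xbar
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts

# demo: a series that exactly follows x_{t+1} = 2 + 0.5 * x_t
closes = [10.0, 7.0, 5.5, 4.75, 4.375]
preds = ar1_forecast(closes, steps=2)
```

An LSTM replaces this fixed linear recursion with a learned nonlinear state update, which is precisely the flexibility the paper's ARIMA-versus-LSTM comparison probes.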

