A precipitation forecasting model using machine learning on big data in clouds environment

MAUSAM ◽  
2021 ◽  
Vol 72 (4) ◽  
pp. 781-790
Author(s):  
MAHBOOB ALAM ◽  
MOHD. AMJAD

Numerical weather prediction (NWP) has long been a difficult task for meteorologists. Atmospheric dynamics is extremely complicated to model, and chaos theory teaches us that the mathematical equations used to predict the weather are sensitive to initial conditions; that is, slightly perturbed initial conditions can yield very different forecasts. Over the years, meteorologists have developed a number of different mathematical models for atmospheric dynamics, each making slightly different assumptions and simplifications, and hence each yielding different forecasts. Each model has its strengths and weaknesses in different forecasting situations, so to improve performance, scientists now use an ensemble forecast consisting of several models run with different initial conditions. This ensemble method relies on statistical post-processing, usually linear regression. Recently, machine learning techniques have started to be applied to NWP. Studies of neural networks, logistic regression, and genetic algorithms have shown improvements over standard linear regression for precipitation prediction. Gagne et al. proposed using multiple machine learning techniques to improve precipitation forecasting. They used Breiman's random forest technique, which had previously been applied to other areas of meteorology, and verified its performance using Next Generation Weather Radar (NEXRAD) data. Rather than relying on an ensemble forecast, this paper discusses how machine learning techniques can be used to improve precipitation forecasts. It presents an approach for mapping precipitation data and aims to arrive at an optimal, data-driven machine learning method for predicting precipitation levels, thereby aiding farmers and benefiting the agricultural domain.
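As a rough illustration of the random-forest approach mentioned above, the following sketch trains a forest on tabular forecast features to predict observed precipitation. The file name and feature columns are hypothetical placeholders, not the data set used by Gagne et al. or in this paper.

```python
# Minimal sketch: a random forest trained on NWP ensemble features to
# predict precipitation. The CSV file and column names are hypothetical
# placeholders for radar-verified (e.g. NEXRAD-derived) training data.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("precip_features.csv")  # hypothetical table, one row per forecast point
features = ["ens_mean_precip", "ens_std_precip", "temp_850hPa", "rel_humidity"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["observed_precip"], test_size=0.2, random_state=0)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```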

2021 ◽  
Author(s):  
Natacha Galmiche ◽  
Nello Blaser ◽  
Morten Brun ◽  
Helwig Hauser ◽  
Thomas Spengler ◽  
...  

Probability distributions based on ensemble forecasts are commonly used to assess uncertainty in weather prediction. However, interpreting these distributions is not trivial, especially in the case of multimodality with distinct likely outcomes. The conventional summary uses the mean and standard deviation across ensemble members, which works well for unimodal, Gaussian-like distributions. In the case of multimodality, however, this summary is misleading and discards crucial information.

We aim to combine previously developed clustering algorithms from machine learning and topological data analysis to extract useful information, such as the number of clusters in an ensemble. Given the chaotic behaviour of the atmosphere, machine learning techniques can provide relevant results even if little or no a priori information about the data is available. In addition, topological methods that analyse the shape of the data can make the results explainable.

Given an ensemble of univariate time series, a graph is generated whose edges and vertices represent clusters of members, with additional information for each cluster such as the members belonging to it, its uncertainty, and its relevance according to the graph. In the case of multimodality, this approach provides relevant and quantitative information beyond the commonly used mean-and-standard-deviation summary and helps to further characterise predictability.
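A minimal sketch of the clustering idea: grouping ensemble members of a univariate forecast and estimating the number of distinct outcomes. Plain k-means with a silhouette criterion on a synthetic bimodal ensemble is used here as a simplified stand-in for the topological clustering described above; the data and the choice of method are illustrative assumptions, not the authors' approach.

```python
# Estimate the number of distinct outcomes in a synthetic bimodal ensemble
# (50 members x 40 time steps) by clustering members and scoring cluster counts.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
members = np.concatenate([
    rng.normal(loc=2.0, scale=0.3, size=(25, 40)),   # first likely outcome
    rng.normal(loc=-1.0, scale=0.3, size=(25, 40)),  # second likely outcome
])

best_k, best_score = 1, -1.0
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(members)
    score = silhouette_score(members, labels)
    if score > best_score:
        best_k, best_score = k, score

print("Estimated number of distinct outcomes:", best_k)  # expect 2
```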


Author(s):  
Jonathan Becker ◽  
Aveek Purohit ◽  
Zheng Sun

The USARSim group at NIST developed a simulated robot that operated in the Unreal Tournament 3 (UT3) gaming environment. They used a software PID controller to control the robot in UT3 worlds. Unfortunately, the PID controller did not work well, so NIST asked us to develop a better controller using machine learning techniques. In the process, we characterized the software PID controller and the robot's behavior in UT3 worlds. Using data collected from our simulations, we compared different machine learning techniques, including linear regression and reinforcement learning (RL). Finally, we implemented an RL-based controller in Matlab and ran it in the UT3 environment via a TCP/IP link between Matlab and UT3.
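For illustration only, the sketch below shows tabular Q-learning on a toy one-dimensional tracking task, the same family of reinforcement-learning controller described above. The discretization, reward, and toy plant are assumptions and do not reproduce the authors' Matlab/UT3 implementation.

```python
# Tabular Q-learning for a toy set-point tracking task: the state is a
# discretized tracking error and the actions nudge it left, hold, or right.
import numpy as np

n_states, n_actions = 21, 3
q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(state, action):
    """Toy plant: the action shifts the discretized error by -1, 0 or +1."""
    next_state = int(np.clip(state + (action - 1), 0, n_states - 1))
    reward = -abs(next_state - n_states // 2)  # zero error (center state) is best
    return next_state, reward

for episode in range(500):
    state = int(rng.integers(n_states))
    for _ in range(50):
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))   # explore
        else:
            action = int(np.argmax(q[state]))       # exploit
        next_state, reward = step(state, action)
        q[state, action] += alpha * (reward + gamma * q[next_state].max() - q[state, action])
        state = next_state
```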


2020 ◽  
Author(s):  
Pramod Kumar ◽  
Sameer Ambekar ◽  
Manish Kumar ◽  
Subarna Roy

This chapter aims to introduce the common methods and practices of statistical machine learning techniques. It covers the development and application of algorithms and the ways in which they learn from observed data by building models, which in turn can be used for prediction. Although one might assume that machine learning and statistics are not closely related, it is evident that they go hand in hand. We observe how statistical methods such as linear regression and classification are used in machine learning, and we look at implementation techniques for classification and regression. Although standard machine learning libraries implement a large number of algorithms, we examine how to tune these algorithms and which of their parameters and features affect performance, from the standpoint of statistical methods.
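As a concrete example of the interplay described above, the following sketch fits a standard regularized linear regression and tunes its main hyperparameter with cross-validated grid search; the data set and parameter grid are illustrative choices, not taken from the chapter.

```python
# Ridge regression (a statistical linear model) tuned with grid search:
# the regularization strength alpha is the parameter whose value affects
# performance, selected here by 5-fold cross-validation.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print("Best alpha:", search.best_params_["alpha"])
```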


2020 ◽  
Author(s):  
Nicola Bodini ◽  
Julie K. Lundquist ◽  
Mike Optis

Abstract. Current turbulence parameterizations in numerical weather prediction models at the mesoscale assume a local equilibrium between production and dissipation of turbulence. As this assumption does not hold at fine horizontal resolutions, improved ways to represent turbulent kinetic energy (TKE) dissipation rate (ε) are needed. Here, we use a 6-week data set of turbulence measurements from 184 sonic anemometers in complex terrain at the Perdigão field campaign to suggest improved representations of dissipation rate. First, we demonstrate that a widely used Mellor, Yamada, Nakanishi, and Niino (MYNN) parameterization of TKE dissipation rate leads to a large inaccuracy and bias in the representation of ε. Next, we assess the potential of machine-learning techniques to predict TKE dissipation rate from a set of atmospheric and terrain-related features. We train and test several machine-learning algorithms using the data at Perdigão, and we find that multivariate polynomial regressions and random forests can eliminate the bias MYNN currently shows in representing ε, while also reducing the average error by up to 30 %. Of all the variables included in the algorithms, TKE is the variable responsible for most of the variability of ε, and a strong positive correlation exists between the two. These results suggest further consideration of machine-learning techniques to enhance parameterizations of turbulence in numerical weather prediction models.
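A minimal sketch of the machine-learning step described above: a random-forest regression of ε on atmospheric and terrain-related features, with the fitted model's bias, mean absolute error, and feature importances inspected afterwards. The file name and column names are hypothetical placeholders, not the actual Perdigão data set.

```python
# Random-forest regression of TKE dissipation rate (epsilon) on atmospheric
# and terrain features; column names and the data file are placeholders.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("perdigao_sonic_features.csv")  # hypothetical file
features = ["tke", "wind_speed", "height_agl", "terrain_slope"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["epsilon"], test_size=0.2, random_state=0)

rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train, y_train)
pred = rf.predict(X_test)
bias = np.mean(pred - y_test)                   # near-zero bias is the goal
mae = np.mean(np.abs(pred - y_test))
print(f"bias = {bias:.3e}, MAE = {mae:.3e}")
print(dict(zip(features, rf.feature_importances_)))  # TKE expected to dominate
```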


2020 ◽  
Vol 13 (9) ◽  
pp. 4271-4285
Author(s):  
Nicola Bodini ◽  
Julie K. Lundquist ◽  
Mike Optis

Abstract. Current turbulence parameterizations in numerical weather prediction models at the mesoscale assume a local equilibrium between production and dissipation of turbulence. As this assumption does not hold at fine horizontal resolutions, improved ways to represent turbulent kinetic energy (TKE) dissipation rate (ϵ) are needed. Here, we use a 6-week data set of turbulence measurements from 184 sonic anemometers in complex terrain at the Perdigão field campaign to suggest improved representations of dissipation rate. First, we demonstrate that the widely used Mellor, Yamada, Nakanishi, and Niino (MYNN) parameterization of TKE dissipation rate leads to a large inaccuracy and bias in the representation of ϵ. Next, we assess the potential of machine-learning techniques to predict TKE dissipation rate from a set of atmospheric and terrain-related features. We train and test several machine-learning algorithms using the data at Perdigão, and we find that the models eliminate the bias MYNN currently shows in representing ϵ, while also reducing the average error by up to almost 40 %. Of all the variables included in the algorithms, TKE is the variable responsible for most of the variability of ϵ, and a strong positive correlation exists between the two. These results suggest further consideration of machine-learning techniques to enhance parameterizations of turbulence in numerical weather prediction models.


2021 ◽  
Author(s):  
Hrvoje Kalinić ◽  
Zvonimir Bilokapić ◽  
Frano Matić

In certain measurement endeavours the spatial resolution of the data is restricted, while in others the data have poor temporal resolution. Typical examples of these scenarios come from geoscience, where measurement stations are fixed and sparsely scattered in space, resulting in poor spatial resolution of the acquired data. We therefore ask whether it is possible to use a portion of the data as a proxy to estimate the rest of the data using different machine learning techniques. In this study, four supervised machine learning methods are trained on wind data from the Adriatic Sea and used to reconstruct the missing data. The wind vector components at 10 m height are taken from the ERA5 reanalysis model for the period 1981 to 2017, sampled every 6 hours. Data from the northern part of the Adriatic Sea were used to estimate the wind in the southern part of the Adriatic. The machine learning models utilized for this task were linear regression, K-nearest neighbours, decision trees, and a neural network. The difference between the true and estimated wind values in the southern Adriatic was used as a measure of reconstruction quality. The results show that all four models reconstruct the data a few hundred kilometres away with an average amplitude error below 1 m/s. Linear regression, K-nearest neighbours, decision trees, and the neural network show average amplitude reconstruction errors of 0.52, 0.91, 0.76, and 0.73 m/s, with standard deviations of 1.00, 1.42, 1.23, and 1.17 m/s, respectively. This work has been supported by the Croatian Science Foundation under project UIP-2019-04-1737.
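The comparison described above can be sketched as follows, with synthetic arrays standing in for the ERA5 wind components of the northern (inputs) and southern (targets) Adriatic; the synthetic data and model settings are assumptions for illustration, not the study's configuration.

```python
# Four supervised models trained to map wind components in one region to
# those in another; amplitude error is the per-sample Euclidean norm of the
# difference between predicted and true (u, v) components.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))                                      # "northern" u/v samples
y = X @ rng.normal(size=(20, 2)) + 0.5 * rng.normal(size=(5000, 2))  # "southern" u/v targets

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
models = {
    "linear regression": LinearRegression(),
    "K-nearest neighbours": KNeighborsRegressor(n_neighbors=5),
    "decision tree": DecisionTreeRegressor(max_depth=10, random_state=0),
    "neural network": MLPRegressor(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    err = np.linalg.norm(model.predict(X_te) - y_te, axis=1)  # amplitude error
    print(f"{name}: mean {err.mean():.2f}, std {err.std():.2f}")
```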


2020 ◽  
Vol 12 (5) ◽  
pp. 854-864
Author(s):  
Mehdi Gholami Rostam ◽  
Seyyed Javad Sadatinejad ◽  
Arash Malekian

Energies ◽  
2018 ◽  
Vol 11 (8) ◽  
pp. 1975 ◽  
Author(s):  
Wei Dong ◽  
Qiang Yang ◽  
Xinli Fang

Accurate generation prediction at multiple time-steps is of paramount importance for the reliable and economical operation of wind farms. This study proposed a novel algorithmic solution that combines various machine learning techniques in a hybrid manner, including phase space reconstruction (PSR), input variable selection (IVS), K-means clustering, and an adaptive neuro-fuzzy inference system (ANFIS). The PSR technique transforms the historical time series into a set of phase-space variables, which are combined with numerical weather prediction (NWP) data to prepare candidate inputs. A filtering approach based on the minimal redundancy maximal relevance (mRMR) criterion is used to automatically select the optimal input variables for multi-step-ahead prediction. The input instances are then divided into subsets using K-means clustering to train the ANFIS, whose parameters are further optimized with a particle swarm optimization (PSO) algorithm to improve prediction performance. The proposed solution is extensively evaluated through case studies of two realistic wind farms, and the numerical results confirm its effectiveness and improved prediction accuracy compared to benchmark solutions.
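Two building blocks of the hybrid scheme, phase space reconstruction and K-means clustering of the embedded vectors, can be sketched as below. The synthetic series, embedding dimension, and delay are illustrative assumptions; the mRMR selection, ANFIS model, and PSO tuning are not shown.

```python
# Phase space reconstruction (time-delay embedding) of a wind power series,
# followed by K-means clustering of the embedded vectors into training subsets.
import numpy as np
from sklearn.cluster import KMeans

def phase_space_reconstruct(series, dim=4, delay=6):
    """Time-delay embedding: row t is [x(t), x(t+delay), ..., x(t+(dim-1)*delay)]."""
    n = len(series) - (dim - 1) * delay
    return np.column_stack([series[i * delay: i * delay + n] for i in range(dim)])

rng = np.random.default_rng(0)
power = np.sin(np.linspace(0, 60, 2000)) + 0.1 * rng.normal(size=2000)  # synthetic series

embedded = phase_space_reconstruct(power, dim=4, delay=6)
clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(embedded)
print("cluster sizes:", np.bincount(clusters))
```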

