Improving Forecast Accuracy with an Auto Machine Learning Post-Correction Technique in Northern Xinjiang

Reliable forecasts of temperature and relative humidity are critical for taking timely measures to avoid potential damage and losses. An operational meteorological forecast model based on the Weather Research and Forecasting (WRF) model has been built in Xinjiang. Numerical forecasts usually carry significant uncertainties and errors because the models themselves are imperfect. In this study, a straightforward automated machine learning (AutoML) approach was developed to post-process the raw forecasts of the WRF model. The method was implemented and evaluated on forecasts for 13 stations in northern Xinjiang. The post-processed temperature forecasts improved significantly on the raw forecasts, with the average RMSE across the 13 stations decreasing from 3.24 °C to 2.34 °C, a reduction of 28%. For relative humidity, the mean RMSE across the 13 stations decreased from 19.54% to 11.54%, a reduction of 41%. Biases also decreased significantly, with the average ME reduced from around 2 °C to ~0.33 °C for temperature and improved from −15.6% to ~0% for relative humidity. Moreover, forecast performance values after post-correction were much closer to each other than the raw forecast performance values, improving forecast applicability at regional scales.
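The percentage reductions quoted above follow directly from the reported RMSE values. The sketch below is only an illustration, not the authors' code: it shows how RMSE and mean error (ME) are conventionally computed and reproduces the 28% and 41% figures. The function names are hypothetical, and ME is assumed here to mean forecast minus observation.

```python
import math

def rmse(forecast, observed):
    """Root-mean-square error between paired forecast and observed values."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast))

def mean_error(forecast, observed):
    """Mean error (bias); assumed sign convention: forecast minus observation."""
    return sum(f - o for f, o in zip(forecast, observed)) / len(forecast)

def percent_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

# Station-average RMSE values quoted in the abstract
temp_drop = percent_reduction(3.24, 2.34)    # temperature, degrees C
rh_drop = percent_reduction(19.54, 11.54)    # relative humidity, percent

print(f"temperature RMSE reduction: {temp_drop:.1f}%")
print(f"relative humidity RMSE reduction: {rh_drop:.1f}%")
```

Rounded to whole percentages, the two results match the 28% and 41% stated in the abstract.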

The Value of Meteorological Data in Optimizing the Pattern of Physical Load—A Forecast Model of Rowing Pacing Strategy

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph19010320 ◽

2021 ◽

Vol 19 (1) ◽

pp. 320

Author(s):

Tian Yan ◽

Xiaodong Zhu ◽

Xuesong Ding ◽

Liming Chen

Keyword(s):

Meteorological Data ◽

Great Influence ◽

Forecast Accuracy ◽

Forecast Model ◽

Physical Load ◽

Forecast Performance ◽

Outdoor Sports ◽

Pacing Strategy ◽

Different Temperatures ◽

Meteorological Features

Knowing the conditions of the competition venue is a prerequisite for athletes to optimize their patterns of physical load. Improving the forecast accuracy of venue conditions is therefore an urgent task in competitive sports. This paper identifies the meteorological features that strongly influence outdoor events such as rowing and examines their effect on pacing strategy. We used meteorological data for Tokyo from 1979 to 2020 to forecast the weather during the Tokyo 2021 Olympic Games, analyzed athletes' pacing choices under different temperatures, humidity levels and sports levels, and then recommended the best pacing strategy for the rowing teams of China. The proposed model addresses the absence of meteorological features in assessments of the venue environment and provides an algorithmic basis for improving the forecast performance of pacing strategies in outdoor sports.

Forecasting US movies box office performances in Turkey using machine learning algorithms

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189120 ◽

2020 ◽

Vol 39 (5) ◽

pp. 6579-6590

Author(s):

Sandy Çağlıyor ◽

Başar Öztayşi ◽

Selime Sezgin

Keyword(s):

Machine Learning ◽

Global Economy ◽

Learning Algorithms ◽

Forecast Model ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

High Stakes ◽

Box Office ◽

Industry Forecast ◽

The Impact

The motion picture industry is one of the largest industries worldwide and has significant importance in the global economy. Considering the high stakes and high risks in the industry, forecast models and decision support systems are gaining importance. Several attempts have been made to estimate the theatrical performance of a movie before or at the early stages of its release. Nevertheless, these models are mostly used for predicting domestic performance, and the industry still struggles to predict box office performance in overseas markets. In this study, the aim is to design a forecast model using different machine learning algorithms to estimate the theatrical success of US movies in Turkey. A dataset of 1559 movies is constructed from various sources. First, the independent variables are grouped as pre-release, distributor type, and international distribution based on their characteristics. The number of attendances is discretized into three classes. Four popular machine learning algorithms (artificial neural networks, decision tree regression, gradient boosting trees, and random forest) are employed, and the impact of each group is observed by comparing the performance of the models. Then the number of target classes is increased to five and eight, and the results are compared with previously developed models in the literature.

The Development of a Quantitative Precipitation Forecast Correction Technique Based on Machine Learning for Hydrological Applications

Atmosphere ◽

10.3390/atmos11010111 ◽

2020 ◽

Vol 11 (1) ◽

pp. 111 ◽

Cited By ~ 2

Author(s):

Chul-Min Ko ◽

Yeong Yun Jeong ◽

Young-Mi Lee ◽

Byung-Sik Kim

Keyword(s):

Machine Learning ◽

Heavy Rainfall ◽

Extreme Rainfall ◽

Machine Learning Techniques ◽

Precipitation Forecast ◽

Machine Learning Technique ◽

Rainfall Forecast ◽

Quantitative Precipitation Forecast ◽

Correction Technique ◽

Learning Technique

This study aimed to enhance the accuracy of extreme rainfall forecast, using a machine learning technique for forecasting hydrological impact. In this study, machine learning with XGBoost technique was applied for correcting the quantitative precipitation forecast (QPF) provided by the Korea Meteorological Administration (KMA) to develop a hydrological quantitative precipitation forecast (HQPF) for flood inundation modeling. The performance of machine learning techniques for HQPF production was evaluated with a focus on two cases: one for heavy rainfall events in Seoul and the other for heavy rainfall accompanied by Typhoon Kong-rey (1825). This study calculated the well-known statistical metrics to compare the error derived from QPF-based rainfall and HQPF-based rainfall against the observational data from the four sites. For the heavy rainfall case in Seoul, the mean absolute errors (MAE) of the four sites, i.e., Nowon, Jungnang, Dobong, and Gangnam, were 18.6 mm/3 h, 19.4 mm/3 h, 48.7 mm/3 h, and 19.1 mm/3 h for QPF and 13.6 mm/3 h, 14.2 mm/3 h, 33.3 mm/3 h, and 12.0 mm/3 h for HQPF, respectively. These results clearly indicate that the machine learning technique is able to improve the forecasting performance for localized rainfall. In addition, the HQPF-based rainfall shows better performance in capturing the peak rainfall amount and spatial pattern. Therefore, it is considered that the HQPF can be helpful to improve the accuracy of intense rainfall forecast, which is subsequently beneficial for forecasting floods and their hydrological impacts.
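As a worked check on the Seoul case, the per-site MAE values quoted above can be turned into relative reductions. The sketch below uses only the numbers given in the abstract; the derived percentages are not stated in the source, and the helper name is hypothetical.

```python
# MAE values (mm per 3 h) quoted in the abstract for the Seoul heavy rainfall case
sites = ["Nowon", "Jungnang", "Dobong", "Gangnam"]
mae_qpf = [18.6, 19.4, 48.7, 19.1]    # raw KMA forecast (QPF)
mae_hqpf = [13.6, 14.2, 33.3, 12.0]   # machine-learning-corrected forecast (HQPF)

def relative_reduction(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0

# Pair each site with its QPF and HQPF error and compute the improvement
reductions = {
    site: relative_reduction(qpf, hqpf)
    for site, qpf, hqpf in zip(sites, mae_qpf, mae_hqpf)
}

for site, pct in reductions.items():
    print(f"{site}: MAE reduced by {pct:.1f}%")
```

Every site shows a lower MAE after correction, which is consistent with the abstract's conclusion for localized rainfall.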

Prediction of Pest Insect Appearance Using Sensors and Machine Learning

Sensors ◽

10.3390/s21144846 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4846

Author(s):

Dušan Marković ◽

Dejan Vujičić ◽

Snežana Tanasković ◽

Borislav Đorđević ◽

Siniša Ranđić ◽

...

Keyword(s):

Machine Learning ◽

Relative Humidity ◽

Weather Conditions ◽

Daily Basis ◽

Machine Learning Algorithms ◽

Lower Percentage ◽

Timely Manner ◽

Proposed Model ◽

Set Up ◽

Accuracy Of Prediction

The appearance of pest insects can lead to a loss in yield if farmers do not respond in a timely manner to suppress their spread. Occurrences and numbers of insects can be monitored through insect traps, which requires regularly visiting the traps and checking their condition. A more efficient way is to set up sensor devices with a camera at the traps that photograph the traps and forward the images to the Internet, where the appearance of pest insects is predicted by image analysis. Weather conditions, in particular temperature and relative humidity, affect the appearance of some pests, such as Helicoverpa armigera. This paper presents a machine learning model that can predict the appearance of insects during a season on a daily basis, taking into account the air temperature and relative humidity. Several machine learning algorithms for classification were applied, and their accuracy in predicting insect occurrence is presented (up to 76.5%). Since the data used for testing were given in chronological order according to the days on which the measurements were performed, the existing model was expanded to take into account periods of three and five days. The extended method showed better prediction accuracy and a lower percentage of false detections. For a period of five days, the accuracy of the affected detections was 86.3%, while the percentage of false detections was 11%. The proposed machine learning model can help farmers detect the occurrence of pests and save the time and resources needed to check the fields.

Developing Relative Humidity and Temperature Corrections for Low-Cost Sensors Using Machine Learning

Sensors ◽

10.3390/s21103338 ◽

2021 ◽

Vol 21 (10) ◽

pp. 3338

Author(s):

Ivan Vajs ◽

Dejan Drajic ◽

Nenad Gligoric ◽

Ilija Radovanovic ◽

Ivan Popovic

Keyword(s):

Machine Learning ◽

Air Quality ◽

Relative Humidity ◽

Low Cost ◽

Quality Monitoring ◽

Air Quality Monitoring ◽

Lower Accuracy ◽

Wide Range ◽

The Impact ◽

Monitoring Stations

Existing government air quality monitoring networks consist of static measurement stations, which are highly reliable and accurately measure a wide range of air pollutants, but they are very large, expensive and require significant amounts of maintenance. As a promising solution, low-cost sensors are being introduced as complementary air quality monitoring stations. These sensors are, however, not reliable because of their lower accuracy, short life cycle and corresponding calibration issues. Recent studies have shown that low-cost sensors are affected by relative humidity and temperature. In this paper, we use machine learning to further improve the calibration algorithms, with the aim of increasing measurement accuracy by accounting for the impact of temperature and humidity on the readings. A detailed comparative analysis of linear regression, artificial neural network and random forest algorithms is presented, analyzing their performance on measurements of CO, NO2 and PM10 particles. The results are promising, with an achieved R² of 0.93–0.97, 0.82–0.94 and 0.73–0.89 for the respective pollutants, depending on the observed period of the year. A comprehensive analysis is provided, with recommendations on how low-cost sensors could be used as complementary monitoring stations alongside the reference ones to increase spatial and temporal measurement resolution.

Developing a dengue forecast model using machine learning: A case study in China

PLoS Neglected Tropical Diseases ◽

10.1371/journal.pntd.0005973 ◽

2017 ◽

Vol 11 (10) ◽

pp. e0005973 ◽

Cited By ~ 57

Author(s):

Pi Guo ◽

Tao Liu ◽

Qin Zhang ◽

Li Wang ◽

Jianpeng Xiao ◽

...

Keyword(s):

Machine Learning ◽

Forecast Model

Intercomparison of Spatial Forecast Verification Methods: Identifying Skillful Spatial Scales Using the Fractions Skill Score

Weather and Forecasting ◽

10.1175/2009waf2222260.1 ◽

2010 ◽

Vol 25 (1) ◽

pp. 343-354 ◽

Cited By ~ 100

Author(s):

Marion Mittermaier ◽

Nigel Roberts

Keyword(s):

Spatial Scales ◽

WRF Model ◽

Skill Score ◽

Area Ratio ◽

Forecast Verification ◽

Care Needs ◽

Forecast Performance ◽

Verification Methods ◽

Upper Level ◽

Formed Part

The fractions skill score (FSS) was one of the measures that formed part of the Intercomparison of Spatial Forecast Verification Methods project. The FSS was used to assess a common dataset that consisted of real and perturbed Weather Research and Forecasting (WRF) model precipitation forecasts, as well as geometric cases. These datasets are all based on the NCEP 240 grid, which translates to approximately 4-km resolution over the contiguous United States. The geometric cases showed that the FSS can provide a truthful assessment of displacement errors and forecast skill. In addition, the FSS can be used to determine the scale at which an acceptable level of skill is reached and this usage is perhaps more helpful than interpreting the actual FSS value. This spatial-scale approach is becoming more popular for monitoring operational forecast performance. The study also shows how the FSS responds to forecast bias. A more biased forecast always gives lower FSS values at large scales and usually at smaller scales. It is possible, however, for a more biased forecast to give a higher score at smaller scales, when additional rain overlaps the observed rain. However, given a sufficiently large sample of forecasts, a more biased forecast system will score lower. The use of percentile thresholds can remove the impacts of the bias. When the proportion of the domain that is “wet” (the wet-area ratio) is small, subtle differences introduced through near-threshold misses can lead to large changes in FSS magnitude in individual cases (primarily because the bias is changed). Reliable statistics for small wet-area ratios require a larger sample of forecasts. Care needs to be taken in the choice of verification domain. For high-resolution models, the domain should be large enough to encompass the length scale of the typical mesoscale forcing (e.g., upper-level troughs or squall lines). If the domain is too large, the wet-area ratios will always be small. If the domain is too small, fluctuations in the wet-area ratio can be large and larger spatial errors may be missed. The FSS is a good measure of the spatial accuracy of precipitation forecasts. Different methods are needed to determine other patterns of behavior.
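To make the scale dependence described above concrete, here is a minimal NumPy sketch of the FSS in its commonly used form: one minus the mean squared difference between forecast and observed neighborhood fractions, divided by the sum of their mean squares. It is an illustration under stated assumptions, not the code used in the study. The function names, the zero padding at the domain edges, the odd square window, and the toy 20 × 20 fields are all choices made for this example.

```python
import numpy as np

def neighborhood_fractions(binary, n):
    """Fraction of 'wet' cells in the n x n window centred on each grid cell.

    Assumes n is odd; cells outside the domain count as dry (zero padding).
    """
    pad = n // 2
    padded = np.pad(binary.astype(float), pad, mode="constant")
    # Summed-area table with a leading row and column of zeros
    sat = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    sat[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    rows, cols = binary.shape
    window_sums = (sat[n:n + rows, n:n + cols] - sat[:rows, n:n + cols]
                   - sat[n:n + rows, :cols] + sat[:rows, :cols])
    return window_sums / (n * n)

def fractions_skill_score(forecast, observed, threshold, n):
    """FSS for one threshold and one neighborhood width n (in grid cells)."""
    pf = neighborhood_fractions(forecast >= threshold, n)
    po = neighborhood_fractions(observed >= threshold, n)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)  # largest possible MSE
    return 1.0 - mse / mse_ref if mse_ref > 0 else float("nan")

# Toy geometric case: a 5 x 5 rain area forecast 3 grid cells too far east
observed = np.zeros((20, 20))
observed[5:10, 5:10] = 5.0
forecast = np.zeros((20, 20))
forecast[5:10, 8:13] = 5.0

fss_grid_scale = fractions_skill_score(forecast, observed, threshold=1.0, n=1)
fss_neighborhood = fractions_skill_score(forecast, observed, threshold=1.0, n=11)
print(fss_grid_scale, fss_neighborhood)
```

In this toy displacement case the score at the grid scale is lower than the score with an 11-cell neighborhood, which illustrates how the FSS can be used to find the scale at which a forecast becomes skillful.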

Complex relationship between seasonal streamflow forecast skill and value in reservoir operations

Hydrology and Earth System Sciences ◽

10.5194/hess-21-4841-2017 ◽

2017 ◽

Vol 21 (9) ◽

pp. 4841-4859 ◽

Cited By ~ 38

Author(s):

Sean W. D. Turner ◽

James C. Bennett ◽

David E. Robertson ◽

Stefano Galelli

Keyword(s):

System Performance ◽

Research Effort ◽

Forecast Accuracy ◽

Strong Relationship ◽

Forecast Skill ◽

Severe Drought ◽

Forecast Performance ◽

Performance Improvements ◽

Sensitivity Assessment ◽

Seasonal Streamflow

Considerable research effort has recently been directed at improving and operationalising ensemble seasonal streamflow forecasts. Whilst this creates new opportunities for improving the performance of water resources systems, there may also be associated risks. Here, we explore these potential risks by examining the sensitivity of forecast value (improvement in system performance brought about by adopting forecasts) to changes in the forecast skill for a range of hypothetical reservoir designs with contrasting operating objectives. Forecast-informed operations are simulated using rolling horizon, adaptive control and then benchmarked against optimised control rules to assess performance improvements. Results show that there exists a strong relationship between forecast skill and value for systems operated to maintain a target water level. But this relationship breaks down when the reservoir is operated to satisfy a target demand for water; good forecast accuracy does not necessarily translate into performance improvement. We show that the primary cause of this behaviour is the buffering role played by storage in water supply reservoirs, which renders the forecast superfluous for long periods of the operation. System performance depends primarily on forecast accuracy when critical decisions are made, namely during severe drought. As it is not possible to know in advance if a forecast will perform well at such moments, we advocate measuring the consistency of forecast performance, through bootstrap resampling, to indicate potential usefulness in storage operations. Our results highlight the need for sensitivity assessment in value-of-forecast studies involving reservoirs with supply objectives.

Mathematical Development and Evaluation of Forecasting Models for Accuracy of Inflation in Developing Countries: A Case of Vietnam

Discrete Dynamics in Nature and Society ◽

10.1155/2015/858157 ◽

2015 ◽

Vol 2015 ◽

pp. 1-14 ◽

Cited By ~ 9

Author(s):

Nhu-Ty Nguyen ◽

Thanh-Tuyen Tran

Keyword(s):

Raw Materials ◽

ARIMA Model ◽

Consumer Price Index ◽

Forecast Accuracy ◽

Complex Problem ◽

Forecast Performance ◽

Forecasting Models ◽

Out Of Sample ◽

Forecasting Inflation ◽

Using Data

Inflation is a key element of a national economy, and it is also a prominent and important issue influencing the whole economy in terms of marketing. This is a complex problem requiring a large investment of time and wisdom to attain positive results. Thus, appropriate tools for forecasting inflation variables are crucial for policy making. In this study, both clarified value calculation and a genetic algorithm for finding the optimal parameters are adopted simultaneously to construct improved models (ARIMA, GM(1,1), Verhulst, DGM(1,1), and DGM(2,1)) using Vietnamese inflation data from January 2005 to November 2013. The MAPE, MSE, RMSE, and MAD are the four criteria by which the results of the various forecasting models are compared. Moreover, Friedman and Wilcoxon tests are applied to see whether differences exist. Both in-sample and out-of-sample forecast performance results show that the ARIMA model forecasts Raw Materials Price (RMP) and Gold Price (GP) with high accuracy, whereas GM(1,1) and DGM(1,1) are suitable for forecasting the Consumer Price Index (CPI). Therefore, ARIMA, GM(1,1), and DGM(1,1) provide adequate forecast accuracy and are suitable for modeling and forecasting inflation in the case of Vietnam.

Numerical simulation of rainfall with assimilation of conventional and GPS observations over north of Iran

Annals of Geophysics ◽

10.4401/ag-6919 ◽

2016 ◽

Vol 59 (3) ◽

Author(s):

Mohammad Ali Sharifi ◽

Majid Azadi ◽

Ali Sam Khaniani

Keyword(s):

Relative Humidity ◽

Caspian Sea ◽

Model Simulation ◽

Global Analysis ◽

Three Dimensional ◽

WRF Model ◽

Precipitable Water ◽

Absolute Error ◽

The Caspian Sea ◽

North Of Iran

In this work, the effect of assimilating synoptic, radiosonde and ground-based GPS precipitable water vapor (PWV) data on the short-term prediction of precipitation, vertical relative humidity and PWV fields over the north of Iran has been investigated. We selected two rainfall events (February 1, 2014, and September 17, 2014) caused by synoptic systems affecting the southern coasts of the Caspian Sea. These systems are often associated with a shallow, cold high-pressure system located over Russia that extends towards the southern Caspian Sea. The three-dimensional variational (3DVAR) data assimilation system of the Weather Research and Forecasting (WRF) model is used in the two rainfall cases. In each case, three numerical experiments, namely CTRL, CONVDA and GPSCONVDA, are performed. The CTRL experiment uses the global analysis as the initial and boundary conditions of the model. In the second experiment, CONVDA, surface and radiosonde observations are assimilated into the model. Finally, the GPSCONVDA experiment uses the GPS PWV data in the assimilation process in addition to the conventional observations. In the CONVDA experiment, the mean absolute error (MAE) of the accumulated precipitation is reduced by about 5 and 13 percent in the 24 h model simulations of the February and September cases, respectively, when compared to CTRL. The results in both cases also suggest that the assimilation of GPS data has the greatest impact on model PWV simulations, with a maximum root mean square error (RMSE) reduction of 0.7 mm. In the GPSCONVDA experiment, comparison of the vertical profiles of 12 h simulated relative humidity with the corresponding radiosonde observations shows a slight improvement in the lower levels.