Forecasting sub-resolution temporal variability of irradiance

Author(s):  
Frank Kreuwel ◽  
Chiel van Heerwaarden

Variability of solar irradiance is an important factor in the large-scale integration of solar photovoltaic (PV) systems into the electricity grid. Calculations of irradiance are computationally expensive, leaving operational meso-scale forecasting models struggling to achieve accurate results. Moreover, such models deliver outputs at a temporal resolution on the order of hours, whereas from a grid-integration point of view, minute-to-minute variability is a major concern. In previous work, we found that absolute power peaks on the order of seconds are up to 18% higher than at 15-minute resolution for irradiance, and upwards of 22% higher for household PV systems. Moreover, these maximum peaks in output power are observed solely under mixed-cloud conditions, for which the greatest variability is also found. In this work we present a machine-learning model that can forecast sub-resolution variability of irradiance, based on standard meso-scale outputs of the HARMONIE model of the Royal Netherlands Meteorological Institute (KNMI). For training and validation, irradiance measurements obtained at a 1-second interval at the Baseline Surface Radiation Network (BSRN) site of Cabauw are used. A tree-based model was employed, for which the optimum members were constructed using extreme gradient boosting. We explore the dominant features of the model and link the machine-learned relations to meteorological processes and dynamics. This research was carried out in collaboration with the distribution grid operator Alliander.
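The comparison between second-scale peaks and 15-minute resolution can be illustrated with a minimal sketch. It assumes that "15-minute resolution" means non-overlapping block averages of the 1-second samples; the function names and the synthetic irradiance trace are hypothetical and are not taken from the authors' work.

```python
from statistics import mean


def block_means(samples, block_size):
    """Average consecutive, non-overlapping blocks of `block_size` samples."""
    return [
        mean(samples[i:i + block_size])
        for i in range(0, len(samples) - block_size + 1, block_size)
    ]


def peak_excess(samples, block_size):
    """Fraction by which the raw peak exceeds the peak of the block means."""
    coarse_peak = max(block_means(samples, block_size))
    return max(samples) / coarse_peak - 1.0


# Hypothetical 1-second irradiance trace in W/m^2 covering 30 minutes:
# a steady first quarter-hour, then a quarter-hour that is half cloudy
# and half bright (made-up numbers for illustration only).
BLOCK = 15 * 60  # samples per 15-minute block at 1 Hz
series = [400.0] * BLOCK + [400.0] * 450 + [800.0] * 450

coarse = block_means(series, BLOCK)
excess = peak_excess(series, BLOCK)
print(coarse, round(excess, 3))
```

In this made-up trace the averaged series never exceeds 600 W/m^2 while the raw series reaches 800 W/m^2, which is the kind of peak that coarse model output cannot represent directly.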

2019 ◽  
Vol 11 (12) ◽  
pp. 1505 ◽  
Author(s):  
Heng Zhang ◽  
Anwar Eziz ◽  
Jian Xiao ◽  
Shengli Tao ◽  
Shaopeng Wang ◽  
...  

Accurate mapping of vegetation is a prerequisite for conserving, managing, and sustainably using vegetation resources, especially under intensive human activity and accelerating global change. However, it is still challenging to produce high-resolution multiclass vegetation maps with high accuracy, because traditional mapping techniques cannot distinguish mosaic vegetation classes with subtle differences and because fieldwork data are scarce. This study created a workflow that adopts a promising classifier, extreme gradient boosting (XGBoost), to produce accurate vegetation maps of two strikingly different cases (the Dzungarian Basin in China and New Zealand) based on extensive features and abundant vegetation data. For the Dzungarian Basin, a vegetation map with seven vegetation types, 17 subtypes, and 43 associations was produced with overall accuracies of 0.907, 0.801, and 0.748, respectively. For New Zealand, a map of 10 habitats and a map of 41 vegetation classes were produced with overall accuracies of 0.946 and 0.703, respectively. The workflow, which incorporates simplified field survey procedures, outperformed conventional field survey and remote sensing based methods in terms of accuracy and efficiency. In addition, it opens the possibility of building large-scale, high-resolution, and timely vegetation monitoring platforms for most terrestrial ecosystems worldwide with the aid of Google Earth Engine and citizen science programs.


Energies ◽  
2019 ◽  
Vol 12 (19) ◽  
pp. 3798 ◽  
Author(s):  
Mansouri ◽  
Lashab ◽  
Sera ◽  
Guerrero ◽  
Cherif

Renewable energy systems (RESs), such as photovoltaic (PV) systems, are providing increasingly larger shares of power generation. PV systems are the fastest growing generation technology today, with an increase of almost 30% since 2015, reaching 509.3 GWp of worldwide capacity by the end of 2018 and predicted to reach 1000 GWp by 2022. Due to the fluctuating and intermittent nature of PV systems, their large-scale integration into the grid poses major challenges. This paper provides a review of the technical challenges, such as frequency disturbances and voltage limit violations, related to the stability issues caused by large-scale and intensive PV system penetration into the power network. Possible solutions that mitigate the effect of large-scale PV system integration on the grid are also reviewed. Finally, power system stability issues when faults occur are outlined, together with their respective achievable solutions.


2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Ke Li ◽  
Sijia Zhang ◽  
Di Yan ◽  
Yannan Bin ◽  
Junfeng Xia

Background. Identification of hot spots in protein-DNA interfaces provides crucial information for research on protein-DNA interaction and drug design. As experimental methods for determining hot spots are time-consuming, labor-intensive and expensive, there is a need for a reliable computational method to predict hot spots on a large scale. Results. Here, we propose a new method named sxPDH, based on supervised isometric feature mapping (S-ISOMAP) and extreme gradient boosting (XGBoost), to predict hot spots in protein-DNA complexes. We obtained 114 features from a combination of protein sequence, structure, network and solvent accessibility information, and systematically assessed various feature selection methods and feature dimensionality reduction methods based on manifold learning. The results show that the S-ISOMAP method is superior to other feature selection or manifold learning methods. XGBoost was then used to develop the hot spot prediction model sxPDH based on the three dimensionality-reduced features obtained from S-ISOMAP. Conclusion. Our method sxPDH boosts prediction performance using S-ISOMAP and XGBoost. The AUC of the model is 0.773, and the F1 score is 0.713. Experimental results on a benchmark dataset indicate that sxPDH can achieve generally better performance in predicting hot spots compared to the state-of-the-art methods.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Mingyue Xue ◽  
Yinxia Su ◽  
Chen Li ◽  
Shuxia Wang ◽  
Hua Yao

Background. An estimated 425 million people globally have diabetes, accounting for 12% of the world’s health expenditures, and the number continues to grow, placing a huge burden on the healthcare system, especially in remote, underserved areas. Methods. A total of 584,168 adult subjects who had participated in the national physical examination were enrolled in this study. The risk factors for type II diabetes mellitus (T2DM) were identified by p values and odds ratios, using logistic regression (LR) based on variables from physical measurement and a questionnaire. Combined with the risk factors selected by LR, we used a decision tree, a random forest, AdaBoost with a decision tree (AdaBoost), and an extreme gradient boosting decision tree (XGBoost) to identify individuals with T2DM, compared the performance of the four machine learning classifiers, and used the best-performing classifier to output the variable importance scores for T2DM. Results. The results indicated that XGBoost had the best performance (accuracy=0.906, precision=0.910, recall=0.902, F1=0.906, and AUC=0.968). The variable importance scores in XGBoost showed that BMI was the most significant feature, followed by age, waist circumference, systolic pressure, ethnicity, smoking amount, fatty liver, hypertension, physical activity, drinking status, dietary ratio (meat to vegetables), drink amount, smoking status, and diet habit (oil loving). Conclusions. We proposed a classifier based on LR-XGBoost which uses fourteen easily obtained, noninvasive patient variables as predictors to identify potential cases of T2DM. The classifier can accurately screen for the risk of diabetes in the early phase, and the variable importance scores give a clue for preventing the occurrence of diabetes.
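The reported F1 value can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below is only an arithmetic consistency check on the figures quoted in the abstract; the function name `f1_score` is an illustrative choice and is not the authors' code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract for XGBoost.
reported_precision = 0.910
reported_recall = 0.902

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))  # 0.906
```

The result rounds to the F1 of 0.906 stated in the abstract, so the three figures are mutually consistent.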


2021 ◽  
Vol 10 (10) ◽  
pp. 680
Author(s):  
Annan Yang ◽  
Chunmei Wang ◽  
Guowei Pang ◽  
Yongqing Long ◽  
Lei Wang ◽  
...  

Gully erosion is the most severe type of water erosion and is a major land degradation process. The efficiency and interpretability of gully erosion susceptibility mapping (GESM) remain a challenge, especially in complex terrain areas. In this study, a WoE-MLC model, which combines machine learning classification algorithms with the statistical weight of evidence (WoE) model, was used to address this problem in the Loess Plateau. The three machine learning (ML) algorithms utilized in this research were random forest (RF), gradient boosted decision trees (GBDT), and extreme gradient boosting (XGBoost). The results showed that: (1) GESM was well predicted by both machine learning regression models and WoE-MLC models, with area under the curve (AUC) values both greater than 0.92, and the latter was more computationally efficient and interpretable; (2) the XGBoost algorithm was more efficient in GESM than the other two algorithms, with the strongest generalization ability and best performance in avoiding overfitting (averaged AUC = 0.947), followed by the RF algorithm (averaged AUC = 0.944) and the GBDT algorithm (averaged AUC = 0.938); and (3) slope gradient, land use, and altitude were the main factors for GESM. This study may provide a possible method for gully erosion susceptibility mapping at large scale.


2018 ◽  
Vol 60 (1) ◽  
pp. 104-110
Author(s):  
Marius Paulescu ◽  
Nicoleta Stefu ◽  
Ciprian Dughir ◽  
Robert Blaga ◽  
Andreea Sabadus ◽  
...  

Forecasting solar energy production is a key issue in the large-scale integration of photovoltaic plants into the existing electricity grid. This paper reports on the research progress in forecasting solar energy production at the West University of Timisoara, Romania. Firstly, the experimental facilities commissioned on the Solar Platform for testing the forecasting models are briefly described. Secondly, a new tool for the online forecasting of solar energy production is introduced. Preliminary tests show that the implemented procedure is a successful trade-off between simplicity and accuracy.


2021 ◽  
Author(s):  
Leila Zahedi ◽  
Farid Ghareh Mohammadi ◽  
M. Hadi Amini

Machine learning techniques lend themselves as promising decision-making and analytic tools in a wide range of applications. Different ML algorithms have various hyper-parameters. In order to tailor an ML model towards a specific application working at its best, its hyper-parameters should be tuned. Tuning the hyper-parameters directly affects the performance. However, for large-scale search spaces, efficiently exploring the ample number of combinations of hyper-parameters is computationally expensive. Many of the automated hyper-parameter tuning techniques suffer from low convergence rates and high experimental time complexities. In this paper, we propose HyP-ABC, an automatic innovative hybrid hyper-parameter optimization algorithm using the modified artificial bee colony approach, to measure the classification accuracy of three ML algorithms: random forest, extreme gradient boosting, and support vector machine. In order to ensure the robustness of the proposed method, the algorithm takes a wide range of feasible hyper-parameter values and is tested using a real-world educational dataset. Experimental results show that HyP-ABC is competitive with state-of-the-art techniques. Also, it has fewer hyper-parameters to be tuned than other population-based algorithms, making it worthwhile for real-world HPO problems.


2020 ◽  
Author(s):  
Ali Movahedi ◽  
Sybil Derrible

As cities keep growing worldwide, so does the demand for key resources such as energy (electricity and gas) and water that residents consume. Meeting the demand for these resources can be challenging and requires an understanding of their consumption patterns. In this work, we apply XGBoost (Extreme Gradient Boosting) to predict and analyze water and energy consumption in large-scale buildings in New York City. For this, New York City’s extensive Local Law 84 dataset was merged with the Primary Land Use Tax Lot Output (PLUTO) dataset as well as with other socio-economic databases. Specifically, we developed three models: electricity, gas, and water consumption. Seven major lessons were learnt in terms of interrelationships between electricity, gas, and water consumption. In particular, water and gas consumption are highly interrelated (often because gas is used for water heating). Furthermore, electricity consumption is affected by building type, and electricity and water consumption are particularly interrelated in nonresidential buildings. Overall, the knowledge gained from the models and from the SHAP analysis can help planners, engineers, and policymakers develop more effective strategies and help them manage the demand for energy and water in large-scale buildings.


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 765 ◽  
Author(s):  
Weizhang Liang ◽  
Suizhi Luo ◽  
Guoyan Zhao ◽  
Hao Wu

Predicting pillar stability is a vital task in hard rock mines, as pillar instability can cause large-scale collapse hazards. However, it is challenging because pillar stability is affected by many factors. With the accumulation of pillar stability cases, machine learning (ML) has shown great potential to predict pillar stability. This study aims to predict hard rock pillar stability using gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) algorithms. First, 236 cases with five indicators were collected from seven hard rock mines. Afterwards, the hyperparameters of each model were tuned using a five-fold cross validation (CV) approach. Based on the optimal hyperparameter configuration, prediction models were constructed using the training set (70% of the data). Finally, the test set (30% of the data) was used to evaluate the performance of each model. The precision, recall, and F1 indexes were utilized to analyze the prediction results for each stability level, and the accuracy and the macro-average values of these indexes were used to assess the overall prediction performance. Based on the sensitivity analysis of indicators, the relative importance of each indicator was obtained. In addition, the safety factor approach and other ML algorithms were adopted as comparisons. The results showed that the GBDT, XGBoost, and LightGBM algorithms achieved a better comprehensive performance, and their prediction accuracies were 0.8310, 0.8310, and 0.8169, respectively. The average pillar stress and the ratio of pillar width to pillar height had the most important influences on prediction results. The proposed methodology can provide a reliable reference for pillar design and stability risk management.
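The evaluation described above (per-level precision, recall, and F1, plus accuracy and macro averages) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation; the class labels and the tiny example data are hypothetical and carry no relation to the 236 cases in the study.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, treated one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_report(y_true, y_pred):
    """Return accuracy and the unweighted (macro) mean of per-class metrics."""
    labels = sorted(set(y_true) | set(y_pred))
    rows = [per_class_metrics(y_true, y_pred, lab) for lab in labels]
    macro = tuple(sum(r[i] for r in rows) / len(rows) for i in range(3))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, macro


# Hypothetical stability levels and predictions, for illustration only.
y_true = ["stable", "stable", "unstable", "unstable", "failed", "failed"]
y_pred = ["stable", "unstable", "unstable", "unstable", "failed", "stable"]

accuracy, (macro_p, macro_r, macro_f1) = macro_report(y_true, y_pred)
print(round(accuracy, 4), round(macro_p, 4), round(macro_r, 4), round(macro_f1, 4))
```

Macro averaging weights every level equally regardless of how many cases it contains, which is why it is often reported alongside plain accuracy when class sizes differ.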

