Development and Validation of Machine-Learning Clear-Sky Detection Method Using 1-Min Irradiance Data and Sky Imagers at a Polluted Suburban Site, Xianghe

2021 ◽  
Vol 13 (18) ◽  
pp. 3763
Author(s):  
Mengqi Liu ◽  
Xiangao Xia ◽  
Disong Fu ◽  
Jinqiang Zhang

Clear-sky detection (CSD) is of critical importance in solar energy applications and surface radiative budget studies. Existing CSD methods are not sufficiently validated due to the lack of high-temporal-resolution, long-term CSD ground observations, especially at polluted sites. Using five years of high-resolution ground-based solar radiation data and visually inspected Total Sky Imager (TSI) measurements at Xianghe, a polluted suburban site, this study validated 17 existing CSD methods and developed a new CSD model based on a machine-learning algorithm (Random Forest, RF). The propagation of systematic errors from input data to the calculated global horizontal irradiance (GHI) is confirmed, with the Mean Absolute Error (MAE) increasing by 99.7% (from 20.00 to 39.93 W·m−2). Through qualitative evaluation, the novel Bright-Sun method outperforms the other traditional CSD methods at the Xianghe site, with accuracy scores of 0.73 and 0.92 under clear and cloudy conditions, respectively. The RF CSD model developed from one year of irradiance and TSI data shows more robust performance, with clear/cloudy-sky accuracy scores of 0.78/0.88. Overall, the Bright-Sun and RF CSD models perform satisfactorily at heavily polluted sites. Further analysis shows that the RF CSD model built with only GHI-related parameters can still achieve a mean accuracy score of 0.81, which indicates that RF CSD models have potential for sites that provide only GHI observations.
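The abstract above reports separate accuracy scores for clear and cloudy conditions and a relative increase in MAE. The following minimal Python sketch shows one way such figures could be computed. It assumes that the per-condition accuracy is the fraction of reference samples of a class that the detector labels correctly; the abstract does not spell out the definition. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
def class_accuracy(y_true, y_pred, label):
    """Fraction of samples whose reference class is `label` and that were
    also predicted as `label` (assumed definition, see lead-in)."""
    hits = [pred == true for true, pred in zip(y_true, y_pred) if true == label]
    if not hits:
        raise ValueError(f"no reference samples with label {label!r}")
    return sum(hits) / len(hits)


def relative_increase_percent(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Toy reference labels (e.g. from visual inspection) and detector output.
# These values are invented for illustration only.
y_true = ["clear", "clear", "clear", "clear", "cloudy", "cloudy", "cloudy", "cloudy"]
y_pred = ["clear", "clear", "clear", "cloudy", "cloudy", "cloudy", "cloudy", "cloudy"]

clear_acc = class_accuracy(y_true, y_pred, "clear")
cloudy_acc = class_accuracy(y_true, y_pred, "cloudy")

# The rounded MAE values quoted in the abstract (20.00 and 39.93 W·m−2).
mae_increase = relative_increase_percent(20.00, 39.93)
```

With the rounded MAE values from the abstract, `mae_increase` comes out at about 99.65%, which agrees with the reported 99.7% up to rounding of the inputs.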

2019 ◽  
Author(s):  
Andrew Medford ◽  
Shengchun Yang ◽  
Fuzhu Liu

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies on the high coverage and mixed coverages of reaction intermediates are still challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CH<sub>x</sub>, NH<sub>x</sub> and OH<sub>x</sub> species on the oxygen vacancy and pristine rutile TiO<sub>2</sub>(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N<sup>1.12</sup>) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.
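The idea of letting a Gaussian process pick its own training data can be illustrated with a deliberately small Python sketch. This is not the authors' framework: it is a one-dimensional toy in which the costly calculation (DFT in the abstract) is replaced by a hypothetical stand-in function, the kernel is a fixed RBF kernel, and the selection rule is simply "label the candidate with the largest predictive variance". All names and parameter values are assumptions made for illustration.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential (RBF) kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(train_x, train_y, query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single query point."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    k_star = [rbf(query, a) for a in train_x]
    mean = sum(k * w for k, w in zip(k_star, solve(K, train_y)))
    var = rbf(query, query) - sum(k * w for k, w in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)


def expensive_label(x):
    """Hypothetical stand-in for a costly reference calculation."""
    return math.sin(x)


pool = [i * 0.5 for i in range(13)]   # candidate inputs 0.0, 0.5, ..., 6.0
labelled_x = [pool[0], pool[-1]]      # start from the two end points
labelled_y = [expensive_label(x) for x in labelled_x]

for _ in range(4):                    # label four more points
    candidates = [x for x in pool if x not in labelled_x]
    # Pick the candidate where the model is least certain.
    next_x = max(candidates, key=lambda x: gp_predict(labelled_x, labelled_y, x)[1])
    labelled_x.append(next_x)
    labelled_y.append(expensive_label(next_x))
```

In this toy setting the first point selected is the midpoint of the pool, because it is farthest from both initially labelled end points and therefore has the largest predictive variance. A practical implementation would use a tested GP library and a descriptor suited to adsorbate configurations rather than this hand-written solver.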


Author(s):  
Sachin Kumar ◽  
Karan Veer

Aims: The objective of this research is to predict Covid-19 cases in India using machine learning approaches. Background: Covid-19, a respiratory disease caused by a member of the coronavirus family, led to a worldwide pandemic in 2020. The virus was first detected in Wuhan, China, in December 2019, and the disease took less than three months to spread across the globe. Objective: In this paper, we propose a regression model based on the support vector machine (SVM) to forecast the number of deaths, the number of recovered cases, and the total number of confirmed cases for the next 30 days. Method: For prediction, the data were collected from GitHub and India's Ministry of Health and Family Welfare, covering March 14, 2020, to December 3, 2020. The model was designed in Python 3.6 in Anaconda to forecast corona trends until September 21, 2020. The proposed methodology is based on predicting values using an SVM-based regression model with polynomial, linear, and RBF kernels. The dataset was divided into training and test sets, with test sizes of 40% and 60%, and verified against real data. Model performance is evaluated using mean square error, mean absolute error, and percentage accuracy. Results and Conclusion: The results show that the polynomial model obtained an accuracy score above 95%, the linear model scored above 90%, and the RBF model scored above 85% in predicting cumulative deaths, confirmed cases, and recovered cases.


2021 ◽  
Vol 45 (1) ◽  
pp. 111-124
Author(s):  
Jaehee Cho ◽  
Sehwan Kim ◽  
Gwangjin Jeong ◽  
Chonghye Kim ◽  
Ja-Kyoung Seo

Objectives: In this study, we aimed to find the influential factors in determining individuals' use and non-use of fitness and diet apps on smartphones. To this end, we focused on diverse groups of predictors that would significantly affect people's use and non-use of these apps. Methods: Overall, we considered 105 factors as potential predictors and included them in further analyses using a machine learning algorithm, XGBoost. The main reason for selecting this particular algorithm was that it is known as one of the most accurate and popular algorithms for predicting consumer behaviors. Results: We found that the accuracy score of those factors for predicting people's use and non-use of fitness and diet apps was approximately 71.3%. In particular, the most influential predictors were mainly related to social influence, media use, overeating, social support, health management, and attitudes toward exercise. Conclusion: These findings help scholars and practitioners develop more practical strategies for implementing fitness and diet apps.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Raheel Siddiqui ◽  
Hafeez Anwar ◽  
Farman Ullah ◽  
Rehmat Ullah ◽  
Muhammad Abdul Rehman ◽  
...  

Power prediction is important not only for the smooth and economic operation of a combined cycle power plant (CCPP) but also to avoid technical issues such as power outages. In this work, we propose to utilize machine learning algorithms to predict the hourly electrical power generated by a CCPP. For this, the generated power is considered a function of four fundamental parameters: relative humidity, atmospheric pressure, ambient temperature, and exhaust vacuum. The measurements of these parameters and the corresponding output power are used to train and test the machine learning models. The dataset for the proposed research was gathered over a period of six years and is taken from a standard, publicly available machine learning repository. The machine learning algorithms used are K-nearest neighbors (KNN), gradient-boosted regression tree (GBRT), linear regression (LR), artificial neural network (ANN), and deep neural network (DNN). We report state-of-the-art performance, where GBRT outperforms not only the other algorithms used but also all previous methods on the given CCPP dataset. It achieves the minimum values of root mean square error (RMSE) of 2.58 and absolute error (AE) of 1.85.
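The two error measures quoted in the abstract above can be sketched in a few lines of Python. The sketch assumes that the reported "absolute error" is the mean absolute error over the test set, which the abstract does not state explicitly. The function names and the toy power values are hypothetical and do not come from the CCPP dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between measured and predicted values."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Invented hourly power values, for illustration only.
measured = [450.0, 460.0, 470.0, 480.0]
predicted = [452.0, 459.0, 467.0, 480.0]

model_rmse = rmse(measured, predicted)
model_mae = mean_absolute_error(measured, predicted)
```

RMSE is never smaller than the mean absolute error on the same data, because squaring weights large errors more heavily. That is consistent with the abstract reporting a larger RMSE (2.58) than AE (1.85).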


Energies ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 2486
Author(s):  
Vanesa Mateo-Pérez ◽  
Marina Corral-Bobadilla ◽  
Francisco Ortega-Fernández ◽  
Vicente Rodríguez-Montequín

One of the fundamental maintenance tasks of ports is periodic dredging. This is necessary to guarantee a minimum draft that will enable ships to access ports safely. Bathymetry is the instrument that determines the need for dredging and permits an analysis of the behavior of the port bottom over time, in order to achieve adequate water depth. Satellite data processing is increasingly used to predict environmental parameters. Based on satellite data and using different machine learning techniques, this study sought to estimate the seabed depth in ports, taking into account the fact that port areas are strongly anthropized. The algorithms used were Support Vector Machine (SVM), Random Forest (RF), and Multivariate Adaptive Regression Splines (MARS). The study was carried out in the ports of Candás and Luarca in the Principality of Asturias. In order to validate the results obtained, data was acquired in situ by using a single beam provided. The results show that this type of methodology can be used to estimate coastal bathymetry. However, when deciding which system was best, priority was given to simplicity and robustness. The results of the SVM and RF algorithms outperform those of MARS. RF performs better in Candás, with a mean absolute error (MAE) of 0.27 cm, whereas SVM performs better in Luarca, with an MAE of 0.37 cm. It is suggested that this approach is suitable as a simpler, more cost-effective, rough-resolution alternative for estimating the depth of turbid water in ports, compared with single-beam sonar, which is labor-intensive and polluting.


2020 ◽  
Author(s):  
Aditya Thawani ◽  
Ryan-Rhys Griffiths ◽  
Arian Jamasb ◽  
Anthony Bourached ◽  
Penelope Jones ◽  
...  

The space of synthesizable molecules is greater than $10^{60}$, meaning only a vanishingly small fraction of these molecules have ever been realized in the lab. In order to prioritize which regions of this space to explore next, synthetic chemists need access to accurate molecular property predictions. While great advances in molecular machine learning have been made, there is a dearth of benchmarks featuring properties that are useful for the synthetic chemist. Focussing directly on the needs of the synthetic chemist, we introduce the Photoswitch Dataset, a new benchmark for molecular machine learning where improvements in model performance can be immediately observed in the throughput of promising molecules synthesized in the lab. Photoswitches are a versatile class of molecule for medical and renewable energy applications where a molecule's efficacy is governed by its electronic transition wavelengths. We demonstrate superior performance in predicting these wavelengths compared to both time-dependent density functional theory (TD-DFT), the incumbent first principles quantum mechanical approach, as well as a panel of human experts. Our baseline models are currently being deployed in the lab as part of the decision process for candidate synthesis. It is our hope that this benchmark can drive real discoveries in photoswitch chemistry and that future benchmarks can be introduced to pivot learning algorithm development to benefit more expansive areas of synthetic chemistry.


Author(s):  
Chitluri Sai Harish B ◽  
G gnana krishna vamsi ◽  
G jaya phani akhil ◽  
J n v hari sravan ◽  
V mounika chowdary

Heart diseases are among the most challenging problems faced by health care sectors all over the world. These diseases are very common nowadays. With the growing number of deaths caused by heart disease, there is a need to build a system that predicts heart ailments accurately. The work in this paper focuses on finding the best machine learning algorithm for identifying heart disease. Our study compares the precision of three well-known classification algorithms, Decision Tree, Naïve Bayes, and Random Forest, for the prediction of heart disease using a dataset provided by Kaggle. We used various characteristics that relate well to heart disease to find the better algorithm for prediction. The results of this study indicate that the Random Forest algorithm is the most efficient algorithm for predicting heart disease, with an accuracy score of 97.17%.


Energies ◽  
2020 ◽  
Vol 13 (18) ◽  
pp. 4868
Author(s):  
Raghuram Kalyanam ◽  
Sabine Hoffmann

Solar radiation data is essential for the development of many solar energy applications, ranging from thermal collectors to building simulation tools, but its availability is limited, especially for the diffuse radiation component. Several studies aim to predict this value, but very few cover the generalizability of such models across varying climates. Our study investigates how well these models generalize and also shows how to enhance their generalizability to different climates. Since machine learning approaches are known to generalize well, we apply them to understand how well they perform in climates other than those on which they were originally trained. Therefore, we trained them on datasets from the U.S. and tested them on several European climates. The machine learning model developed for U.S. climates not only showed a low mean absolute error (MAE) of 23 W/m2, but also generalized very well to European climates, with MAE in the range of 20 to 27 W/m2. Further investigation into the factors influencing generalizability revealed that careful selection of the training data can improve the results significantly.

