Simulation of Fluctuating Wind Speeds Based on Data-Driven Approaches

2014 ◽  
Vol 635-637 ◽  
pp. 1618-1623
Author(s):  
Yue Dan Wang ◽  
Chun Xiang Li

With the rapid development of information science and technology, data-driven approaches have become a research trend in many fields. In this paper, the BP neural network (BPNN), support vector machine (SVM), and least squares support vector machine (LS-SVM) are adopted to simulate fluctuating wind speed time series. A regression-prediction model is established for each method by implementing machine interpolation learning. The original wind speeds used as learning and forecast samples are obtained through AR numerical modeling. A comparison of evaluation indices shows that the fluctuating wind speeds simulated by SVM and LS-SVM are more accurate than those simulated by BPNN, while the simulation times of LS-SVM and BPNN are shorter than that of SVM.
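As a rough illustration of the sample-generation step described above, the following sketch generates a zero-mean AR(p) series and arranges it into lagged input/target pairs that a regression model could learn from. The coefficients, noise level, lag count, and function names are hypothetical and are not taken from the paper; in practice AR coefficients for wind simulation would be derived from a target wind spectrum.

```python
import random


def simulate_ar(coeffs, n, sigma=1.0, seed=0):
    """Generate n samples of u[t] = sum_k coeffs[k] * u[t-1-k] + sigma * eps[t].

    Zero initial conditions are an assumption of this sketch.
    """
    rng = random.Random(seed)
    p = len(coeffs)
    u = [0.0] * p
    for _ in range(n):
        past = u[-1:-p - 1:-1]  # the p most recent values, newest first
        value = sum(a * x for a, x in zip(coeffs, past))
        u.append(value + rng.gauss(0.0, sigma))
    return u[p:]  # drop the artificial initial values


def make_lagged_samples(series, lags):
    """Turn a series into (inputs, targets): `lags` past values predict the next."""
    inputs = [series[t - lags:t] for t in range(lags, len(series))]
    targets = [series[t] for t in range(lags, len(series))]
    return inputs, targets


# Hypothetical AR(2) coefficients chosen only so the process is stable.
series = simulate_ar([0.6, -0.2], n=200)
inputs, targets = make_lagged_samples(series, lags=4)

# A simple chronological split into learning and forecast samples.
split = int(0.8 * len(inputs))
learn_x, learn_y = inputs[:split], targets[:split]
forecast_x, forecast_y = inputs[split:], targets[split:]
```

The chronological split is only one option; the abstract does not say how the learning and forecast samples were partitioned.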

2021 ◽  
Vol 13 (5) ◽  
pp. 1004
Author(s):  
Song Li ◽  
Tianhe Xu ◽  
Nan Jiang ◽  
Honglei Yang ◽  
Shuaimin Wang ◽  
...  

Meteorological reanalysis data have been widely applied to derive zenith tropospheric delay (ZTD) with high spatial and temporal resolution. With the rapid development of artificial intelligence, machine learning has also begun to be employed as a high-efficiency tool for modeling and predicting ZTD. In this paper, we develop three new regional ZTD models based on the least squares support vector machine (LSSVM), using both the International GNSS Service (IGS)-ZTD products and European Centre for Medium-Range Weather Forecasts Reanalysis 5 (ERA5) data over Europe throughout 2018. The ERA5 data are extended to ERA5S-ZTD and ERA5P-ZTD as background data by the model method and the integral method, respectively. Depending on the background data, three schemes are designed to construct ZTD models based on the LSSVM algorithm: without background data, with ERA5S-ZTD, and with ERA5P-ZTD. To investigate the advantage and feasibility of the proposed ZTD models, we evaluate the accuracy of the two background data sets and the three schemes by segmental comparison with the IGS-ZTD of 85 IGS stations in Europe. The results show that the overall average root mean square error (RMSE) over all sites is 30.1 mm for ERA5S-ZTD and 10.7 mm for ERA5P-ZTD. The overall average RMSE is 25.8 mm, 22.9 mm, and 9 mm for the three schemes, respectively. Moreover, the overall improvement rate is 19.1% and 1.6% for the ZTD model with ERA5S-ZTD and ERA5P-ZTD, respectively. To explore the reason for the lower improvement of the ZTD model with ERA5P-ZTD, loop verification is performed by estimating the ZTD values of each available IGS station. In fact, the monthly improvement rate of the estimated ZTD is positive for most stations, and the largest improvement rate reaches about 40%. Negative rates mainly come from specific stations: those located on the edge of the region or near the coast, and those where the verified station has low similarity to the training stations.
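For readers unfamiliar with the evaluation quantities mentioned above, here is a minimal sketch of an RMSE computation and of one plausible definition of an improvement rate (relative RMSE reduction against the background data). The abstract does not define its improvement rate, so this formula, the function names, and the toy values are assumptions and may not reproduce the paper's reported percentages.

```python
import math


def rmse(estimates, references):
    """Root mean square error between two equally long sequences."""
    if len(estimates) != len(references):
        raise ValueError("sequences must have the same length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_rate(rmse_background, rmse_model):
    """Relative RMSE reduction in percent (assumed definition, see lead-in)."""
    return 100.0 * (rmse_background - rmse_model) / rmse_background


# Toy ZTD values in millimetres; these are illustrative, not data from the paper.
reference_ztd = [2400.0, 2410.0, 2395.0, 2405.0]
background_ztd = [2412.0, 2398.0, 2407.0, 2393.0]
model_ztd = [2406.0, 2404.0, 2401.0, 2399.0]

background_error = rmse(background_ztd, reference_ztd)
model_error = rmse(model_ztd, reference_ztd)
rate = improvement_rate(background_error, model_error)
```

A negative value of `rate` would indicate that the model is less accurate than the background data at that station.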


2021 ◽  
Author(s):  
Lance F Merrick ◽  
Dennis N Lozada ◽  
Xianming Chen ◽  
Arron H Carter

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded on ordinal scales and as percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not follow these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models with no restriction on the distribution of response variables, which are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square root transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.


2020 ◽  
Author(s):  
Zhanyou Xu ◽  
Andreomar Kurek ◽  
Steven B. Cannon ◽  
Williams D. Beavis

Selection of markers linked to alleles at quantitative trait loci (QTL) for tolerance to Iron Deficiency Chlorosis (IDC) has not been successful. Genomic selection has been advocated for continuous numeric traits such as yield and plant height. For ordinal data types such as IDC, genomic prediction models have not been systematically compared. The objectives of the research reported in this manuscript were to evaluate the most commonly used genomic prediction method, ridge regression, and its equivalent logistic ridge regression method, against algorithmic modeling methods including random forest, gradient boosting, support vector machine, K-nearest neighbors, Naïve Bayes, and artificial neural network, using the usual comparator metric of prediction accuracy. In addition, we compared the methods using metrics of greater importance for decisions about selecting and culling lines for use in variety development and genetic improvement projects. These metrics include specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. We found that Support Vector Machine provided the best specificity for culling IDC susceptible lines, while Random Forest GP models provided the best combined set of decision metrics for retaining IDC tolerant and culling IDC susceptible lines.
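The decision metrics named above can be computed from a binary confusion matrix. The sketch below uses the standard textbook definitions of sensitivity, specificity, precision, and accuracy. The 1/0 coding of tolerant versus susceptible lines and the example labels are hypothetical, and the abstract does not define "decision accuracy", so plain accuracy is shown instead.

```python
def decision_metrics(actual, predicted, positive=1):
    """Compute confusion-matrix metrics for a binary classification."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)

    def ratio(numerator, denominator):
        # Return None when a metric is undefined (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # positives correctly retained
        "specificity": ratio(tn, tn + fp),  # negatives correctly culled
        "precision": ratio(tp, tp + fp),    # predicted positives that are right
        "accuracy": ratio(tp + tn, len(pairs)),
    }


# Hypothetical coding: 1 = tolerant line, 0 = susceptible line.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = decision_metrics(actual, predicted)
```

High specificity matters when the goal is culling susceptible lines, while sensitivity matters when the goal is not discarding tolerant ones, which is why a single accuracy figure can hide differences between models.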


Author(s):  
Jianmin Bian ◽  
Qian Wang ◽  
Siyu Nie ◽  
Hanli Wan ◽  
Juanjuan Wu

Fluctuations in groundwater depth play an important role in the transport of nitrogen in the unsaturated zone but are often overlooked. To evaluate directly the variation of nitrogen transport due to fluctuations in groundwater depth, prediction models of groundwater depth and nitrogen transport were combined and applied using the least squares support vector machine and Hydrus-1D in the western irrigation area of Jilin in China. The calibration and testing results showed that the prediction models were reliable. Considering different groundwater depths, the concentration of nitrogen was affected significantly with a groundwater depth of 3.42–1.71 m, while it was not affected with a groundwater depth of 5.48–6.47 m. The total leaching loss of nitrogen gradually increased with the continuous decrease of groundwater depth. Furthermore, the limited groundwater depth of 1.7 m was found to reduce the risk of nitrogen pollution. This paper systematically analyzes the relationship between groundwater depth and nitrogen transport to form appropriate agriculture strategies.


2020 ◽  
Vol 4 (4) ◽  
pp. 243-252
Author(s):  
SriUdaya Damuluri ◽  
Khondkar Islam ◽  
Pouyan Ahmadi ◽  
Namra Shafiq Qureshi

The advent of the Learning Management System (LMS) has opened a unique opportunity to predict student grades well in advance, which benefits both students and educational institutions. The objective of this study is to investigate student access patterns and navigational data of Blackboard (Bb), a form of LMS, to forecast final grades. The study covers students pursuing a Networking course in the Information Science and Technology Department (IST) at George Mason University (GMU). The gathered data consist of a wide variety of attributes, such as the amount of time spent on lecture slides and other learning materials, the number of times course contents are accessed, the times and days of the week study material is reviewed, and student grades in various assessments. By analyzing these predictors using Support Vector Machine, one of the most efficient classification algorithms available, we are able to project final grades of students and identify those individuals who are at risk of failing the course so that they can receive proper guidance from instructors. After comparing actual grades with predicted grades, it is concluded that our developed model is able to accurately predict the grades of 70% of the students. This study stands unique as it is the first to employ solely online LMS data to successfully deduce academic outcomes of students.


2022 ◽  
pp. 146808742110707
Author(s):  
Aran Mohammad ◽  
Reza Rezaei ◽  
Christopher Hayduk ◽  
Thaddaeus Delebinski ◽  
Saeid Shahpouri ◽  
...  

The development of internal combustion engines is affected by exhaust gas emissions legislation and the striving to increase performance. This calls for engine-out emission models that can be used for engine optimization under real driving emission controls. The prediction capability of physical and data-driven engine-out emission models is influenced by the system inputs, which are specified by the user and can lead to improved accuracy as the number of inputs increases. At the same time, irrelevant inputs become more probable; these have a low functional relation to the emissions and can lead to overfitting. Alternatively, data-driven methods can be used to detect irrelevant and redundant inputs. In this work, thermodynamic states are modeled based on 772 stationary measured test bench data from a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are fed into a data-driven dimensionality reduction. For this purpose, supervised learning approaches, such as lasso regression and linear support vector machine, and unsupervised learning methods, such as principal component analysis and factor analysis, are applied to select and extract the relevant features. The selected and extracted features are used for regression by the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions. This enables an evaluation of the modeling accuracy as a result of the dimensionality reduction. Using the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling, respectively, while maintaining the accuracy. The features selected using the lasso algorithm provide more accurate learning of the regression models than the features extracted through principal component analysis and factor analysis. This results in test errors RMSETe for modeling NOx, CO, HC, and soot emissions of 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN, respectively.
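To make the idea of lasso-based feature selection concrete, the following is a minimal sketch of lasso regression solved by cyclic coordinate descent, after which inputs with a zero coefficient are treated as irrelevant. It is a generic textbook formulation (objective 1/(2n)·||y − Xw||² + α·||w||₁), not the implementation used in the paper; the penalty value, the toy data, and the function names are assumptions. Inputs are assumed to be centered and on comparable scales.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; the source of lasso sparsity."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize 1/(2n) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic updates."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for row, target in zip(X, y):
                pred_without_j = sum(w[k] * row[k] for k in range(d) if k != j)
                rho += row[j] * (target - pred_without_j)
                z += row[j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    return w


def selected_features(weights):
    """Indices of inputs whose lasso coefficient is non-zero."""
    return [j for j, wj in enumerate(weights) if wj != 0.0]


# Toy data: the target depends only on the first input; the second is irrelevant.
x0 = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
x1 = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
X = [[a, b] for a, b in zip(x0, x1)]
y = [2.0 * a for a in x0]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
kept = selected_features(weights)
```

In this toy case the irrelevant input is driven exactly to zero, while the relevant coefficient is slightly shrunk below its true value of 2, which is the usual trade-off of the L1 penalty.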


Electronics ◽  
2018 ◽  
Vol 7 (12) ◽  
pp. 381 ◽  
Author(s):  
Yaping Liao ◽  
Junyou Zhang ◽  
Shufeng Wang ◽  
Sixian Li ◽  
Jian Han

Motor vehicle crashes remain a leading cause of life and property loss to society. Autonomous vehicles can mitigate the losses by making appropriate emergency decisions, and the crash injury severity prediction model is the basis for autonomous vehicles to make decisions in emergency situations. In this paper, based on the support vector machine (SVM) model and NASS/GES crash data, three SVM crash injury severity prediction models (B-SVM, T-SVM, and BT-SVM), corresponding to braking, turning, and braking + turning respectively, are established. The vehicle relative speed (REL_SPEED) and the gross vehicle weight rating (GVWR) are introduced into the impact indicators of the prediction models. Secondly, the ordered logit (OL) and back propagation neural network (BPNN) models are established to validate the accuracy of the SVM models. The results show that the SVM models perform better than the other two. Next, the impact of REL_SPEED and GVWR on injury severity is analyzed quantitatively by sensitivity analysis; the results demonstrate that increases in REL_SPEED and GVWR make vehicle crashes more serious. Finally, the same crash samples under normal road and environmental conditions are input into B-SVM, T-SVM, and BT-SVM respectively, and the output results are compared and analyzed. The results show that, with other conditions being the same, as REL_SPEED increases from the low range (0–20 mph) to the middle range (20–45 mph) and then to the high range (45–75 mph), the best emergency decision with the minimum crash injury severity gradually transitions from braking to turning and then to braking + turning.


2020 ◽  
Vol 211 ◽  
pp. 109795 ◽  
Author(s):  
Xiang Zhou ◽  
Ling Xu ◽  
Jingsi Zhang ◽  
Bing Niu ◽  
Maohui Luo ◽  
...  
