Simulation of Fluctuating Wind Speeds Based on Data-Driven Approaches

With the rapid development of information science and technology, data-driven approaches have become a major research trend in many fields. In this paper, the BP neural network (BPNN), support vector machine (SVM), and least squares support vector machine (LS-SVM) are introduced and used to simulate fluctuating wind-speed time series. A regression-prediction model based on machine interpolation learning is established for each method. The original wind speeds used as learning and forecast samples for the data-driven approaches are obtained through AR numerical modeling. A comparison of evaluation indices shows that the wind speeds simulated by SVM and LS-SVM are more accurate than those simulated by BPNN, while the simulation times of LS-SVM and BPNN are shorter than that of SVM.
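As a rough illustration of how AR-generated samples could be prepared for a regression learner, the sketch below simulates an autoregressive series and turns it into lagged learning and forecast samples. The AR coefficients, noise level, lag count, and 150/46 split are illustrative assumptions, not values from the paper, and the helper names (`simulate_ar`, `lagged_samples`, `rmse`) are hypothetical.

```python
import math
import random

def simulate_ar(coeffs, n, sigma=1.0, seed=0):
    """Draw n samples of x[t] = sum_k coeffs[k] * x[t-1-k] + e[t], e ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    p = len(coeffs)
    x = [0.0] * p  # zero start-up values: an assumption of this sketch
    for _ in range(n):
        recent = x[-1:-p - 1:-1]  # x[t-1], x[t-2], ..., x[t-p]
        x.append(sum(a * v for a, v in zip(coeffs, recent)) + rng.gauss(0.0, sigma))
    return x[p:]  # drop the start-up values

def lagged_samples(series, lags):
    """Build (inputs, target) pairs: the previous `lags` values predict the next one."""
    X = [series[t - lags:t] for t in range(lags, len(series))]
    y = [series[t] for t in range(lags, len(series))]
    return X, y

def rmse(y_true, y_pred):
    """Root mean square error, one possible evaluation index."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

# Hypothetical AR(2) fluctuating component, split into learning and forecast samples.
speeds = simulate_ar([0.6, -0.2], n=200)
X, y = lagged_samples(speeds, lags=4)
X_learn, y_learn = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]
```

Any regressor (BPNN, SVM, or LS-SVM) could then be fitted on `X_learn`, `y_learn` and scored with `rmse` on the forecast samples.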


Regional Zenith Tropospheric Delay Modeling Based on Least Squares Support Vector Machine Using GNSS and ERA5 Data

Remote Sensing ◽ 10.3390/rs13051004 ◽ 2021 ◽ Vol 13 (5) ◽ pp. 1004

Author(s): Song Li ◽ Tianhe Xu ◽ Nan Jiang ◽ Honglei Yang ◽ Shuaimin Wang ◽ ...

Keyword(s): Support Vector Machine ◽ Least Squares ◽ High Efficiency ◽ Rapid Development ◽ Support Vector ◽ Tropospheric Delay ◽ Improvement Rate ◽ Zenith Tropospheric Delay ◽ Background Data ◽ Mean Square Errors

Meteorological reanalysis data have been widely applied to derive zenith tropospheric delay (ZTD) with high spatial and temporal resolution. With the rapid development of artificial intelligence, machine learning is also starting to be employed as a high-efficiency tool for modeling and predicting ZTD. In this paper, we develop three new regional ZTD models based on the least squares support vector machine (LSSVM), using both the International GNSS Service (IGS)-ZTD products and European Centre for Medium-Range Weather Forecasts Reanalysis 5 (ERA5) data over Europe throughout 2018. The ERA5 data are extended to ERA5S-ZTD and ERA5P-ZTD as background data by the model method and the integral method, respectively. Depending on the background data, three schemes are designed to construct ZTD models based on the LSSVM algorithm: without background data, with ERA5S-ZTD, and with ERA5P-ZTD. To investigate the advantage and feasibility of the proposed ZTD models, we evaluate the accuracy of the two background data sets and the three schemes by segmental comparison with the IGS-ZTD of 85 IGS stations in Europe. The results show that the overall average root mean square error (RMSE) over all sites is 30.1 mm for ERA5S-ZTD and 10.7 mm for ERA5P-ZTD. The overall average RMSE is 25.8 mm, 22.9 mm, and 9 mm for the three schemes, respectively. Moreover, the overall improvement rate is 19.1% and 1.6% for the ZTD models with ERA5S-ZTD and ERA5P-ZTD, respectively. To explore the reason for the lower improvement of the ZTD model with ERA5P-ZTD, loop verification is performed by estimating the ZTD values of each available IGS station. In fact, the monthly improvement rate of the estimated ZTD is positive for most stations, and the largest improvement rate reaches about 40%. The negative rates come mainly from specific stations. These stations are located on the edge of the region or near the coast, or show lower similarity between the individual verified station and the training stations.
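For readers unfamiliar with the least squares support vector machine, the following is a minimal sketch of LS-SVM regression in its standard textbook form, where training reduces to solving one linear system instead of a quadratic program. It is not the authors' implementation: the RBF kernel, the hyperparameters `gamma` and `sigma`, and the one-dimensional toy data are assumptions, and no GNSS or ERA5 values are used.

```python
import math

def rbf(a, b, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] for bias b and weights alpha."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization term on the diagonal
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, x, sigma=1.0):
    """Evaluate f(x) = b + sum_i alpha_i * K(x, x_i)."""
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X_train))

# Toy data: learn sin(x) on [0, 2] from 21 evenly spaced samples.
X_train = [[i / 10.0] for i in range(21)]
y_train = [math.sin(x[0]) for x in X_train]
b, alpha = lssvm_fit(X_train, y_train, gamma=1000.0, sigma=0.5)
estimate = lssvm_predict(X_train, b, alpha, [1.05], sigma=0.5)
```

In a regional ZTD setting the feature vectors might hold quantities such as station coordinates, time, and a background ZTD value, but that choice of inputs is hypothetical here.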


Classification and Regression Models for Genomic Selection of Skewed Phenotypes: A Case for Disease Resistance in Winter Wheat (Triticum aestivum L.)

10.1101/2021.12.16.472985 ◽ 2021

Author(s): Lance F Merrick ◽ Dennis N Lozada ◽ Xianming Chen ◽ Arron H Carter

Keyword(s): Support Vector Machine ◽ Winter Wheat ◽ Genomic Selection ◽ Stripe Rust ◽ Regression Models ◽ Prediction Models ◽ Support Vector ◽ Classification Models ◽ Breeding Lines ◽ Classification And Regression

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded on ordinal scales and as percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not meet these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models that place no restriction on the distribution of response variables and are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square-root-transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.


Algorithmic and data modeling: Will algorithmic modeling improve predictions of traits evaluated on ordinal scales?

10.1101/2020.10.07.329466 ◽ 2020

Author(s): Zhanyou Xu ◽ Andreomar Kurek ◽ Steven B. Cannon ◽ Williams D. Beavis

Keyword(s): Support Vector Machine ◽ Random Forest ◽ Ridge Regression ◽ Genomic Prediction ◽ Ordinal Data ◽ Prediction Models ◽ Characteristic Curve ◽ Gradient Boosting ◽ Support Vector ◽ Data Types

Selection of markers linked to alleles at quantitative trait loci (QTL) for tolerance to Iron Deficiency Chlorosis (IDC) has not been successful. Genomic selection has been advocated for continuous numeric traits such as yield and plant height. For ordinal data types such as IDC, genomic prediction models have not been systematically compared. The objectives of the research reported in this manuscript were to compare the most commonly used genomic prediction method, ridge regression, and its equivalent logistic ridge regression method with algorithmic modeling methods including random forest, gradient boosting, support vector machine, K-nearest neighbors, Naïve Bayes, and artificial neural network, using the usual comparator metric of prediction accuracy. In addition, we compared the methods using metrics of greater importance for decisions about selecting and culling lines for use in variety development and genetic improvement projects. These metrics include specificity, sensitivity, precision, decision accuracy, and area under the receiver operating characteristic curve. We found that Support Vector Machine provided the best specificity for culling IDC susceptible lines, while Random Forest GP models provided the best combined set of decision metrics for retaining IDC tolerant and culling IDC susceptible lines.
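The decision metrics named above follow from a binary confusion matrix, and the sketch below shows one way to compute them. The label coding (1 for a line to retain, 0 for a line to cull), the example labels, and the function names are assumptions for illustration. The `accuracy` entry is the plain fraction of correct calls, which may differ from the paper's "decision accuracy".

```python
def decision_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, precision, and accuracy from binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }

def auc(y_true, scores, positive=1):
    """Area under the ROC curve as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical calls for ten lines: 1 = retain (tolerant), 0 = cull (susceptible).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = decision_metrics(truth, calls)
```

High specificity under this coding means few susceptible lines are mistakenly retained, which is why it matters for culling decisions.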


Understanding nitrogen transport in the unsaturated zone with fluctuations in groundwater depth

Water Science & Technology Water Supply ◽ 10.2166/ws.2021.066 ◽ 2021

Author(s): Jianmin Bian ◽ Qian Wang ◽ Siyu Nie ◽ Hanli Wan ◽ Juanjuan Wu

Keyword(s): Support Vector Machine ◽ Unsaturated Zone ◽ Prediction Models ◽ Groundwater Depth ◽ Support Vector ◽ Nitrogen Transport ◽ Irrigation Area ◽ Leaching Loss ◽ The Relationship ◽ Hydrus 1D

Fluctuations in groundwater depth play an important role in the transport of nitrogen in the unsaturated zone but are often overlooked. To evaluate directly how nitrogen transport varies with fluctuations in groundwater depth, prediction models of groundwater depth and nitrogen transport were combined and applied using the least squares support vector machine and Hydrus-1D in the western irrigation area of Jilin, China. The calibration and testing results showed that the prediction models were reliable. Across the groundwater depths considered, the concentration of nitrogen was affected significantly at a groundwater depth of 3.42–1.71 m, while it was not affected at a groundwater depth of 5.48–6.47 m. The total leaching loss of nitrogen gradually increased with the continuous decrease of groundwater depth. Furthermore, the limited groundwater depth of 1.7 m was found to reduce the risk of nitrogen pollution. This paper systematically analyzes the relationship between groundwater depth and nitrogen transport to inform appropriate agricultural strategies.


Analyzing Navigational Data and Predicting Student Grades Using Support Vector Machine

Emerging Science Journal ◽ 10.28991/esj-2020-01227 ◽ 2020 ◽ Vol 4 (4) ◽ pp. 243-252

Author(s): SriUdaya Damuluri ◽ Khondkar Islam ◽ Pouyan Ahmadi ◽ Namra Shafiq Qureshi

Keyword(s): Support Vector Machine ◽ Information Science ◽ Research Study ◽ Educational Institutions ◽ Support Vector ◽ Classification Algorithms ◽ Student Access ◽ Final Grades ◽ Student Grades ◽ Access Patterns

The advent of Learning Management Systems (LMS) has opened a unique opportunity to predict student grades well in advance, which benefits both students and educational institutions. The objective of this study is to investigate student access patterns and navigational data in Blackboard (Bb), a form of LMS, to forecast final grades. The study sample consists of students pursuing a Networking course in the Information Science and Technology Department (IST) at George Mason University (GMU). The gathered data cover a wide variety of attributes, such as the amount of time spent on lecture slides and other learning materials, the number of times course contents are accessed, the times and days of the week on which study material is reviewed, and student grades in various assessments. By analyzing these predictors using Support Vector Machine, one of the most efficient classification algorithms available, we are able to project the final grades of students and identify individuals who are at risk of failing the course so that they can receive proper guidance from instructors. After comparing actual grades with predicted grades, we conclude that the developed model accurately predicts the grades of 70% of the students. This study is unique in that it is the first to employ solely online LMS data to successfully deduce the academic outcomes of students.


Fault diagnosis of valve clearance in diesel engine based on BP neural network and support vector machine

Transactions of Tianjin University ◽ 10.1007/s12209-016-2675-1 ◽ 2016 ◽ Vol 22 (6) ◽ pp. 536-543 ◽ Cited By ~ 4

Author(s): Fengrong Bi ◽ Yiping Liu

Keyword(s): Neural Network ◽ Support Vector Machine ◽ Fault Diagnosis ◽ Diesel Engine ◽ Bp Neural Network ◽ Support Vector


Physical-oriented and machine learning-based emission modeling in a diesel compression ignition engine: Dimensionality reduction and regression

International Journal of Engine Research ◽ 10.1177/14680874211070736 ◽ 2022 ◽ pp. 146808742110707

Author(s): Aran Mohammad ◽ Reza Rezaei ◽ Christopher Hayduk ◽ Thaddaeus Delebinski ◽ Saeid Shahpouri ◽ ...

Keyword(s): Principal Component Analysis ◽ Support Vector Machine ◽ Factor Analysis ◽ Dimensionality Reduction ◽ Principal Component ◽ Component Analysis ◽ Data Driven ◽ Support Vector ◽ Emission Models ◽ Emission Modeling

The development of internal combustion engines is shaped by exhaust gas emissions legislation and the drive to increase performance. This calls for engine-out emission models that can be used for engine optimization under real driving emission controls. The prediction capability of physically based and data-driven engine-out emission models is influenced by the system inputs, which are specified by the user, and accuracy can improve as the number of inputs increases. At the same time, irrelevant inputs become more likely; these have a weak functional relation to the emissions and can lead to overfitting. Data-driven methods can instead be used to detect irrelevant and redundant inputs. In this work, thermodynamic states are modeled based on 772 stationary test bench measurements from a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are fed into a data-driven dimensionality reduction. For this purpose, supervised learning approaches, such as lasso regression and the linear support vector machine, and unsupervised learning methods, such as principal component analysis and factor analysis, are applied to select and extract the relevant features. The selected and extracted features are used for regression by the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions. This enables an evaluation of the modeling accuracy as a result of the dimensionality reduction. Using the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling, respectively, while maintaining the accuracy. The features selected using the lasso algorithm provide more accurate learning of the regression models than the features extracted through principal component analysis and factor analysis. This results in test errors RMSE_Te of 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN for modeling NOx, CO, HC, and soot emissions, respectively.
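To illustrate how lasso regression can act as a feature selector, the sketch below fits a lasso model by coordinate descent and keeps only inputs with nonzero weights. It is not the authors' pipeline: the objective scaling, the penalty `lam`, the three-input factorial design, and the function names are assumptions, and the inputs are taken to be centered so no intercept is fitted.

```python
from itertools import product

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    resid = list(y)  # residual y - Xw, kept up to date after each weight change
    for _ in range(n_iter):
        for j in range(d):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant-zero input carries no information
            rho = sum(v * (r + v * w[j]) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - w[j]
            if delta != 0.0:
                resid = [r - v * delta for r, v in zip(resid, col)]
                w[j] = new
    return w

def selected_features(w):
    """Indices of inputs whose lasso weight is nonzero."""
    return [j for j, wj in enumerate(w) if wj != 0.0]

# Hypothetical design: three +/-1 inputs; the target depends strongly on input 0,
# not at all on input 1, and only weakly on input 2.
X_toy = [list(row) for row in product([-1.0, 1.0], repeat=3)]
y_toy = [3.0 * r[0] + 0.05 * r[2] for r in X_toy]
weights = lasso_cd(X_toy, y_toy, lam=0.1)
kept = selected_features(weights)
```

With this penalty the weak and irrelevant inputs are driven to exactly zero, which is the behavior that makes lasso usable for selecting inputs before a downstream regressor such as a support vector machine.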


Study on Crash Injury Severity Prediction of Autonomous Vehicles for Different Emergency Decisions Based on Support Vector Machine Model

Electronics ◽ 10.3390/electronics7120381 ◽ 2018 ◽ Vol 7 (12) ◽ pp. 381 ◽ Cited By ~ 6

Author(s): Yaping Liao ◽ Junyou Zhang ◽ Shufeng Wang ◽ Sixian Li ◽ Jian Han

Keyword(s): Support Vector Machine ◽ Autonomous Vehicles ◽ Injury Severity ◽ Prediction Models ◽ Motor Vehicle ◽ Support Vector ◽ Severity Prediction ◽ Emergency Decision ◽ The Impact ◽ Crash Injury Severity

Motor vehicle crashes remain a leading cause of loss of life and property. Autonomous vehicles can mitigate these losses by making appropriate emergency decisions, and a crash injury severity prediction model is the basis for such decisions. In this paper, based on the support vector machine (SVM) model and NASS/GES crash data, three SVM crash injury severity prediction models (B-SVM, T-SVM, and BT-SVM), corresponding to braking, turning, and braking + turning respectively, are established. The vehicle relative speed (REL_SPEED) and the gross vehicle weight rating (GVWR) are introduced into the impact indicators of the prediction models. Second, ordered logit (OL) and back propagation neural network (BPNN) models are established to validate the accuracy of the SVM models. The results show that the SVM models perform better than the other two. Next, the impact of REL_SPEED and GVWR on injury severity is analyzed quantitatively by sensitivity analysis; the results demonstrate that increases in REL_SPEED and GVWR make crashes more severe. Finally, the same crash samples under normal road and environmental conditions are input into B-SVM, T-SVM, and BT-SVM, and the outputs are compared and analyzed. The results show that, other conditions being equal, as REL_SPEED increases from the low (0–20 mph) to the middle (20–45 mph) and then to the high range (45–75 mph), the best emergency decision, the one with the minimum crash injury severity, gradually transitions from braking to turning and then to braking + turning.
