Combining Random Forest and XGBoost Methods in Detecting Early and Mid-Term Winter Wheat Stripe Rust Using Canopy Level Hyperspectral Measurements

Agriculture ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 74
Author(s):  
Linsheng Huang ◽  
Yong Liu ◽  
Wenjiang Huang ◽  
Yingying Dong ◽  
Huiqin Ma ◽  
...  

Appropriate modeling methods and feature selection algorithms must be selected to improve the accuracy of early and mid-term remote sensing detection of wheat stripe rust. In the current study, we explored the effectiveness of the random forest (RF) algorithm combined with the extreme gradient boosting (XGBoost) method for early and mid-term wheat stripe rust detection based on vegetation indices extracted from canopy level hyperspectral measurements. Initially, 21 vegetation indices related to early and mid-term winter wheat stripe rust were calculated from canopy level hyperspectral reflectance. Subsequently, the optimal vegetation index combination for disease detection was determined using correlation analysis (CA) combined with the RF algorithm. Then, a disease severity detection model for early and mid-term winter wheat stripe rust was constructed using the XGBoost method based on the optimal vegetation index combination. For evaluation and comparison, three commonly used classification methods, namely, RF, backpropagation neural network (BPNN), and support vector machine (SVM), were also applied. The vegetation index combinations determined by CA alone were also used to construct detection models. Compared with the detection models based on the vegetation index combination obtained using CA alone, the overall accuracy of the four detection models based on the optimal vegetation index combination obtained using CA combined with RF increased by 16.1% (XGBoost), 9.7% (RF), 8.1% (SVM), and 8.1% (BPNN). Among the eight models, the XGBoost detection model based on the optimal vegetation index combination obtained using CA combined with RF (CA-RF-XGBoost) achieved the highest overall accuracy of 87.1% and the highest kappa coefficient of 0.798. Our results indicate that RF combined with XGBoost can effectively improve the detection accuracy of early and mid-term winter wheat stripe rust at the canopy scale.
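
The overall accuracy and kappa coefficient used to compare the models can both be derived from a confusion matrix. The following is a minimal Python sketch, assuming a hypothetical three-class confusion matrix; the counts and the function names are illustrative and are not taken from the study.

```python
def overall_accuracy(cm):
    """Fraction of samples that lie on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    # Chance agreement from the row (actual) and column (predicted) marginals.
    p_expected = sum(
        sum(cm[i]) * sum(cm[r][i] for r in range(n)) for i in range(n)
    ) / (total * total)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical counts (not the study's data): rows = actual, columns = predicted.
example_cm = [
    [18, 2, 0],
    [3, 14, 3],
    [0, 2, 18],
]

print(round(overall_accuracy(example_cm), 3), round(cohen_kappa(example_cm), 3))
```

Kappa is lower than overall accuracy here because it discounts the agreement that the class proportions alone would produce by chance.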

2014 ◽  
Vol 602-605 ◽  
pp. 2462-2467
Author(s):  
Mu Yi Huang ◽  
Wen Jiang Huang ◽  
Xiao Dong Yang ◽  
Guang Zhou Chen

This study discusses methods for selecting characteristic spectral bands and establishing inversion models to monitor winter wheat stripe rust using hyperspectral data. The correlation coefficients between the DI (disease incidence) at different stages of infection and both the original canopy reflectance spectra and the derivative of the reflectance spectra were compared. The results showed that the derivative of the reflectance spectra was more significantly correlated with the DI than the original reflectance spectra. The original reflectance at the visible 680 nm wavelength and the near-infrared 976 nm and 1010 nm wavelengths was selected for regression against the DI of winter wheat stripe rust. Inversion models relating the DI to the hyperspectral data or its transformations, such as NDVI (normalized difference vegetation index), RVI (ratio vegetation index), TVI (transformed vegetation index), and the differential values of the canopy spectral reflectance, were established to monitor winter wheat stripe rust. The correlation coefficients of these models were also compared, and regression analysis against the DI showed that the vegetation indices were generally more effective than the original canopy spectral reflectance data. The paper also suggests that developing a dedicated visible/near-infrared sensor for detecting the DI of winter wheat stripe rust is theoretically possible. In addition, the SRSI (stripe rust stress index) mechanism model is presented for the first time in this paper.
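
The vegetation indices named above are simple band combinations. The sketch below uses their commonly cited definitions (NDVI as the normalized red/near-infrared difference, RVI as the near-infrared to red ratio, TVI as the square root of NDVI plus 0.5); the abstract does not give the paper's exact formulations, so these are assumptions, and the reflectance values are hypothetical.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def rvi(nir, red):
    """Ratio vegetation index."""
    return nir / red


def tvi(nir, red):
    """Transformed vegetation index (assumed form: sqrt(NDVI + 0.5))."""
    return math.sqrt(ndvi(nir, red) + 0.5)


# Hypothetical canopy reflectance at a red and a near-infrared wavelength.
red_680 = 0.05
nir_976 = 0.45

print(ndvi(nir_976, red_680), rvi(nir_976, red_680), tvi(nir_976, red_680))
```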


2021 ◽  
Vol 13 (15) ◽  
pp. 3024
Author(s):  
Huiqin Ma ◽  
Wenjiang Huang ◽  
Yingying Dong ◽  
Linyi Liu ◽  
Anting Guo

Fusarium head blight (FHB) is a major winter wheat disease in China. The accurate and timely detection of wheat FHB is vital to scientific field management. By combining three types of spectral features, namely, spectral bands (SBs), vegetation indices (VIs), and wavelet features (WFs), in this study, we explore the potential of using hyperspectral imagery obtained from an unmanned aerial vehicle (UAV), to detect wheat FHB. First, during the wheat filling period, two UAV-based hyperspectral images were acquired. SBs, VIs, and WFs that were sensitive to wheat FHB were extracted and optimized from the two images. Subsequently, a field-scale wheat FHB detection model was formulated, based on the optimal spectral feature combination of SBs, VIs, and WFs (SBs + VIs + WFs), using a support vector machine. Two commonly used data normalization algorithms were utilized before the construction of the model. The single WFs, and the spectral feature combination of optimal SBs and VIs (SBs + VIs), were respectively used to formulate models for comparison and testing. The results showed that the detection model based on the normalized SBs + VIs + WFs, using min–max normalization algorithm, achieved the highest R2 of 0.88 and the lowest RMSE of 2.68% among the three models. Our results suggest that UAV-based hyperspectral imaging technology is promising for the field-scale detection of wheat FHB. Combining traditional SBs and VIs with WFs can improve the detection accuracy of wheat FHB effectively.
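
Min–max normalization and the two reported evaluation metrics can be sketched as follows. This is an illustrative Python sketch only: the function names are hypothetical, R2 is computed here as the coefficient of determination (the abstract does not say which definition the authors used), and no values from the study are reproduced.

```python
import math


def min_max_normalize(values):
    """Rescale a feature to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))
```

Normalizing each feature separately keeps spectral bands, vegetation indices, and wavelet features on a comparable scale before they are passed to a support vector machine, which is sensitive to feature magnitudes.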


2021 ◽  
Vol 13 (6) ◽  
pp. 1144
Author(s):  
Mahendra Bhandari ◽  
Shannon Baker ◽  
Jackie C. Rudd ◽  
Amir M. H. Ibrahim ◽  
Anjin Chang ◽  
...  

Drought significantly limits wheat productivity across the temporal and spatial domains. Unmanned Aerial Systems (UAS) have become an indispensable tool for collecting imagery data with fine spatial and high temporal resolution. A 2-year field study was conducted in 2018 and 2019 to determine the temporal effects of drought on canopy growth of winter wheat. Weekly UAS data were collected using red, green, and blue (RGB) and multispectral (MS) sensors over a yield trial consisting of 22 winter wheat cultivars in both irrigated and dryland environments. Raw images were processed to compute canopy features such as canopy cover (CC) and canopy height (CH), and vegetation indices (VIs) such as Normalized Difference Vegetation Index (NDVI), Excess Green Index (ExG), and Normalized Difference Red-edge Index (NDRE). The drought was more severe in 2018 than in 2019 and the effects of growth differences across years and irrigation levels were visible in the UAS measurements. CC, CH, and VIs, measured during grain filling, were positively correlated with grain yield (r = 0.4–0.7, p < 0.05) in the dryland in both years. Yield was positively correlated with VIs in 2018 (r = 0.45–0.55, p < 0.05) in the irrigated environment, but the correlations were non-significant in 2019 (r = 0.1 to −0.4), except for CH. The study shows that high-throughput UAS data can be used to monitor the drought effects on wheat growth and productivity across the temporal and spatial domains.
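
The r values quoted above are Pearson correlation coefficients between a canopy measurement and grain yield. A minimal sketch of that calculation follows; the per-plot values are hypothetical and serve only to show the call.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-plot values (not from the study).
plot_ndvi = [0.62, 0.70, 0.75, 0.81]
plot_yield = [2.1, 2.6, 2.4, 3.0]

print(round(pearson_r(plot_ndvi, plot_yield), 2))
```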


2021 ◽  
Author(s):  
Lance F Merrick ◽  
Dennis N Lozada ◽  
Xianming Chen ◽  
Arron H Carter

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded in ordinal scales and percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not follow these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models with no restriction on the distribution of response variables, which are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square root transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.


2021 ◽  
pp. 289-301
Author(s):  
B. Martín ◽  
J. González–Arias ◽  
J. A. Vicente–Vírseda

Our aim was to identify an optimal analytical approach for accurately predicting complex spatio–temporal patterns in animal species distribution. We compared the performance of eight modelling techniques (generalized additive models, regression trees, bagged CART, k–nearest neighbors, stochastic gradient boosting, support vector machines, neural network, and random forest, an enhanced form of bootstrap). We also performed extreme gradient boosting, an enhanced form of gradient boosting, to predict spatial patterns in abundance of migrating Balearic shearwaters based on data gathered within eBird. Derived from open–source datasets, proxies of frontal systems and ocean productivity domains that have been previously used to characterize the oceanographic habitats of seabirds were quantified, and then used as predictors in the models. The random forest model showed the best performance according to the parameters assessed (RMSE value and R2). The correlation between observed and predicted abundance with this model was also considerably high. This study shows that the combination of machine learning techniques and massive data provided by open data sources is a useful approach for identifying the long–term spatial–temporal distribution of species at regional spatial scales.


Author(s):  
Harsha A K

Abstract: Since the advent of encryption, there has been a steady increase in malware being transmitted over encrypted networks. Traditional approaches to detect malware like packet content analysis are inefficient in dealing with encrypted data. In the absence of actual packet contents, we can make use of other features like packet size, arrival time, source and destination addresses and other such metadata to detect malware. Such information can be used to train machine learning classifiers in order to classify malicious and benign packets. In this paper, we offer an efficient malware detection approach using classification algorithms in machine learning such as support vector machine, random forest and extreme gradient boosting. We employ an extensive feature selection process to reduce the dimensionality of the chosen dataset. The dataset is then split into training and testing sets. Machine learning algorithms are trained using the training set. These models are then evaluated against the testing set in order to assess their respective performances. We further attempt to tune the hyperparameters of the algorithms, in order to achieve better results. Random forest and extreme gradient boosting algorithms performed exceptionally well in our experiments, resulting in area under the curve values of 0.9928 and 0.9998 respectively. Our work demonstrates that malware traffic can be effectively classified using conventional machine learning algorithms and also shows the importance of dimensionality reduction in such classification problems. Keywords: Malware Detection, Extreme Gradient Boosting, Random Forest, Feature Selection.
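
The area under the ROC curve used to score the classifiers equals the probability that a randomly chosen malicious sample receives a higher score than a randomly chosen benign one. The sketch below computes it directly from that pairwise definition; the function name, labels, and scores are hypothetical, and the quadratic loop is meant for illustration rather than for large datasets.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. malicious), 0 for the negative class.
    scores: classifier scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores: three of the four pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```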


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Vardhmaan Jain ◽  
Vikram Sharma ◽  
Agam Bansal ◽  
Cerise Kleb ◽  
Chirag Sheth ◽  
...  

Background: Post-transplant major adverse cardiovascular events (MACE) are among the leading causes of death among orthotopic liver transplant (OLT) recipients. Despite years of guideline directed therapy, there are limited data on predictors of post-OLT MACE. We assessed if machine learning algorithms (MLA) can predict MACE and all-cause mortality in patients undergoing OLT. Methods: We compared three MLA, namely support vector machine, extreme gradient boosting (XG-Boost), and random forest, with traditional logistic regression for prediction of MACE and all-cause mortality on a cohort of consecutive patients undergoing OLT at our center between 2008-2019. The cohort was randomly split into a training (80%) and testing (20%) cohort. Model performance was assessed using c-statistic or AUC. Results: We included 1,459 consecutive patients with mean ± SD age 54.2 ± 13.8 years, 32% female who underwent OLT. There were 199 (13.6%) MACE and 289 (20%) deaths at a mean follow up of 4.56 ± 3.3 years. The random forest MLA was the best performing model for predicting MACE [AUC:0.78, 95% CI: 0.70-0.85] as well as mortality [AUC:0.69, 95% CI: 0.61-0.76], with all models performing better when predicting MACE vs mortality. See Table and Figure. Conclusion: Random forest machine learning algorithms were more predictive and discriminative than traditional regression models for predicting major adverse cardiovascular events and all-cause mortality in patients undergoing OLT. Validation and subsequent incorporation of MLA in clinical decision making for OLT candidacy could help risk stratify patients for post-transplant adverse cardiovascular events.
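
A random 80%/20% split of a cohort can be sketched as below. The function name, the seed, and the rounding rule are assumptions made for illustration; the abstract does not report the exact sizes of the two cohorts or how the split was implemented.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and hold out a fraction for testing."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical patient identifiers standing in for a 1,459-patient cohort.
patient_ids = list(range(1459))
train_ids, test_ids = train_test_split(patient_ids)
print(len(train_ids), len(test_ids))
```

Fitting every model on the training cohort and computing the c-statistic only on the held-out cohort keeps the reported discrimination from being inflated by overfitting.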


2020 ◽  
Vol 12 (23) ◽  
pp. 3925
Author(s):  
Ivan Pilaš ◽  
Mateo Gašparović ◽  
Alan Novkinić ◽  
Damir Klobučar

The presented study demonstrates a bi-sensor approach suitable for rapid and precise up-to-date mapping of forest canopy gaps for the larger spatial extent. The approach makes use of Unmanned Aerial Vehicle (UAV) red, green and blue (RGB) images on smaller areas for highly precise forest canopy mask creation. Sentinel-2 was used as a scaling platform for transferring information from the UAV to a wider spatial extent. Various approaches to an improvement in the predictive performance were examined: (I) the highest R2 of the single satellite index was 0.57, (II) the highest R2 using multiple features obtained from the single-date, S-2 image was 0.624, and (III) the highest R2 on the multitemporal set of S-2 images was 0.697. Satellite indices such as Atmospherically Resistant Vegetation Index (ARVI), Infrared Percentage Vegetation Index (IPVI), Normalized Difference Index (NDI45), Pigment-Specific Simple Ratio Index (PSSRa), Modified Chlorophyll Absorption Ratio Index (MCARI), Color Index (CI), Redness Index (RI), and Normalized Difference Turbidity Index (NDTI) were the dominant predictors in most of the Machine Learning (ML) algorithms. The more complex ML algorithms such as the Support Vector Machines (SVM), Random Forest (RF), Stochastic Gradient Boosting (GBM), Extreme Gradient Boosting (XGBoost), and Catboost that provided the best performance on the training set exhibited weaker generalization capabilities. Therefore, a simpler and more robust Elastic Net (ENET) algorithm was chosen for the final map creation.


2019 ◽  
Vol 11 (16) ◽  
pp. 1932
Author(s):  
Elena Prudnikova ◽  
Igor Savin ◽  
Gretelerika Vindeker ◽  
Praskovia Grubina ◽  
Ekaterina Shishkonakova ◽  
...  

The spectral reflectance of crop canopy is a spectral mixture, which includes soil background as one of the components. However, as soil is characterized by substantial spatial variability and temporal dynamics, its contribution to the spectral reflectance of crops will also vary. The aim of the research was to determine the impact of soil background on spectral reflectance of crop canopy in visible and near-infrared parts of the spectrum at different stages of crop development and how the soil type factor and the dynamics of soil surface affect vegetation indices calculated for crop assessment. The study was conducted on three test plots with winter wheat located in the Tula region of Russia and occupied by three contrasting types of soil. During field trips, information was collected on the spectral reflectance of winter wheat crop canopy, winter wheat leaves, weeds and open soil surface for three phenological phases (tillering, shooting stage, milky ripeness). The assessment of the soil contribution to the spectral reflectance of winter wheat crop canopy was based on a linear spectral mixture model constructed from field data. This showed that the soil background effect is most pronounced in the regions of 350–500 nm and 620–690 nm. In the shooting stage, the contribution of the soil prevails in the 620–690 nm range of the spectrum and the phase of milky ripeness in the region of 350–500 nm. The minimum contribution at all stages of winter wheat development was observed at wavelengths longer than 750 nm. The degree of soil influence varies with soil type. Analysis of variance showed that normalized difference vegetation index (NDVI) was least affected by soil type factor, the influence of which was about 30%–50%, depending on the stage of winter wheat development. The influence of soil type on soil-adjusted vegetation index (SAVI) and enhanced vegetation index (EVI2) was approximately equal and varied from 60% (shooting phase) to 80% (tillering phase). 
According to the discriminant analysis, the ability of vegetation indices calculated for winter wheat crop canopy to distinguish between winter wheat crops growing on different soil types changed from the classification accuracy of 94.1% (EVI2) in the tillering stage to 75% (EVI2 and SAVI) in the shooting stage to 82.6% in the milky ripeness stage (EVI2, SAVI, NDVI). The range of the sensitivity of the vegetation indices to the soil background depended on soil type. The indices showed the greatest sensitivity on gray forest soil when the wheat was in the phase of milky ripeness, and on leached chernozem when the wheat was in the tillering phase. The observed patterns can be used to develop vegetation indices, invariant to second-type soil variations caused by soil type factor, which can be applied for the remote assessment of the state of winter wheat crops.
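
A linear spectral mixture treats canopy reflectance at each wavelength as an area-weighted sum of its components. The sketch below is a two-endmember simplification (soil and vegetation only), which is an assumption and not the model fitted in the study. SAVI and EVI2 use their commonly cited forms, with the usual soil adjustment factor of 0.5 for SAVI, and all reflectance values are hypothetical.

```python
def mixed_reflectance(soil_fraction, soil_reflectance, vegetation_reflectance):
    """Two-endmember linear mixture at a single wavelength."""
    return (soil_fraction * soil_reflectance
            + (1.0 - soil_fraction) * vegetation_reflectance)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index with adjustment factor L = soil_factor."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def evi2(nir, red):
    """Two-band enhanced vegetation index."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


# Hypothetical red-band reflectance: 25% visible soil at 0.20, vegetation at 0.04.
red_mixed = mixed_reflectance(0.25, 0.20, 0.04)
print(red_mixed, savi(0.5, 0.1), evi2(0.5, 0.1))
```

Because soil is usually brighter than green vegetation in the red region, a larger visible soil fraction raises the mixed red reflectance and lowers any index built on the near-infrared minus red difference.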


2021 ◽  
Vol 13 (22) ◽  
pp. 4560
Author(s):  
Lili Luo ◽  
Qingrui Chang ◽  
Qi Wang ◽  
Yong Huang

Prompt monitoring of maize dwarf mosaic virus (MDMV) is critical for the prevention and control of disease and to ensure high crop yield and quality. Here, we first analyzed the spectral differences between MDMV-infected red leaves and healthy leaves and constructed a sensitive index (SI) for measurements. Next, based on the characteristic bands (Rλ) associated with leaf anthocyanins (Anth), we determined vegetation indices (VIs) commonly used in plant physiological and biochemical parameter inversion and established a vegetation index (VIc) by utilizing the combination of two arbitrary bands following the construction principles of NDVI, DVI, RVI, and SAVI. Furthermore, we developed classification models based on linear discriminant analysis (LDA) and support vector machine (SVM) in order to distinguish the red leaves from healthy leaves. Finally, we performed UR, MLR, PLSR, PCR, and SVM simulations on Anth based on Rλ, VIs, VIc, and Rλ + VIs + VIc and indirectly estimated the severity of MDMV infection based on the relationship between the reflection spectra and Anth. Distinct from those of the normal leaves, the spectra of red leaves showed strong reflectance characteristics at 640 nm, and SI increased with increasing Anth. Moreover, the accuracy of the two VIc-based classification models was 100%, which is significantly higher than that of the VIs and Rλ-based models. Among the Anth regression models, the accuracy of the MLR model based on Rλ + VIs + VIc was the highest (R2c = 0.85; R2v = 0.74). The developed models could accurately identify MDMV and estimate the severity of its infection, laying the theoretical foundation for large-scale remote sensing-based monitoring of this virus in the future.
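
Building an index from two arbitrary bands amounts to enumerating band pairs and applying the normalized-difference, difference, and ratio forms to each. The sketch below illustrates that enumeration only; the function name, the dictionary layout, and the spectrum are hypothetical, the SAVI-style form is omitted for brevity, and the step that scores each candidate against anthocyanin content is not shown.

```python
from itertools import permutations


def candidate_indices(spectrum):
    """Enumerate two-band indices for every ordered pair of wavelengths.

    spectrum: dict mapping wavelength (nm) to reflectance.
    Returns a dict keyed by (form, wavelength_a, wavelength_b).
    """
    indices = {}
    for wl_a, wl_b in permutations(sorted(spectrum), 2):
        a, b = spectrum[wl_a], spectrum[wl_b]
        indices[("ND", wl_a, wl_b)] = (a - b) / (a + b)  # NDVI-style
        indices[("D", wl_a, wl_b)] = a - b               # DVI-style
        indices[("R", wl_a, wl_b)] = a / b               # RVI-style
    return indices


# Hypothetical three-band leaf spectrum.
leaf = {550: 0.1, 640: 0.3, 800: 0.5}
candidates = candidate_indices(leaf)
print(len(candidates), candidates[("ND", 800, 640)])
```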

