Statistical Stability and Spatial Instability in Mapping Forest Tree Species by Comparing 9 Years of Satellite Image Time Series

2019 ◽  
Vol 11 (21) ◽  
pp. 2512 ◽  
Author(s):  
Nicolas Karasiak ◽  
Jean-François Dejoux ◽  
Mathieu Fauvel ◽  
Jérôme Willm ◽  
Claude Monteil ◽  
...  

Mapping forest composition using multiseasonal optical time series remains a challenge. Highly contrasting results are reported from one study to another, suggesting that the drivers of classification errors are still under-explored. We evaluated the performance of single-year Formosat-2 time series for discriminating tree species in temperate forests in France and investigated how predictions vary statistically and spatially across multiple years. Our objective was to better estimate the impact of spatial autocorrelation in the validation data on the measured accuracy and to understand which drivers in the time series are responsible for classification errors. The experiments were based on 10 Formosat-2 image time series acquired irregularly during the seasonal vegetation cycle from 2006 to 2014. Because 2006 was heavily affected by clouds, an alternative 2006 time series using only cloud-free images was added. Thirteen tree species were classified in each single-year dataset using the Support Vector Machine (SVM) algorithm. Performance was assessed using a spatial leave-one-out cross-validation (SLOO-CV) strategy, which guarantees full independence of the validation samples, and compared with standard non-spatial leave-one-out cross-validation (LOO-CV). The results show relatively close statistical performance from one year to the next despite the differences between the annual time series. Good agreement between years was observed in monospecific tree plantations of broadleaf species, versus high disparity in other forests composed of different species. A strong positive bias in the accuracy assessment (up to 0.4 in Overall Accuracy (OA)) was also found when spatial dependence in the validation data was not removed. Using the SLOO-CV approach, the average OA values per year ranged from 0.48 for 2006 to 0.60 for 2013, which satisfactorily represents the spatial instability of species prediction between years.
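The contrast between the two validation schemes can be illustrated with a minimal sketch. It assumes a buffer-based form of spatial leave-one-out, in which training samples closer than a chosen distance to the held-out sample are discarded. The buffer value, the toy coordinates and features, and the 1-nearest-neighbour classifier (used here in place of the SVM to keep the example dependency-free) are illustrative assumptions, not the study's setup.

```python
import math

def spatial_loo_splits(coords, buffer_dist):
    """Yield (train_indices, test_index) for every sample.

    Training samples strictly closer than `buffer_dist` to the held-out
    sample are dropped; buffer_dist=0 reduces to ordinary leave-one-out.
    """
    for i, ci in enumerate(coords):
        train = [j for j, cj in enumerate(coords)
                 if j != i and math.dist(ci, cj) >= buffer_dist]
        yield train, i

def loo_accuracy(features, labels, coords, buffer_dist):
    """Overall accuracy of a 1-nearest-neighbour classifier under (spatial) LOO."""
    hits, n_eval = 0, 0
    for train, test in spatial_loo_splits(coords, buffer_dist):
        if not train:  # nothing left to train on for this sample
            continue
        nearest = min(train, key=lambda j: math.dist(features[j], features[test]))
        hits += labels[nearest] == labels[test]
        n_eval += 1
    return hits / n_eval

# Hypothetical toy data: four stands of two samples each. Samples from the
# same stand are spatially close and have nearly identical features.
coords = [(0, 0), (1, 0), (100, 0), (101, 0), (200, 0), (201, 0), (300, 0), (301, 0)]
labels = ["oak", "oak", "pine", "pine", "oak", "oak", "pine", "pine"]
features = [(0.00,), (0.05,), (1.00,), (1.05,), (0.70,), (0.75,), (0.40,), (0.45,)]

acc_loo = loo_accuracy(features, labels, coords, buffer_dist=0)
acc_sloo = loo_accuracy(features, labels, coords, buffer_dist=10)
print(f"LOO-CV OA:  {acc_loo:.2f}")
print(f"SLOO-CV OA: {acc_sloo:.2f}")
```

In this deliberately extreme toy case, ordinary LOO-CV always finds the held-out sample's stand-mate in the training set, so its accuracy is optimistic; removing neighbours within the buffer forces the classifier to generalise across stands.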


Author(s):  
N. Karasiak ◽  
M. Fauvel ◽  
J.-F. Dejoux ◽  
C. Monteil ◽  
D. Sheeren

Abstract. The free-to-use Sentinel-2 (S2) sensors, with a 5-day revisit time and high spatial resolution in 10 spectral bands, are a revolution in the remote sensing domain. Including 6 spectral bands in the near infrared, with 3 dedicated to the red edge (where vegetation reflectance increases significantly), these European satellites are very promising for mapping tree species distribution at a national scale. Here, we study the contribution of three one-year S2 Satellite Image Time Series (SITS) for mapping deciduous species distribution in the southwest of France. The annual cycle of vegetation (called phenology) can contribute to the identification of tree species. On some specific dates, species can show different phenological behaviours (senescence, flowering…). To train and validate the maps, we used the Support Vector Machine algorithm with a spatial cross-validation method. To train the algorithm with the same number of samples per species, we undersampled each class to the size of the smallest class using a K-means clustering method. Moreover, a Sequential Feature Selection (SFS) was implemented to detect the optimal dates per species. Our results are promising, with high accuracy for Red oak and Willow (average F1 over the three one-year SITS of 0.99 and 0.94, respectively) based on the optimal dates. However, performance when using each full SITS is far below that of the optimal-date models (average ΔF1 = 0.32). Except for Willow and Red oak, we did not find that the optimal dates were the same for each year. Future work is to find an algorithm robust to temporal or spectral noise and to smooth the time series.
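The date-selection step can be sketched as a greedy forward search. This is a minimal illustration, assuming a forward variant of sequential feature selection that stops when no candidate improves the score. The `score` callable, the date labels and the gain values are hypothetical stand-ins; in practice the score would be a cross-validated measure such as F1 for a classifier trained on the selected dates.

```python
def sequential_forward_selection(candidates, score, max_features=None):
    """Greedily add the candidate that most improves `score`.

    `score` takes a list of selected candidates and returns a number
    (higher is better). Stops when no candidate improves the best score
    or when `max_features` candidates have been selected.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    limit = len(remaining) if max_features is None else max_features
    while remaining and len(selected) < limit:
        trial_scores = [(score(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(trial_scores, key=lambda t: t[0])
        if top_score <= best_score:
            break  # no remaining date improves the model
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best_score = top_score
    return selected, best_score

# Hypothetical contribution of each acquisition date to class separability.
gain = {"spring": 0.5, "summer": 0.1, "autumn": 0.3}

def toy_score(dates):
    # Stand-in for a cross-validated F1: informative dates add to the
    # score, while every extra date carries a fixed noise penalty.
    return sum(gain[d] for d in dates) - 0.15 * len(dates)

selected, best = sequential_forward_selection(gain, toy_score)
print(selected, round(best, 2))
```

The penalty term mimics the situation described above, where adding every available date can score lower than a small set of well-chosen dates.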


2012 ◽  
Vol 229-231 ◽  
pp. 2276-2279
Author(s):  
Yu An Pan ◽  
Xuan Xiao ◽  
Pu Wang

Antimicrobial peptides (AMPs) are potent, broad-spectrum antibiotics that show potential as novel therapeutic agents. Because identifying new AMPs experimentally is both time-consuming and laborious, this paper addresses the problem with pattern recognition. The approach has two main parts. First, up to six kinds of physicochemical property values are selected to encode the AMP sequence as a physical-chemical property matrix (PCM), and an auto and cross covariance transformation is then applied to extract features from the PCM that represent the AMP sequence. Second, these feature vectors are input to a Support Vector Machine (SVM) classifier for training and for recognizing new query AMPs. On a newly constructed AMP benchmark dataset, an overall classification accuracy of about 96% was achieved under rigorous leave-one-out cross-validation. For convenience, a user-friendly web server, AMPpred, has been established at http://icpr.jci.jx.cn/bioinfo/AMPpred. It is anticipated that this online predictor may become a useful bioinformatics tool for molecular biology and drug development, and that its approach will further stimulate the development of methods for predicting peptide attributes.
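The feature-extraction step can be sketched as follows. This assumes the commonly used definition of the auto and cross covariance (ACC) transformation, in which each property row is mean-centred and covariances are averaged over all residue pairs separated by a given lag. The abstract does not state the exact normalisation, property set or maximum lag, so the function name `acc_features`, the toy matrix and `max_lag` are illustrative assumptions.

```python
def acc_features(pcm, max_lag):
    """Auto and cross covariance features from a property matrix.

    pcm: list of rows, one per physicochemical property, each holding one
         value per residue (all rows have the same length).
    Returns, for each lag 1..max_lag, the auto covariance of every
    property followed by the cross covariance of every ordered pair.
    """
    n_props = len(pcm)
    length = len(pcm[0])
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")

    # Centre each property on its mean along the sequence.
    means = [sum(row) / length for row in pcm]
    centered = [[v - m for v in row] for row, m in zip(pcm, means)]

    def covariance(a, b, lag):
        pairs = length - lag
        return sum(a[j] * b[j + lag] for j in range(pairs)) / pairs

    features = []
    for lag in range(1, max_lag + 1):
        for i in range(n_props):  # auto covariance: same property
            features.append(covariance(centered[i], centered[i], lag))
        for i in range(n_props):  # cross covariance: different properties
            for k in range(n_props):
                if i != k:
                    features.append(covariance(centered[i], centered[k], lag))
    return features

# Hypothetical 2-property matrix for a 4-residue peptide.
toy_pcm = [[1.0, 2.0, 3.0, 4.0],
           [4.0, 3.0, 2.0, 1.0]]
toy_features = acc_features(toy_pcm, max_lag=1)
print(len(toy_features), toy_features)
```

Because the output length depends only on the number of properties and the maximum lag, peptides of different lengths map to feature vectors of the same size, which is what a fixed-input classifier such as an SVM requires.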


2016 ◽  
Vol 8 (9) ◽  
pp. 734 ◽  
Author(s):  
David Sheeren ◽  
Mathieu Fauvel ◽  
Veliborka Josipović ◽  
Maïlys Lopes ◽  
Carole Planque ◽  
...  

2012 ◽  
Vol 554-556 ◽  
pp. 1628-1631 ◽  
Author(s):  
Tian Hong Gu ◽  
Wei Lv ◽  
Xia Shao ◽  
Wen Cong Lu

Based on the N, O, H and C element contents of objects detected by γ-ray resonance, a support vector classification (SVC) method was used to construct a model for distinguishing high energy materials (HEMs) from ordinary ones. The prediction accuracy was 95.9% in the leave-one-out cross-validation (LOOCV) test. The results indicate that the SVC model performs well enough to detect HEMs in the presence of ordinary materials for security checking.


2021 ◽  
Vol 25 (Special) ◽  
pp. 1-127-1-137
Author(s):  
Nibras Z. Salih ◽  
Walaa Khalaf ◽  

In the multiple-instance learning framework, instances are arranged into bags, each containing several instances; labels are not available for individual instances, only for each bag. In single-instance learning, by contrast, each instance is a single feature vector associated with its own label. This paper examines the distinction between these paradigms to see whether it is appropriate to cast the problem within a multiple-instance framework. For single-instance learning, two datasets (a students' dataset and the iris dataset) are classified using the Naïve Bayes Classifier (NBC), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and Sequential Minimal Optimization (SMO); for multiple-instance learning, SimpleMI, MIWrapper, and MIBoost are used. Leave-One-Out Cross-Validation (LOOCV) and five- and ten-fold cross-validation (5-CV, 10-CV) are used to evaluate the classification results. Comparing the results of these techniques shows that several algorithms are more effective for classification in multiple-instance learning. The most suitable algorithm for the students' dataset is MIBoost with MLP under LOOCV, with an accuracy of 75%, whereas for the iris dataset it is SimpleMI with SMO under 10-CV, with an accuracy of 99.33%.
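The relation between the two paradigms can be illustrated with a minimal sketch of one common reduction, in which each bag is summarised by the per-feature mean of its instances so that any single-instance classifier can be trained on the result. SimpleMI-style methods are usually described this way, but the paper's exact configuration is not given in the abstract; the bags, labels and the helper name `bag_to_mean_vector` are hypothetical.

```python
def bag_to_mean_vector(bag):
    """Summarise a bag (list of equal-length feature tuples) by its per-feature mean."""
    n = len(bag)
    return tuple(sum(column) / n for column in zip(*bag))

# Hypothetical bags: labels are known per bag, not per instance.
bags = [
    [(1.0, 2.0), (3.0, 4.0)],   # bag with two instances
    [(10.0, 0.0)],              # bag with a single instance
]
bag_labels = ["pass", "fail"]

# Each bag becomes one labelled feature vector, i.e. a single-instance dataset.
single_instance_data = [(bag_to_mean_vector(b), y) for b, y in zip(bags, bag_labels)]
print(single_instance_data)
```

After this reduction, a base learner such as SMO or MLP sees one row per bag, which is why a multiple-instance wrapper is paired with a single-instance classifier in the results above.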


Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4818
Author(s):  
Yaqing He ◽  
Kim Fung Tsang ◽  
Richard Yuen-Chong Kong ◽  
Yuk-Tak Chow

This paper introduces a novel model based on a support vector machine with a radial basis function kernel (RBF-SVM) that uses time-series features of the locomotion of zebrafish (Danio rerio) exposed to different electromagnetic fields (EMFs) to indicate the corresponding EMF exposure. Fourteen adult zebrafish were randomly divided into two groups of 7; the fish in each group underwent the novel tank test under either a sham or a real magnetic exposure of 6.78 MHz and about 1 A/m. Their locomotion during the tests was videotaped and converted into x, y coordinate time series of the trajectories, which were reshaped into time-series matrices according to different time-series lengths. The time-series features of zebrafish locomotion were calculated with the highly comparative time-series analysis (HCTSA) framework, and a limited number of features most relevant to the EMF exposure conditions were selected using the minimum redundancy maximum relevance (mRMR) algorithm for training the RBF-SVM classifier. Beforehand, it was quantitatively verified by regression, using another group of 14 adult zebrafish, that ambient environmental parameters (AEPs) had little effect on the locomotion performance of zebrafish processed by the empirical method. The results demonstrate that the proposed model is capable of accurately indicating different EMF exposures. All classification accuracies can reach 100%, and the classification precision of several classifiers based on specific parameters and feature sets of specific dimensions can exceed 95%. A speculative explanation for this result is that the specified EMF affects the neural function of the zebrafish, which is then reflected in their behavior. This study provides a new indication model for EMF exposure and a reference for investigating the impact of EMF exposure.


Author(s):  
Xing Chen ◽  
Tian-Hao Li ◽  
Yan Zhao ◽  
Chun-Chun Wang ◽  
Chi-Chi Zhu

Abstract. MicroRNAs (miRNAs) play an important role in the occurrence, development, diagnosis and treatment of diseases, and an increasing number of researchers are paying attention to the relationship between miRNAs and disease. Compared with traditional biological experiments, computational methods that integrate heterogeneous biological data to predict potential associations can effectively save time and cost. Considering the limitations of previous computational models, we developed a deep-belief network model for miRNA-disease association prediction (DBNMDA). We constructed feature vectors for all miRNA-disease pairs to pre-train restricted Boltzmann machines, and used positive samples together with the same number of selected negative samples to fine-tune the DBN and obtain the final predicted scores. Compared with previous supervised models that use only pairs with known labels for training, DBNMDA innovatively utilizes the information of all miRNA-disease pairs during pre-training. This step can, to some extent, reduce the impact of too few known associations on prediction accuracy. DBNMDA achieves an AUC of 0.9104 in global leave-one-out cross validation (LOOCV), an AUC of 0.8232 in local LOOCV and an average AUC of 0.9048 ± 0.0026 in 5-fold cross validation. These AUCs are better than those of previous models. In addition, three different types of case studies on three diseases were carried out to demonstrate the accuracy of DBNMDA. As a result, 84% (breast neoplasms), 100% (lung neoplasms) and 88% (esophageal neoplasms) of the top 50 predicted miRNAs were verified by recent literature. We therefore conclude that DBNMDA is an effective method for predicting potential miRNA-disease associations.


2021 ◽  
Author(s):  
Fabiana Castino ◽  
Bodo Wichura ◽  
Harald Schellander ◽  
Michael Winkler

The characterization of the snow cover by snow water equivalent (SWE) is fundamental in several environmental applications, e.g., monitoring mountain water resources or defining structural design standards. However, SWE observations are usually rare compared to other snow measurements such as snow depth (HS). Therefore, model-based methods have been proposed in past studies for estimating SWE, in particular for short timescales (e.g., daily). In this study, we compare two different approaches for SWE-data modelling. The first approach, based on empirical regression models (ERMs), provides a regional parametrization of the bulk snow density, which can be used to estimate SWE values from HS. In particular, we investigate the performance of four different schemes based on previously developed ERMs of bulk snow density depending on HS, date, elevation, and location. Secondly, we apply the semi-empirical multi-layer Δsnow model, which estimates SWE solely from snow depth observations. The open-source Δsnow model has recently been used to derive a snow load map for Austria, resulting in an improved Austrian standard. A large dataset of HS and SWE observations collected by the National Weather Service in Germany (DWD) is used for calibrating and validating the models. This dataset consists of daily HS and three-times-a-week SWE observations from about 1000 stations in total, operated by DWD over the period from 1950 to 2020. A leave-one-out cross validation is applied to evaluate the performance of the different model approaches. It is based on 185 time series of HS and SWE observations that are representative of the diversity of the regional snow climatology of Germany. For all ERMs, cross validation reveals that 90% of the modelled SWE time series have a root mean square error (RMSE) lower than 45 kg/m² and a bias lower than 2 kg/m². The Δsnow model shows the best performance, with 90% of the modelled SWE time series having an RMSE lower than 30 kg/m² and a bias similar to that of the ERMs. This comparative study provides new insights into the reliability of model-based methods for estimating SWE values. The results show that the Δsnow model and, to a lesser degree, the developed ERMs can provide satisfactory performance even on short timescales. This suggests that these models can be used as a reliable alternative to more complex thermodynamic snow models, especially when long-term meteorological observations other than HS are scarce.
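The ERM approach rests on the relation SWE = bulk snow density × snow depth, with SWE in kg/m², density in kg/m³ and HS in m. The sketch below applies that relation and computes the two evaluation metrics named above. The density values stand in for the output of a regression model and, like the depths and observations, are hypothetical numbers chosen for illustration.

```python
import math

def swe_from_depth(hs_m, bulk_density_kg_m3):
    """Snow water equivalent (kg/m^2) from snow depth (m) and bulk density (kg/m^3)."""
    return hs_m * bulk_density_kg_m3

def rmse(modelled, observed):
    """Root mean square error between modelled and observed values."""
    n = len(observed)
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(modelled, observed)) / n)

def bias(modelled, observed):
    """Mean signed error (modelled minus observed)."""
    n = len(observed)
    return sum(m - o for m, o in zip(modelled, observed)) / n

# Hypothetical station data for two observation days.
hs = [0.5, 0.4]              # snow depth in m
density = [300.0, 250.0]     # bulk density in kg/m^3, e.g. from a regression model
swe_obs = [140.0, 110.0]     # observed SWE in kg/m^2

swe_mod = [swe_from_depth(h, d) for h, d in zip(hs, density)]
print(swe_mod, rmse(swe_mod, swe_obs), bias(swe_mod, swe_obs))
```

The toy numbers show why both metrics are reported: errors of opposite sign cancel in the bias but not in the RMSE.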


2014 ◽  
Vol 136 (3) ◽  
Author(s):  
Jie Zhang ◽  
Souma Chowdhury ◽  
Ali Mehmani ◽  
Achille Messac

This paper investigates the characterization of the uncertainty in the prediction of surrogate models. In the practice of engineering, where predictive models are pervasively used, the knowledge of the level of modeling error in any region of the design space is uniquely helpful for design exploration and model improvement. The lack of methods that can explore the spatial variation of surrogate error levels in a wide variety of surrogates (i.e., model-independent methods) leaves an important gap in our ability to perform design domain exploration. We develop a novel framework, called domain segmentation based on uncertainty in the surrogate (DSUS), to segregate the design domain based on the level of local errors. The errors in the surrogate estimation are classified into physically meaningful classes based on the user's understanding of the system and/or the accuracy requirements for the concerned system analysis. The leave-one-out cross-validation technique is used to quantify the local errors. A support vector machine (SVM) is implemented to determine the boundaries between error classes, and to classify any new design point into the pertinent error class. We also investigate the effectiveness of the leave-one-out cross-validation technique in providing a local error measure, through comparison with actual local errors. The utility of the DSUS framework is illustrated using two different surrogate modeling methods: (i) the Kriging method and (ii) the adaptive hybrid functions (AHF). The DSUS framework is applied to a series of standard test problems and engineering problems. In these case studies, the DSUS framework is observed to provide reasonable accuracy in classifying the design space based on error levels. More than 90% of the test points are accurately classified into the appropriate error classes.
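The first two steps of this workflow, estimating local errors by leave-one-out and assigning them to user-defined error classes, can be sketched as follows. This is not the DSUS implementation: a nearest-neighbour surrogate replaces Kriging and AHF to keep the example short, the thresholds are hypothetical, and the SVM step that learns class boundaries over the design space is omitted.

```python
import bisect

def loo_errors(xs, ys, fit_predict):
    """Absolute leave-one-out error of a surrogate at every training point.

    fit_predict(train_x, train_y, x) must build the surrogate from the
    training data and return its prediction at x.
    """
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        errors.append(abs(fit_predict(train_x, train_y, xs[i]) - ys[i]))
    return errors

def nearest_neighbour_surrogate(train_x, train_y, x):
    """Stand-in surrogate: return the response of the closest training point."""
    j = min(range(len(train_x)), key=lambda k: abs(train_x[k] - x))
    return train_y[j]

def error_classes(errors, thresholds):
    """Map each error to a class index given ascending class thresholds."""
    return [bisect.bisect_right(thresholds, e) for e in errors]

# Hypothetical 1-D design points sampled from y = x^2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [x ** 2 for x in xs]

errors = loo_errors(xs, ys, nearest_neighbour_surrogate)
classes = error_classes(errors, thresholds=[2.0, 4.0])  # 0 = low, 1 = medium, 2 = high
print(errors, classes)
```

The resulting (design point, error class) pairs are the kind of labelled data on which a classifier can then be trained to predict the error class of a new design point.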

