Characterizing Uncertainty Attributable to Surrogate Models

2014 ◽  
Vol 136 (3) ◽  
Author(s):  
Jie Zhang ◽  
Souma Chowdhury ◽  
Ali Mehmani ◽  
Achille Messac

This paper investigates the characterization of the uncertainty in the prediction of surrogate models. In engineering practice, where predictive models are pervasively used, knowledge of the level of modeling error in any region of the design space is uniquely helpful for design exploration and model improvement. The lack of methods that can explore the spatial variation of surrogate error levels across a wide variety of surrogates (i.e., model-independent methods) leaves an important gap in our ability to perform design domain exploration. We develop a novel framework, called domain segmentation based on uncertainty in the surrogate (DSUS), to segregate the design domain based on the level of local errors. The errors in the surrogate estimation are classified into physically meaningful classes based on the user's understanding of the system and/or the accuracy requirements for the concerned system analysis. The leave-one-out cross-validation technique is used to quantify the local errors. A support vector machine (SVM) is implemented to determine the boundaries between error classes and to classify any new design point into the pertinent error class. We also investigate the effectiveness of the leave-one-out cross-validation technique in providing a local error measure, through comparison with actual local errors. The utility of the DSUS framework is illustrated using two different surrogate modeling methods: (i) the Kriging method and (ii) adaptive hybrid functions (AHF). The DSUS framework is applied to a series of standard test problems and engineering problems. In these case studies, the DSUS framework is observed to provide reasonable accuracy in classifying the design space based on error levels. More than 90% of the test points are accurately classified into the appropriate error classes.
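
A minimal sketch of the DSUS idea follows, using scikit-learn's GaussianProcessRegressor as a stand-in for the Kriging surrogate; the test function, sample plan, and error-class thresholds (set here from quantiles of the LOO errors) are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import LeaveOneOut
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(60, 2))            # sample plan in a 2-D design space
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2        # illustrative "true" response

# Leave-one-out cross-validation error of the surrogate at each training point
loo_err = np.empty(len(X))
for i, (tr, te) in enumerate(LeaveOneOut().split(X)):
    gp = GaussianProcessRegressor().fit(X[tr], y[tr])
    loo_err[i] = abs(gp.predict(X[te])[0] - y[te][0])

# Assign each point to an error class; real thresholds would come from the user's
# accuracy requirements, quantiles are used here only to guarantee non-empty classes
thresholds = np.quantile(loo_err, [0.5, 0.9])
error_class = np.digitize(loo_err, thresholds)

# The SVM learns the boundaries between error classes over the design space
classifier = SVC(kernel="rbf").fit(X, error_class)
print(classifier.predict([[0.5, -1.0]]))        # error class of a new design point
```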

This article presents 2D global facial recognition using Gabor wavelet feature extraction algorithms and Support Vector Machine (SVM) classifiers, the latter incorporating linear, cubic, and Gaussian kernel functions. The models generated by these kernels were validated with the cross-validation technique in Matlab. The objective is to observe the facial recognition results in each case. An efficient technique is proposed that combines the mentioned algorithms for a database of 2D images. The technique was run through its training and testing phases on the FERET [1] and MUCT [2] facial image databases, and the models generated by the technique allowed the tests to be performed, whose results achieved facial recognition of individuals above 96%.
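
The article reports using Matlab; the sketch below reproduces the kernel comparison (linear, cubic, Gaussian) in Python with scikit-learn, on randomly generated stand-in vectors in place of precomputed Gabor-wavelet features.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Stand-in for Gabor-wavelet feature vectors of face images (10 subjects assumed)
X = rng.normal(size=(200, 128))
y = rng.integers(0, 10, size=200)

# The three kernels compared in the article: linear, cubic (poly degree 3), Gaussian (RBF)
kernels = {
    "linear": SVC(kernel="linear"),
    "cubic": SVC(kernel="poly", degree=3),
    "gaussian": SVC(kernel="rbf"),
}
for name, clf in kernels.items():
    model = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(model, X, y, cv=5)    # k-fold cross-validation
    print(f"{name:8s} mean CV accuracy: {scores.mean():.3f}")
```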


Mekatronika ◽  
2021 ◽  
Vol 3 (1) ◽  
pp. 27-31
Author(s):  
Ken-ji Ee ◽  
Ahmad Fakhri Bin Ab. Nasir ◽  
Anwar P. P. Abdul Majeed ◽  
Mohd Azraai Mohd Razman ◽  
Nur Hafieza Ismail

The animal classification system is a technology that classifies animal classes (types) automatically and is useful in many applications. Many types of learning models have recently been applied to this technology. Nonetheless, it is worth noting that the extraction and classification of animal features are non-trivial, particularly in the deep learning approach to a successful animal classification system. Transfer Learning (TL) has been demonstrated to be a powerful tool for extracting essential features. However, the employment of such a method in animal classification applications is somewhat limited. The present study aims to determine a suitable TL-conventional classifier pipeline for animal classification. VGG16 and VGG19 were used to extract features, which were then coupled with either a k-Nearest Neighbour (k-NN) or a Support Vector Machine (SVM) classifier. Prior to that, a total of 4000 images were gathered, consisting of five classes: cows, goats, buffalos, dogs, and cats. The data were split in a ratio of 80:20 for training and testing. The classifiers' hyperparameters were tuned by the grid search approach using five-fold cross-validation. It was demonstrated from the study that the best TL pipeline identified is VGG16 along with an optimised SVM, as it was able to yield an average classification accuracy of 0.975. The findings of the present investigation could facilitate animal classification applications, e.g., monitoring animals in the wild.
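
A hedged sketch of the described pipeline, assuming TensorFlow/Keras for VGG16 feature extraction and scikit-learn for the grid-searched SVM; the random images and the parameter grid below are placeholders, not the study's data or search space.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Stand-in batch of images; the study used 4000 photos across 5 animal classes
rng = np.random.default_rng(2)
images = rng.uniform(0, 255, size=(100, 224, 224, 3)).astype("float32")
labels = rng.integers(0, 5, size=100)

# VGG16 without its top layers acts as a fixed feature extractor (transfer learning)
extractor = VGG16(weights="imagenet", include_top=False, pooling="avg")
features = extractor.predict(preprocess_input(images), verbose=0)

# 80:20 train/test split, then SVM hyperparameters tuned by 5-fold grid search
X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.2, random_state=0)
grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}, cv=5)
grid.fit(X_tr, y_tr)
print("best params:", grid.best_params_, "test accuracy:", grid.score(X_te, y_te))
```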


Author(s):  
Dohyun Park ◽  
Yongbin Lee ◽  
Dong-Hoon Choi

Many meta-models have been developed to approximate true responses. These meta-models are often used for optimization instead of computer simulations, which incur a high computational cost. However, designers do not know in advance which meta-model is best, because the accuracy of each meta-model differs from problem to problem. To address this difficulty, research on ensembles of meta-models that combine stand-alone meta-models has recently been pursued with the expectation of improving prediction accuracy. In this study, we propose a method for selecting the weight factors of an ensemble of meta-models based on the v-nearest neighbors' cross-validation (CV) error. The four stand-alone meta-models employed in this study are polynomial regression, Kriging, radial basis function, and support vector regression. Each method is applied to five 1-D mathematical examples and ten 2-D mathematical examples. The prediction accuracy of each stand-alone meta-model and of existing ensembles of meta-models is compared. The ensemble of meta-models shows higher accuracy than the worst of the four stand-alone meta-models in all 30 test examples. In addition, the ensemble of meta-models shows the highest accuracy in 5 of the test cases. Although it has lower accuracy than the best stand-alone meta-model, it has almost the same RMSE values (less than 1.1 times those of the best stand-alone model) in 16 out of 30 test cases. From these results, we conclude that the proposed method is effective and robust.
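
The paper's v-nearest-neighbors weight-selection rule is not reproduced here; the sketch below only illustrates the general idea of weighting the four stand-alone meta-models by their cross-validation errors, with KernelRidge standing in for a radial basis function surrogate and an assumed inverse-RMSE weighting.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(80, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1])            # illustrative test function

# Four stand-alone meta-models: polynomial regression, Kriging, RBF, support vector regression
models = [
    make_pipeline(PolynomialFeatures(2), LinearRegression()),
    GaussianProcessRegressor(),
    KernelRidge(kernel="rbf"),
    SVR(),
]

# Global weights inversely proportional to each model's cross-validation RMSE
# (a generic scheme; the paper derives weights from v-nearest-neighbor CV errors)
cv_rmse = np.array([
    np.sqrt(np.mean((cross_val_predict(m, X, y, cv=10) - y) ** 2)) for m in models
])
weights = (1.0 / cv_rmse) / np.sum(1.0 / cv_rmse)

for m in models:
    m.fit(X, y)
x_new = np.array([[0.3, -1.2]])
ensemble_pred = sum(w * m.predict(x_new)[0] for w, m in zip(weights, models))
print("weights:", np.round(weights, 3), "ensemble prediction:", ensemble_pred)
```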


2019 ◽  
Vol 11 (21) ◽  
pp. 2512 ◽  
Author(s):  
Nicolas Karasiak ◽  
Jean-François Dejoux ◽  
Mathieu Fauvel ◽  
Jérôme Willm ◽  
Claude Monteil ◽  
...  

Mapping forest composition using multiseasonal optical time series remains a challenge. Highly contrasting results are reported from one study to another, suggesting that the drivers of classification errors are still under-explored. We evaluated the performance of single-year Formosat-2 time series for discriminating tree species in temperate forests in France and investigated how predictions vary statistically and spatially across multiple years. Our objective was to better estimate the impact of spatial autocorrelation in the validation data on measured accuracy and to understand which drivers in the time series are responsible for classification errors. The experiments were based on 10 Formosat-2 image time series irregularly acquired during the seasonal vegetation cycle from 2006 to 2014. Due to heavy cloud cover in 2006, an alternative 2006 time series using only cloud-free images was added. Thirteen tree species were classified in each single-year dataset using the Support Vector Machine (SVM) algorithm. The performances were assessed using a spatial leave-one-out cross-validation (SLOO-CV) strategy, thereby guaranteeing full independence of the validation samples, and compared with standard non-spatial leave-one-out cross-validation (LOO-CV). The results show relatively close statistical performances from one year to the next despite the differences between the annual time series. Good agreement between years was observed in monospecific tree plantations of broadleaf species, versus high disparity in other forests composed of different species. A strong positive bias in the accuracy assessment (up to 0.4 of overall accuracy (OA)) was also found when spatial dependence in the validation data was not removed. Using the SLOO-CV approach, the average OA values per year ranged from 0.48 for 2006 to 0.60 for 2013, which satisfactorily represents the spatial instability of species prediction between years.
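
A compact sketch of spatial leave-one-out cross-validation: for each held-out sample, all training samples within a buffer radius are discarded before fitting the SVM. The coordinates, features, labels, and radius below are stand-ins, not the Formosat-2 data.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
n = 300
coords = rng.uniform(0, 1000, size=(n, 2))       # sample coordinates (metres, assumed)
X = rng.normal(size=(n, 10))                     # spectral-temporal features (stand-in)
y = rng.integers(0, 4, size=n)                   # tree-species labels (stand-in)

def sloo_cv_accuracy(X, y, coords, radius):
    """Spatial LOO-CV: drop training samples within `radius` of the held-out sample."""
    correct = 0
    for i in range(len(X)):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        train = dist > radius
        train[i] = False                          # never train on the held-out sample
        clf = SVC(kernel="rbf").fit(X[train], y[train])
        correct += int(clf.predict(X[i:i + 1])[0] == y[i])
    return correct / len(X)

print("LOO-CV accuracy  :", sloo_cv_accuracy(X, y, coords, radius=0.0))
print("SLOO-CV accuracy :", sloo_cv_accuracy(X, y, coords, radius=100.0))
```

With spatially autocorrelated data, the non-spatial variant (radius 0) tends to report the optimistic accuracies the abstract describes, while the buffered variant removes that bias.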


2012 ◽  
Vol 229-231 ◽  
pp. 2276-2279
Author(s):  
Yu An Pan ◽  
Xuan Xiao ◽  
Pu Wang

Antimicrobial peptides (AMPs) are potent, broad-spectrum antibiotics which demonstrate potential as novel therapeutic agents. Because it is both time-consuming and laborious to identify new AMPs by experiment, this paper tries to address this problem by pattern recognition. Two major contributions are included. First, up to six kinds of physicochemical property values are selected to encode each AMP sequence as a physicochemical property matrix (PCM), and an auto- and cross-covariance transformation is then performed to extract features from the PCM for AMP sequence representation. Second, these feature vectors are input to a powerful Support Vector Machine (SVM) classifier for training and for the recognition of new query AMPs. For a newly constructed AMP benchmark dataset, an overall classification accuracy of about 96% has been achieved through rigorous leave-one-out cross-validation. For convenience, a user-friendly web server, AMPpred, has been established at http://icpr.jci.jx.cn/bioinfo/AMPpred. It is anticipated that this online predictor may become a useful bioinformatics tool for molecular biology and drug development. Its novel approach will also further stimulate the development of peptide attribute prediction.
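
A rough sketch of the feature pipeline, with randomly generated property scales and peptides in place of published physicochemical indices and the AMP benchmark; only the auto/cross-covariance (ACC) transform of the property matrix and the SVM with LOO cross-validation follow the description above.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(5)
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_PROPS = 6
# Stand-in physicochemical scales; a real predictor would use published property indices
PROP_TABLE = {aa: rng.normal(size=N_PROPS) for aa in AMINO_ACIDS}

def acc_features(seq, max_lag=4):
    """Auto/cross-covariance transform of the physicochemical property matrix (PCM)."""
    pcm = np.array([PROP_TABLE[aa] for aa in seq])          # shape (L, N_PROPS)
    pcm = pcm - pcm.mean(axis=0)                            # centre each property column
    feats = []
    for lag in range(1, max_lag + 1):
        for i in range(N_PROPS):
            for j in range(N_PROPS):
                feats.append(np.mean(pcm[:-lag, i] * pcm[lag:, j]))
    return np.array(feats)

# Toy dataset: random peptides labelled AMP (1) or non-AMP (0), purely for illustration
seqs = ["".join(rng.choice(list(AMINO_ACIDS), size=25)) for _ in range(60)]
labels = rng.integers(0, 2, size=60)
X = np.array([acc_features(s) for s in seqs])

scores = cross_val_score(SVC(kernel="rbf"), X, labels, cv=LeaveOneOut())
print("LOO-CV accuracy:", scores.mean())
```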


2012 ◽  
Vol 554-556 ◽  
pp. 1628-1631 ◽  
Author(s):  
Tian Hong Gu ◽  
Wei Lv ◽  
Xia Shao ◽  
Wen Cong Lu

Based on the element contents of N, O, H, and C of objects detected by γ-ray resonance, the support vector classification (SVC) method was used to construct a model for distinguishing high-energy materials (HEMs) from ordinary ones. The prediction accuracy was found to be 95.9% based on the leave-one-out cross-validation (LOOCV) test. The results indicate that the performance of the SVC model is good enough to detect HEMs in the presence of ordinary materials for the purpose of security checking.
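
A minimal illustration of the setup, assuming four element-content features (N, O, H, C) per object; the simulated compositions and labels below are placeholders for the γ-ray resonance measurements.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(6)
# Stand-in element-content fractions (N, O, H, C) per object; label 1 = high-energy material
X = rng.dirichlet(alpha=[2, 3, 4, 5], size=120)
y = rng.integers(0, 2, size=120)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
acc = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
print(f"LOOCV accuracy: {acc:.3f}")
```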


2021 ◽  
Vol 25 (Special) ◽  
pp. 1-127-1-137
Author(s):  
Nibras Z. Salih ◽  
Walaa Khalaf

In the multiple-instance learning framework, instances are arranged into bags; each bag contains several instances, and a label is available for each bag but not for the individual instances. In single-instance learning, by contrast, each instance is a single feature vector connected with its own label. This paper examines the distinction between these paradigms to see whether it is appropriate to cast the problem within a multiple-instance framework. In single-instance learning, two datasets are used (a students' dataset and the iris dataset) with the Naïve Bayes Classifier (NBC), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and Sequential Minimal Optimization (SMO), while SimpleMI, MIWrapper, and MIBoost are used in multiple-instance learning. Leave-One-Out Cross-Validation (LOOCV) and five- and ten-fold cross-validation (5-CV, 10-CV) are implemented to evaluate the classification results. A comparison of the results of these techniques is made, and several algorithms are found to be more effective for classification in the multiple-instance setting. The most suitable algorithm for the students' dataset is MIBoost with MLP under LOOCV, with an accuracy of 75%, whereas SimpleMI with SMO is the most suitable algorithm for the iris dataset under 10-CV, with an accuracy of 99.33%.
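
A small sketch of the SimpleMI-style reduction, which represents each bag by the mean of its instances so that a standard single-instance classifier can be applied; the bags and labels are synthetic, and scikit-learn's SVC stands in for WEKA's SMO trainer.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(7)
# Toy multiple-instance data: 40 bags, each with a variable number of 5-D instances
bags = [rng.normal(size=(rng.integers(3, 8), 5)) for _ in range(40)]
bag_labels = np.repeat([0, 1], 20)               # one label per bag, none per instance

# SimpleMI-style reduction: summarise each bag by the mean of its instances,
# turning the multiple-instance problem into a single-instance one
X_bag = np.array([bag.mean(axis=0) for bag in bags])

scores = cross_val_score(SVC(kernel="rbf"), X_bag, bag_labels, cv=10)
print("10-fold CV accuracy on the bag-level representation:", scores.mean())
```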


2018 ◽  
Vol 140 (4) ◽  
Author(s):  
Xueguan Song ◽  
Liye Lv ◽  
Jieling Li ◽  
Wei Sun ◽  
Jie Zhang

Hybrid or ensemble surrogate models developed in recent years have shown better accuracy than individual surrogate models. However, it is still challenging for hybrid surrogate models to always meet the accuracy, robustness, and efficiency requirements of many specific problems. In this paper, an advanced hybrid surrogate model, namely extended adaptive hybrid functions (E-AHF), is developed, which consists of two major components. The first part automatically filters out the poorly performing individual models and retains the appropriate ones based on the leave-one-out (LOO) cross-validation (CV) error. The second part calculates adaptive weight factors for each individual surrogate model based on the baseline model and the estimated mean square error of a Gaussian process prediction. A large set of numerical experiments consisting of up to 40 test problems, from one dimension to 16 dimensions, is used to verify the accuracy and robustness of the proposed model. The results show that both the accuracy and the robustness of E-AHF are remarkably improved compared with the individual surrogate models and multiple benchmark hybrid surrogate models. The computational time of E-AHF is also considerably reduced compared with other hybrid models.
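
A loose sketch of the two E-AHF steps under stated assumptions: the filter keeps models whose LOO CV error is within an assumed factor of the best, and the adaptive weights come from Gaussian processes fitted to each model's LOO errors, an illustrative stand-in for the paper's baseline-model and Gaussian-process mean-square-error scheme.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.svm import SVR

rng = np.random.default_rng(8)
X = rng.uniform(0, 1, size=(50, 3))
y = np.sum(np.sin(2 * np.pi * X), axis=1)                 # illustrative test function

candidates = {"kriging": GaussianProcessRegressor(),
              "rbf": KernelRidge(kernel="rbf"),
              "svr": SVR()}

# Step 1: filter out poorly performing models via their LOO cross-validation errors
loo_abs_err = {n: np.abs(cross_val_predict(m, X, y, cv=LeaveOneOut()) - y)
               for n, m in candidates.items()}
rmse = {n: np.sqrt(np.mean(e ** 2)) for n, e in loo_abs_err.items()}
kept = [n for n in candidates if rmse[n] <= 2.0 * min(rmse.values())]   # threshold assumed

# Step 2: adaptive (point-dependent) weights; each model's local error is estimated by
# a Gaussian process fitted to its LOO errors (not the paper's exact formulation)
for n in kept:
    candidates[n].fit(X, y)
err_models = {n: GaussianProcessRegressor().fit(X, loo_abs_err[n]) for n in kept}

def hybrid_predict(x_new):
    local_err = {n: max(err_models[n].predict(x_new)[0], 1e-6) for n in kept}
    w = {n: 1.0 / local_err[n] for n in kept}
    return sum(w[n] * candidates[n].predict(x_new)[0] for n in kept) / sum(w.values())

print(hybrid_predict(np.array([[0.2, 0.5, 0.8]])))
```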


2012 ◽  
Vol 542-543 ◽  
pp. 1438-1442
Author(s):  
Ting Hua Wang ◽  
Cai Yun Cai ◽  
Yan Liao

The kernel is a key component of support vector machines (SVMs) and other kernel methods. Based on the data distributions of the classes in the feature space, this paper proposes a model selection criterion to evaluate the goodness of a kernel in the multiclass classification scenario. This criterion is computationally efficient and is differentiable with respect to the kernel parameters. Compared with the k-fold cross-validation technique, which is often regarded as a benchmark, this criterion is found to yield about the same performance with much less computational overhead.
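
The paper's exact criterion is not given in the abstract; as an example of a distribution-based, CV-free kernel quality measure, the sketch below uses kernel-target alignment on the iris data and contrasts it with 5-fold cross-validation, which requires training an SVM per fold rather than a single kernel-matrix computation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

def kernel_target_alignment(K, y):
    """Alignment between a kernel matrix and the ideal class-membership kernel."""
    Y = (y[:, None] == y[None, :]).astype(float) * 2 - 1    # +1 same class, -1 otherwise
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

for g in [0.01, 0.1, 1.0, 10.0]:
    K = rbf_kernel(X, gamma=g)
    kta = kernel_target_alignment(K, y)
    cv_acc = cross_val_score(SVC(kernel="rbf", gamma=g), X, y, cv=5).mean()
    print(f"gamma={g:5}: alignment={kta:.3f}  5-fold CV accuracy={cv_acc:.3f}")
```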

