Unrestricted permutation forces extrapolation: variable importance requires at least one more model, or there is no free variable importance

2021 ◽  
Vol 31 (6) ◽  
Author(s):  
Giles Hooker ◽  
Lucas Mentch ◽  
Siyu Zhou

Abstract. This paper reviews and advocates against the use of permute-and-predict (PaP) methods for interpreting black box functions. Methods such as the variable importance measures proposed for random forests, partial dependence plots, and individual conditional expectation plots remain popular because they are both model-agnostic and depend only on the pre-trained model output, making them computationally efficient and widely available in software. However, numerous studies have found that these tools can produce diagnostics that are highly misleading, particularly when there is strong dependence among features. The purpose of our work here is to (i) review this growing body of literature, (ii) provide further demonstrations of these drawbacks along with a detailed explanation as to why they occur, and (iii) advocate for alternative measures that involve additional modeling. In particular, we describe how breaking dependencies between features in hold-out data places undue emphasis on sparse regions of the feature space by forcing the original model to extrapolate to regions where there is little to no data. We explore these effects across various model setups and find support for previous claims in the literature that PaP metrics can vastly over-emphasize correlated features in both variable importance measures and partial dependence plots. As an alternative, we discuss and recommend more direct approaches that involve measuring the change in model performance after muting the effects of the features under investigation.
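The extrapolation problem described above can be illustrated with a minimal Python sketch on synthetic data (variable and function names are hypothetical; this is not the authors' implementation). It compares a permute-and-predict importance with a retrain-based, leave-one-covariate-out alternative when two features are strongly correlated:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 500
# Two strongly correlated features plus an independent one.
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # corr(x1, x2) close to 1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x3 + 0.1 * rng.normal(size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
base_mse = mean_squared_error(y, model.predict(X))

def pap_importance(j):
    """Permute-and-predict: shuffle column j, measure the MSE increase."""
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    return mean_squared_error(y, model.predict(Xp)) - base_mse

def loco_importance(j):
    """Leave-one-covariate-out: refit without column j, compare MSE."""
    Xr = np.delete(X, j, axis=1)
    m = RandomForestRegressor(n_estimators=100, random_state=0).fit(Xr, y)
    return mean_squared_error(y, m.predict(Xr)) - base_mse

pap = [pap_importance(j) for j in range(3)]
loco = [loco_importance(j) for j in range(3)]
# Permuting x1 while keeping x2 fixed pushes the model off the data
# manifold, so PaP inflates its importance; LOCO stays near zero for
# either correlated feature alone because its twin carries the signal.
```

The contrast between `pap[0]` and `loco[0]` is the paper's central point: the retrain-based measure involves "at least one more model" but avoids forced extrapolation.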

Author(s):  
Tammy Jiang ◽  
Jaimie L Gradus ◽  
Timothy L Lash ◽  
Matthew P Fox

Abstract Although variables are often measured with error, the impact of measurement error on machine learning predictions is seldom quantified. The purpose of this study was to assess the impact of measurement error on random forest model performance and variable importance. First, we assessed the impact of misclassification (i.e., measurement error of categorical variables) of predictors on random forest model performance (e.g., accuracy, sensitivity) and variable importance (mean decrease in accuracy) using data from the United States National Comorbidity Survey Replication (2001–2003). Second, we simulated datasets in which the true model performance and variable importance measures are known, and verified that quantitative bias analysis recovered the truth in misclassified versions of the datasets. Our findings show that measurement error in the data used to construct random forests can distort model performance and variable importance measures, and that bias analysis can recover the correct results. This study highlights the utility of applying quantitative bias analysis in machine learning to quantify the impact of measurement error on study results.
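A minimal sketch of the kind of misclassification experiment described above, on synthetic data with assumed sensitivity and specificity values (not the study's actual analysis or its NCS-R data):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 2000
x_true = rng.integers(0, 2, size=n)            # true binary predictor
y = (x_true + 0.3 * rng.normal(size=n) > 0.5).astype(int)  # outcome

# Misclassify the predictor with assumed sensitivity 0.80, specificity 0.90.
se, sp = 0.80, 0.90
x_obs = x_true.copy()
x_obs[(x_true == 1) & (rng.random(n) > se)] = 0   # true 1 recorded as 0
x_obs[(x_true == 0) & (rng.random(n) > sp)] = 1   # true 0 recorded as 1

noise_col = rng.normal(size=n)                 # an uninformative covariate
X_true = np.column_stack([x_true, noise_col])
X_obs = np.column_stack([x_obs, noise_col])

acc_true = cross_val_score(RandomForestClassifier(random_state=0),
                           X_true, y, cv=5).mean()
acc_obs = cross_val_score(RandomForestClassifier(random_state=0),
                          X_obs, y, cv=5).mean()
# Misclassification of the predictor attenuates model accuracy; a bias
# analysis would use se/sp to reason back from acc_obs toward acc_true.
```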


2015 ◽  
Vol 15 (12) ◽  
pp. 2703-2713 ◽  
Author(s):  
C. Melchiorre ◽  
A. Tryggvason

Abstract. We refine and test an algorithm for landslide susceptibility assessment in areas with sensitive clays. The algorithm uses soil data and digital elevation models to identify areas that may be prone to landslides and has been applied in Sweden for several years. It is computationally efficient and includes an intelligent filtering procedure for identifying and removing small-scale artifacts in the hazard maps produced. Where information on bedrock depth is available, it can be included in the analysis, as can several soil-type-based cross-sectional angle thresholds for slip. We evaluate how processing choices, such as the filtering parameters, local cross-sectional angle thresholds, and the inclusion of bedrock depth information, affect model performance. The specific cross-sectional angle thresholds used were derived by analyzing the relationship between landslide scarps and the quick-clay susceptibility index (QCSI). We tested the algorithm in the Göta River valley. Several different verification measures were used to compare results with observed landslides and thereby identify the optimal algorithm parameters. Our results show that even though a relationship between the cross-sectional angle threshold and the QCSI could be established, no significant improvement in overall modeling performance could be achieved by using these geographically specific, soil-based thresholds. Our results indicate that lowering the cross-sectional angle threshold from 1 : 10 (the general value used in Sweden) to 1 : 13 improves results slightly. We also show that applying the automatic filtering procedure, which removes areas initially classified as prone to landslides, not only removes artifacts and makes the maps visually more appealing but also improves model performance.
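The cross-sectional angle threshold amounts to a simple rise-over-run test; a tiny sketch (hypothetical function name; the 1 : 10 and 1 : 13 ratios are the values quoted above):

```python
def prone_to_slip(height_drop_m, horizontal_dist_m, threshold_ratio=1/13):
    """Flag a terrain profile as potentially landslide-prone when its
    cross-sectional angle (vertical drop over horizontal distance)
    exceeds the threshold ratio. 1:10 is the general value used in
    Sweden; 1:13 performed slightly better in the Göta River study."""
    return height_drop_m / horizontal_dist_m > threshold_ratio

# A 1 m drop over 12 m of horizontal distance (ratio 1:12) exceeds the
# 1:13 threshold but not the stricter 1:10 threshold, so the lowered
# threshold classifies more terrain as susceptible.
flagged_at_1_13 = prone_to_slip(1, 12, 1/13)
flagged_at_1_10 = prone_to_slip(1, 12, 1/10)
```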


2021 ◽  
Vol 9 ◽  
Author(s):  
Xiangwan Fu ◽  
Mingzhu Tang ◽  
Dongqun Xu ◽  
Jun Yang ◽  
Donglin Chen ◽  
...  

Aiming at the difficulty of modeling the nonlinear relations in steam coal data, this article proposes a forecasting method for the price of steam coal based on robust regularized kernel regression and empirical mode decomposition. A robust regularized kernel regression model is constructed from a polynomial kernel function, a robust loss function, and an L2 regularization term. The polynomial kernel function does not depend on kernel parameters and can mine global rules in the dataset, which improves the forecasting stability of the kernel model. The method uses the polynomial kernel to map the features into a high-dimensional space, transforming the nonlinear laws of the original feature space into linear laws in the high-dimensional space, where they can be learned by a linear model. The Huber loss function is selected to reduce the influence of abnormal noise in the dataset on model performance, and the L2 regularization term reduces the risk of overfitting. A combined model based on empirical mode decomposition (EMD) and the autoregressive integrated moving average (ARIMA) model compensates for the error of the robust regularized kernel regression model, making up for the limitations of a single forecasting model. Finally, the proposed model is verified on a steam coal dataset and achieves the best values of evaluation indices such as RMSE, MAE, and mean absolute percentage error (MAPE) among the compared models.
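The robust regularized regression component can be approximated, as a sketch, by an explicit polynomial feature map followed by an L2-regularized Huber-loss linear model (here via scikit-learn's SGDRegressor on synthetic data; an illustrative stand-in, not the authors' implementation, and without the EMD-ARIMA error-compensation stage):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, 1))
y = 2 * x[:, 0] ** 2 + x[:, 0] + 0.1 * rng.normal(size=200)
y[:5] += 10  # a few gross outliers, standing in for abnormal noise

model = make_pipeline(
    PolynomialFeatures(degree=2),   # explicit polynomial feature map:
                                    # nonlinear in x, linear in the new space
    StandardScaler(),
    SGDRegressor(loss="huber",      # Huber loss bounds outlier influence
                 penalty="l2",      # L2 regularization against overfitting
                 alpha=1e-4, epsilon=1.0,
                 max_iter=5000, random_state=0),
)
model.fit(x, y)
pred = model.predict(np.array([[0.5]]))  # true value is 2*0.25 + 0.5 = 1.0
```

Despite the outliers, the Huber fit stays close to the underlying quadratic, which a squared-error fit would not.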


Author(s):  
Benjamin A Goldstein ◽  
Eric C Polley ◽  
Farren B. S. Briggs

The Random Forests (RF) algorithm has become a commonly used machine learning algorithm for genetic association studies. It is well suited to genetic applications since it is both computationally efficient and models genetic causal mechanisms well. With its growing ubiquity, however, RF has seen inconsistent and less-than-optimal use in the literature. The purpose of this review is to break down the theoretical and statistical basis of RF so that practitioners are able to apply it in their work. An emphasis is placed on showing how the various components contribute to bias and variance, as well as on discussing variable importance measures. Applications specific to genetic studies are highlighted. To provide context, RF is compared to other commonly used machine learning algorithms.
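A minimal sketch of the typical genetic-association workflow the review addresses, on simulated genotypes (allele frequencies, effect sizes, and the causal locus are all made up for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n, p = 600, 20
# Genotypes coded as 0/1/2 copies of the minor allele (assumed MAF 0.3).
G = rng.binomial(2, 0.3, size=(n, p))
# SNP index 3 is causal; the remaining 19 are noise.
logit = 1.2 * (G[:, 3] - 1)
prob = 1 / (1 + np.exp(-logit))
y = rng.binomial(1, prob)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(G, y)
imp = rf.feature_importances_        # impurity-based variable importance
top_snp = int(np.argmax(imp))        # should recover the causal SNP
```

With a single strong causal locus, the impurity-based importance ranks it first; the review's point is that such measures need careful interpretation when loci are correlated (in linkage disequilibrium).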


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Jimmy C. Azar ◽  
Martin Simonsson ◽  
Ewert Bengtsson ◽  
Anders Hast

Comparing the staining patterns of paired antibodies designed against the same protein but toward different epitopes provides quality control over binding and over the antibodies' ability to identify the target protein correctly and exclusively. We present a method for automated quantification of immunostaining patterns for antibodies in breast tissue using the Human Protein Atlas database. In such tissue, the dark brown dye 3,3′-diaminobenzidine (DAB) is used as an antibody-specific stain, whereas the blue dye hematoxylin is used as a counterstain. The proposed method is based on clustering and relative scaling of features following principal component analysis. Our method is able (1) to accurately segment and identify staining patterns and quantify the amount of staining and (2) to detect paired antibodies by correlating the segmentation results among different cases. Moreover, the method is simple, operates in a low-dimensional feature space, and is computationally efficient, which makes it suitable for high-throughput processing of tissue microarrays.
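A minimal sketch of the clustering-after-PCA idea on synthetic pixel colors (the RGB means below are rough stand-ins for brown DAB and blue hematoxylin, not values from the Human Protein Atlas):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic RGB pixels: brown DAB-like vs. blue hematoxylin-like.
dab = np.clip(rng.normal([0.55, 0.35, 0.15], 0.05, size=(300, 3)), 0, 1)
hem = np.clip(rng.normal([0.30, 0.35, 0.65], 0.05, size=(300, 3)), 0, 1)
pixels = np.vstack([dab, hem])

# Project to a low-dimensional space, then cluster the stain classes.
scores = PCA(n_components=2).fit_transform(pixels)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)

# Staining quantification: fraction of pixels in the DAB cluster
# (identified here as the majority label among the known DAB pixels).
dab_label = int(np.bincount(labels[:300]).argmax())
stain_fraction = float(np.mean(labels == dab_label))
```

Because the two stains occupy well-separated color regions, the low-dimensional clustering recovers them cleanly, which is what keeps the method computationally cheap.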


Author(s):  
Xin Wu ◽  
Yaoyu Li

When an air compressor is operated at a very low flow rate for a given discharge pressure, surge may occur, resulting in large oscillations in pressure and flow in the compressor. To prevent surge-induced damage to the compressor, the control strategy employed is typically to operate it below the surge line (a map of the conditions at which surge begins). The surge line is strongly affected by ambient air conditions. Previous research derived data-driven surge maps based on an asymmetric support vector machine (ASVM), which penalizes the surge case with much greater cost to minimize the possibility of undetected surge. This paper concerns the development of adaptive ASVM-based self-learning surge map modeling, combined with signal processing techniques for surge detection. During actual operation of a compressor, after the ASVM-based surge map has been obtained from historical data, new surge points can be identified with surge detection methods such as the short-time Fourier transform or wavelet transform and used to update the surge map. However, with an increasing number of surge points, the complexity of the support vector machine (SVM) would grow dramatically. In order to keep the surge map SVM at a relatively low dimension, an adaptive SVM modeling algorithm is developed that selects the minimum set of necessary support vectors in a three-dimensional feature space, based on Gaussian curvature, to guarantee a desirable classification between surge and nonsurge areas. The proposed method is validated on surge test data obtained from a testbed compressor at a manufacturing plant.
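The asymmetric penalty of the ASVM can be sketched with scikit-learn's `class_weight` mechanism on synthetic operating points (the 20x cost ratio, feature ranges, and surge boundary below are assumed values, not the paper's):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic operating points: (mass flow, pressure ratio), both normalized.
flow = rng.uniform(0, 1, size=400)
pr = rng.uniform(0, 1, size=400)
X = np.column_stack([flow, pr])
# Surge occurs at low flow; boundary has some measurement noise.
surge = (flow < 0.25 + 0.1 * pr + 0.03 * rng.normal(size=400)).astype(int)

# Penalize a missed surge point 20x more than a false alarm, pushing the
# decision boundary conservatively into the non-surge region.
asvm = SVC(kernel="rbf", class_weight={1: 20.0, 0: 1.0}).fit(X, surge)
sym = SVC(kernel="rbf").fit(X, surge)

# The asymmetric model should leave fewer surge points undetected.
miss_asym = float(np.mean(asvm.predict(X[surge == 1]) == 0))
miss_sym = float(np.mean(sym.predict(X[surge == 1]) == 0))
```

The trade-off is a larger false-alarm region, which is acceptable here because an undetected surge is far more costly than a conservative operating limit.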


2021 ◽  
Author(s):  
Anne Tryphosa Kamatham ◽  
Meena Alzamani ◽  
Allison Dockum ◽  
Siddhartha Sikdar ◽  
Biswarup Mukherjee

Noninvasive methods for estimation of joint and muscle forces have widespread clinical and research applications. Surface electromyography or sEMG provides a measure of the neural activation of muscles which can be used to estimate the force produced by the muscle. However, sEMG based measures of force suffer from poor signal-to-noise ratio and limited spatiotemporal specificity. In this paper, we propose an ultrasound imaging or sonomyography-based approach for estimating continuous isometric force from a sparse set of ultrasound scanlines. Our approach isolates anatomically relevant features from A-mode ultrasound signals, greatly reducing the dimensionality of the feature space and the computational complexity involved in traditional ultrasound-based methods. We evaluate the performance of four regression methodologies for force prediction using the reduced feature set. We also evaluate the feasibility of a practical wearable sonomyography-based system by simulating the effect of transducer placement and varying the number of transducers used in force prediction. Our results demonstrate that Gaussian process regression models outperform other regression methods in predicting continuous force levels from just four equispaced transducers and are tolerant to speckle noise. These findings will aid in the design of wearable sonomyography-based force prediction systems with robust, computationally efficient operation.
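A minimal sketch of Gaussian process regression for continuous force prediction from a few transducer features (the feature model below is synthetic and hypothetical, not the paper's A-mode ultrasound data):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
# Assumed setup: one scalar echo feature per transducer, varying with
# isometric force (expressed in %MVC), plus independent channel noise.
force = rng.uniform(0, 100, size=120)
features = np.column_stack(
    [force / 100 + 0.05 * rng.normal(size=120) for _ in range(4)]
)

# RBF kernel for the smooth force-feature relationship; WhiteKernel
# absorbs speckle-like measurement noise.
kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-2)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gpr.fit(features, force)

pred, std = gpr.predict(features, return_std=True)
rmse = float(np.sqrt(np.mean((pred - force) ** 2)))
```

Beyond accuracy, the posterior standard deviation `std` gives a per-sample confidence estimate, one reason Gaussian processes suit a wearable system where transducer placement may vary.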


2016 ◽  
Author(s):  
Kathleen A. Mar ◽  
Narendra Ojha ◽  
Andrea Pozzer ◽  
Tim M. Butler

Abstract. We present an evaluation of the online regional model WRF-Chem over Europe with a focus on ground-level ozone (O3) and nitrogen oxides (NOx). The model performance is evaluated for two chemical mechanisms, MOZART-4 and RADM2, for year-long simulations. Model-predicted surface meteorological variables (e.g., temperature, wind speed and direction) compared well overall with surface-based observations, consistent with other WRF studies. WRF-Chem simulations employing MOZART-4 as well as RADM2 chemistry were found to reproduce the observed spatial variability in surface ozone over Europe. However, the absolute O3 concentrations predicted by the two chemical mechanisms were found to be quite different, with MOZART-4 predicting O3 concentrations up to 20 μg m−3 greater than RADM2 in summer. Compared to observations, MOZART-4 chemistry overpredicted O3 concentrations for most of Europe in the summer and fall, with a summertime domain-wide mean bias of +10 μg m−3 against observations from the AirBase network. In contrast, RADM2 chemistry generally led to an underestimation of O3 over the European domain in all seasons. We found that the use of the MOZART-4 mechanism, evaluated here for the first time for a European domain, led to lower absolute biases than RADM2 when compared to ground-based observations. The two mechanisms show relatively similar behavior for NOx, with both MOZART-4 and RADM2 resulting in a slight underestimation of NOx compared to surface observations. Further investigation into the differences between the two mechanisms revealed that the net midday photochemical production rate of O3 in summer is higher for MOZART-4 than for RADM2 for most of the domain. The largest differences in O3 production can be seen over Germany, where net O3 production in MOZART-4 is seen to be higher than in RADM2 by 1.8 ppb hr−1 (3.6 μg m−3 hr−1) or more. 
We also show that, while the two mechanisms exhibit similar NOx sensitivity, RADM2 is approximately twice as sensitive as MOZART-4 to increases in anthropogenic VOC emissions. Additionally, we found that differences in the reaction rate constants for inorganic gas-phase chemistry in MOZART-4 vs. RADM2 accounted for a difference of 8 μg m−3 in predicted O3, whereas differences in the deposition and photolysis schemes explained smaller differences in O3. Our results highlight the strong dependence of modeled surface O3 over Europe on the choice of gas-phase chemical mechanism, which we discuss in the context of overall uncertainties in the prediction of ground-level O3 and its associated health impacts (via the health-related metrics MDA8 and SOMO35).
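The paired ppb and μg m−3 figures quoted above follow from the ideal gas law: for ozone (molar mass 48 g mol−1), 1 ppb corresponds to roughly 2 μg m−3 near surface conditions. A small sketch (the 20 °C, 1013.25 hPa reference conditions are assumed values):

```python
R = 8.314462       # universal gas constant, J mol-1 K-1
M_O3 = 48.0        # molar mass of ozone, g mol-1

def ppb_to_ugm3(ppb, temp_k=293.15, press_pa=101325.0):
    """Convert an O3 mixing ratio in ppb to a mass concentration in
    ug/m3 via the ideal gas molar volume (~24.05 L/mol at 20 C)."""
    molar_volume_l = 1000.0 * R * temp_k / press_pa  # L mol-1
    return ppb * M_O3 / molar_volume_l

factor = ppb_to_ugm3(1.0)   # ~2 ug/m3 per ppb for ozone, consistent
                            # with the 1.8 ppb ~ 3.6 ug/m3 pairing above
```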

