The Holdout Randomization Test for Feature Selection in Black Box Models

Author(s):  
Wesley Tansey ◽  
Victor Veitch ◽  
Haoran Zhang ◽  
Raul Rabadan ◽  
David M. Blei

Energies ◽  
2021 ◽  
Vol 14 (23) ◽  
pp. 7865
Author(s):  
Saeid Shahpouri ◽  
Armin Norouzi ◽  
Christopher Hayduk ◽  
Reza Rezaei ◽  
Mahdi Shahbakhti ◽  
...  

The standards for emissions from diesel engines are becoming more stringent, and accurate emission modeling is crucial for controlling the engine to meet these standards. Soot emissions are formed through a complex process and are challenging to model. A comprehensive analysis of diesel engine soot emissions modeling for control applications is presented in this paper. Physical, black-box, and gray-box models are developed for soot emissions prediction. Additionally, different feature sets based on the least absolute shrinkage and selection operator (LASSO) feature selection method and physical knowledge are examined to develop computationally efficient soot models with good precision. The physical model is a virtual engine modeled in GT-Power software that is parameterized using a portion of the experimental data. Different machine learning methods, including Regression Tree (RT), Ensemble of Regression Trees (ERT), Support Vector Machines (SVM), Gaussian Process Regression (GPR), Artificial Neural Network (ANN), and Bayesian Neural Network (BNN), are used to develop the black-box models. The gray-box models combine the physical and black-box models. A total of five feature sets and eight different machine learning methods are tested. The accuracy, training time, and test time of the models are analyzed using the K-means clustering algorithm, which provides a systematic way of categorizing the feature sets and methods based on their performance and of selecting the best method for a specific application. According to the analysis, the black-box model consisting of GPR with feature selection by LASSO shows the best performance, with a test R² of 0.96. The best gray-box model combines an SVM-based method with the physical-insight feature set and LASSO feature selection, with a test R² of 0.97.
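The LASSO feature selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the penalty `lam=0.5` is an arbitrary choice, and the function names (`make_synthetic_data`, `lasso_coordinate_descent`, `select_features`) are hypothetical. The sketch fits a LASSO model by cyclic coordinate descent and keeps only the features whose coefficients remain nonzero.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of rows. No intercept is fitted, so the data are assumed
    to be roughly centered.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    # Mean squared value of each column, (1/n) * x_j^T x_j.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    residual = list(y)  # equals y - Xw because w starts at zero
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def select_features(w, tol=1e-8):
    """Indices of the features whose LASSO coefficient is nonzero."""
    return [j for j, wj in enumerate(w) if abs(wj) > tol]


def make_synthetic_data(n=200, p=6, seed=0):
    """Hypothetical data: only features 0 and 3 influence the target."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0.0, 0.1) for row in X]
    return X, y


X, y = make_synthetic_data()
weights = lasso_coordinate_descent(X, y, lam=0.5)
print("selected feature indices:", select_features(weights))
```

In a pipeline like the one the abstract describes, the retained columns would then be passed to a regressor such as GPR or SVM. In practice the penalty strength would normally be chosen by cross-validation rather than fixed by hand.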


Energies ◽  
2020 ◽  
Vol 13 (24) ◽  
pp. 6749
Author(s):  
Reda El Bechari ◽  
Stéphane Brisset ◽  
Stéphane Clénet ◽  
Frédéric Guyomarch ◽  
Jean Claude Mipo

Metamodels have proved to be a very efficient strategy for optimizing expensive black-box models, e.g., finite element simulations of electromagnetic devices. They reduce the computational burden of optimization. However, the conventional approach of using metamodels has limitations, such as the cost of fitting the metamodel and of solving the infill-criterion problem. This paper proposes a new algorithm that combines metamodels with a branch and bound (B&B) strategy. Because the efficiency of the B&B algorithm relies on the estimation of the bounds, we investigated using the prediction error given by the metamodels to predict the bounds. This combination leads to high-fidelity global solutions. We propose a comparison protocol to assess the approach’s performance with respect to that of other algorithms of different categories. Then, two electromagnetic optimization benchmarks are treated. This paper gives practical insights into algorithms that can be used when optimizing electromagnetic devices.


We provide a framework for investment managers to create dynamic pretrade models. The approach helps market participants shed light on vendor black-box models that often do not provide any transparency into the model’s functional form or working mechanics. In addition, this allows portfolio managers to create consensus estimates based on their own expectations, such as forecasted liquidity and volatility, and to incorporate firm proprietary alpha estimates into the solution. These techniques allow managers to reduce overdependency on any one black-box model, incorporate costs into the stock selection and portfolio optimization phase of the investment cycle, and perform “what-if” and sensitivity analyses without the risk of information leakage to any outside party or vendor.


Author(s):  
Kacper Sokol ◽  
Peter Flach

Understanding data, models and predictions is important for machine learning applications. Due to the limitations of our spatial perception and intuition, analysing high-dimensional data is inherently difficult. Furthermore, black-box models achieving high predictive accuracy are widely used, yet the logic behind their predictions is often opaque. Use of textualisation -- a natural language narrative of selected phenomena -- can tackle these shortcomings. When extended with argumentation theory we could envisage machine learning models and predictions arguing persuasively for their choices.


Author(s):  
Marjan Popov ◽  
Bjørn Gustavsen ◽  
Juan A. Martinez-Velasco

Voltage surges arising from transient events, such as switching operations or lightning discharges, are one of the main causes of transformer winding failure. The voltage distribution along a transformer winding depends greatly on the waveshape of the voltage applied to the winding. This distribution is not uniform in the case of steep-fronted transients since a large portion of the applied voltage is usually concentrated on the first few turns of the winding. High frequency electromagnetic transients in transformers can be studied using internal models (i.e., models for analyzing the propagation and distribution of the incident impulse along the transformer windings), and black-box models (i.e., models for analyzing the response of the transformer from its terminals and for calculating voltage transfer). This chapter presents a summary of the most common models developed for analyzing the behaviour of transformers subjected to steep-fronted waves and a description of procedures for determining the parameters to be specified in those models. The main section details some test studies based on actual transformers in which models are validated by comparing simulation results to laboratory measurements.


Author(s):  
Evren Daglarli

Today, the effects of promising technologies such as explainable artificial intelligence (xAI) and meta-learning (ML) on the internet of things (IoT) and cyber-physical systems (CPS), which are important components of Industry 4.0, are steadily intensifying. However, current deep learning models have important shortcomings. These artificial neural network based models are black box models that generalize from the data transmitted to them and learn from that data. Therefore, the relational link between input and output is not observable. For these reasons, serious efforts are needed on the explainability and interpretability of black box models. In the near future, the integration of explainable artificial intelligence and meta-learning approaches into cyber-physical systems will affect high-level virtualization and simulation infrastructure, real-time supply chains, cyber factories with smart machines communicating over the internet, the maximization of production efficiency, and the analysis of service quality and competition level.

