Towards Interpretation of Pairwise Learning

Recently, there are increasingly more attentions paid to an important family of learning problems called pairwise learning, in which the associated loss functions depend on pairs of instances. Despite the tremendous success of pairwise learning in many real-world applications, the lack of transparency behind the learned pairwise models makes it difficult for users to understand how particular decisions are made by these models, which further impedes users from trusting the predicted results. To tackle this problem, in this paper, we study feature importance scoring as a specific approach to the problem of interpreting the predictions of black-box pairwise models. Specifically, we first propose a novel adaptive Shapley-value-based interpretation method, based on which a vector of importance scores associated with the underlying features of a testing instance pair can be adaptively calculated with the consideration of feature correlations, and these scores can be used to indicate which features make key contributions to the final prediction. Considering that Shapley-value-based methods are usually computationally challenging, we further propose a novel robust approximation interpretation method for pairwise models. This method is not only much more efficient but also robust to data noise. To the best of our knowledge, we are the first to investigate how to enable interpretation in pairwise learning. Theoretical analysis and extensive experiments demonstrate the effectiveness of the proposed methods.

Download Full-text

Cost-sensitive design of error correcting output codes

Proceedings of the Institution of Mechanical Engineers Part C Journal of Mechanical Engineering Science ◽

10.1177/0954406217709303 ◽

2017 ◽

Vol 232 (10) ◽

pp. 1871-1881

Author(s):

Aijun Xue ◽

Xiaodan Wang

Keyword(s):

Classification Problem ◽

Learning Problems ◽

Classification Problems ◽

Cost Sensitive Learning ◽

Misclassification Costs ◽

Real World Applications ◽

Cost Sensitive Classification ◽

Comparable Performance ◽

The Given ◽

Error Correcting Output Codes

Many real world applications involve multiclass cost-sensitive learning problems. However, some well-worked binary cost-sensitive learning algorithms cannot be extended into multiclass cost-sensitive learning directly. It is meaningful to decompose the complex multiclass cost-sensitive classification problem into a series of binary cost-sensitive classification problems. So, in this paper we propose an alternative and efficient decomposition framework, using the original error correcting output codes. The main problem in our framework is how to evaluate the binary costs for each binary cost-sensitive base classifier. To solve this problem, we proposed to compute the expected misclassification costs starting from the given multiclass cost matrix. Furthermore, the general formulations to compute the binary costs are given. Experimental results on several synthetic and UCI datasets show that our method can obtain comparable performance in comparison with the state-of-the-art methods.

Download Full-text

Supplementary material to "Gaining Hydrological Insights Through Wilk's Feature Importance: A Test-Statistic Interpretation method for Reliable and Robust Inference"

10.5194/hess-2021-65-supplement ◽

2021 ◽

Author(s):

Kailong Li ◽

Guohe Huang ◽

Brian Baetz

Keyword(s):

Robust Inference ◽

Test Statistic ◽

Feature Importance ◽

Supplementary Material ◽

Interpretation Method

Download Full-text

Permutation-based Identification of Important Biomarkers for Complex Diseases via Black-box Models

10.1101/2020.04.27.064170 ◽

2020 ◽

Author(s):

Xinlei Mi ◽

Baiming Zou ◽

Fei Zou ◽

Jianhua Hu

Keyword(s):

Human Disease ◽

Molecular Mechanisms ◽

Black Box ◽

The Cancer Genome Atlas ◽

Human Diseases ◽

Support Vector ◽

Individual Feature ◽

Box Models ◽

Feature Importance ◽

Black Box Models

AbstractStudy of human disease remains challenging due to convoluted disease etiologies and complex molecular mechanisms at genetic, genomic, and proteomic levels. Many machine learning-based methods, including deep learning and random forest, have been developed and widely used to alleviate some analytic challenges in complex human disease studies. While enjoying the modeling flexibility and robustness, these model frameworks suffer from non-transparency and difficulty in interpreting the role of each individual feature due to their intrinsic black-box natures. However, identifying important biomarkers associated with complex human diseases is a critical pursuit towards assisting researchers to establish novel hypotheses regarding prevention, diagnosis and treatment of complex human diseases. Herein, we propose a Permutation-based Feature Importance Test (PermFIT) for estimating and testing the feature importance, and for assisting interpretation of individual feature in various black-box frameworks, including deep neural networks, random forests, and support vector machines. PermFIT (available at https://github.com/SkadiEye/deepTL) is implemented in a computationally efficient manner, without model refitting for each permuted data. We conduct extensive numerical studies under various scenarios, and show that PermFIT not only yields valid statistical inference, but also helps to improve the prediction accuracy of black-box models with top selected features. With the application to the Cancer Genome Atlas (TCGA) kidney tumor data and the HITChip atlas BMI data, PermFIT clearly demonstrates its practical usage in identifying important biomarkers and boosting performance of black-box predictive models.

Download Full-text

Optimizing for Interpretability in Deep Neural Networks with Tree Regularization

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.12558 ◽

2021 ◽

Vol 72 ◽

pp. 1-37

Author(s):

Mike Wu ◽

Sonali Parbhoo ◽

Michael C. Hughes ◽

Volker Roth ◽

Finale Doshi-Velez

Keyword(s):

Neural Networks ◽

Predictive Power ◽

Deep Neural Networks ◽

Large Body ◽

Black Box ◽

New Family ◽

Deep Model ◽

Real World Applications ◽

Adversarial Examples ◽

Key Barrier

Deep models have advanced prediction in many domains, but their lack of interpretability remains a key barrier to the adoption in many real world applications. There exists a large body of work aiming to help humans understand these black box functions to varying levels of granularity – for example, through distillation, gradients, or adversarial examples. These methods however, all tackle interpretability as a separate process after training. In this work, we take a different approach and explicitly regularize deep models so that they are well-approximated by processes that humans can step through in little time. Specifically, we train several families of deep neural networks to resemble compact, axis-aligned decision trees without significant compromises in accuracy. The resulting axis-aligned decision functions uniquely make tree regularized models easy for humans to interpret. Moreover, for situations in which a single, global tree is a poor estimator, we introduce a regional tree regularizer that encourages the deep model to resemble a compact, axis-aligned decision tree in predefined, human-interpretable contexts. Using intuitive toy examples, benchmark image datasets, and medical tasks for patients in critical care and with HIV, we demonstrate that this new family of tree regularizers yield models that are easier for humans to simulate than L1 or L2 penalties without sacrificing predictive power.

Download Full-text

PAC-Bayes Unleashed: Generalisation Bounds with Unbounded Losses

Entropy ◽

10.3390/e23101330 ◽

2021 ◽

Vol 23 (10) ◽

pp. 1330

Author(s):

Maxime Haddouche ◽

Benjamin Guedj ◽

Omar Rivasplata ◽

John Shawe-Taylor

Keyword(s):

Linear Regression ◽

Supervised Learning ◽

Loss Function ◽

Loss Functions ◽

Learning Problems ◽

Regression Problem ◽

Learning Framework ◽

Actual Computation ◽

Linear Regression Problem

We present new PAC-Bayesian generalisation bounds for learning problems with unbounded loss functions. This extends the relevance and applicability of the PAC-Bayes learning framework, where most of the existing literature focuses on supervised learning problems with a bounded loss function (typically assumed to take values in the interval [0;1]). In order to relax this classical assumption, we propose to allow the range of the loss to depend on each predictor. This relaxation is captured by our new notion of HYPothesis-dependent rangE (HYPE). Based on this, we derive a novel PAC-Bayesian generalisation bound for unbounded loss functions, and we instantiate it on a linear regression problem. To make our theory usable by the largest audience possible, we include discussions on actual computation, practicality and limitations of our assumptions.

Download Full-text

Unsupervised Quality Estimation for Neural Machine Translation

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00330 ◽

2020 ◽

Vol 8 ◽

pp. 539-555

Author(s):

Marina Fomicheva ◽

Shuo Sun ◽

Lisa Yankovskaya ◽

Frédéric Blain ◽

Francisco Guzmán ◽

...

Keyword(s):

Machine Translation ◽

Real World ◽

State Of The Art ◽

Black Box ◽

Test Time ◽

Quality Estimation ◽

Neural Machine Translation ◽

Real World Applications ◽

Unsupervised Approach

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it is aimed to inform the user on the quality of the MT output at test time. Existing approaches require large amounts of expert annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Different from most of the current work that treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By utilizing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivaling state-of-the-art supervised QE models. To evaluate our approach we collect the first dataset that enables work on both black-box and glass-box approaches to QE.

Download Full-text

A Variable Impacts Measurement in Random Forest for Mobile Cloud Computing

Wireless Communications and Mobile Computing ◽

10.1155/2017/6817627 ◽

2017 ◽

Vol 2017 ◽

pp. 1-13 ◽

Cited By ~ 7

Author(s):

Jae-Hee Hur ◽

Sun-Young Ihm ◽

Young-Ho Park

Keyword(s):

Cloud Computing ◽

Random Forest ◽

Shapley Value ◽

Classification Accuracy ◽

Mobile Cloud Computing ◽

Black Box ◽

Box Model ◽

Mobile Cloud ◽

Random Forest Algorithm ◽

Variable Impact

Recently, the importance of mobile cloud computing has increased. Mobile devices can collect personal data from various sensors within a shorter period of time and sensor-based data consists of valuable information from users. Advanced computation power and data analysis technology based on cloud computing provide an opportunity to classify massive sensor data into given labels. Random forest algorithm is known as black box model which is hardly able to interpret the hidden process inside. In this paper, we propose a method that analyzes the variable impact in random forest algorithm to clarify which variable affects classification accuracy the most. We apply Shapley Value with random forest to analyze the variable impact. Under the assumption that every variable cooperates as players in the cooperative game situation, Shapley Value fairly distributes the payoff of variables. Our proposed method calculates the relative contributions of the variables within its classification process. In this paper, we analyze the influence of variables and list the priority of variables that affect classification accuracy result. Our proposed method proves its suitability for data interpretation in black box model like a random forest so that the algorithm is applicable in mobile cloud computing environment.

Download Full-text

Visualizing the Feature Importance for Black Box Models

Machine Learning and Knowledge Discovery in Databases - Lecture Notes in Computer Science ◽

10.1007/978-3-030-10925-7_40 ◽

2019 ◽

pp. 655-670 ◽

Cited By ~ 5

Author(s):

Giuseppe Casalicchio ◽

Christoph Molnar ◽

Bernd Bischl

Keyword(s):

Black Box ◽

Box Models ◽

Feature Importance ◽

Black Box Models

Download Full-text

Online pairwise learning algorithms with convex loss functions

Information Sciences ◽

10.1016/j.ins.2017.04.022 ◽

2017 ◽

Vol 406-407 ◽

pp. 57-70 ◽

Cited By ~ 6

Author(s):

Junhong Lin ◽

Yunwen Lei ◽

Bo Zhang ◽

Ding-Xuan Zhou

Keyword(s):

Learning Algorithms ◽

Loss Functions ◽

Pairwise Learning ◽

Convex Loss

Download Full-text

A Time Series Forecasting with different Visualization Modes of COVID-19 Cases throughout the World

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.34958 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 505-510

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Forecasting ◽

Learning Problems ◽

Predictive Analysis ◽

Data Modelling ◽

Medical Field ◽

Processing Data ◽

The World ◽

Real World Applications

In real world applications, one of the prosperous field of science is time series forecasting due to its recognition though having some challenges in the development of methods. In medical field, time series forecasting models have been successfully used in various applications to predict progress of the disease, measure the risk dependent on time and the mortality rate. However due to the availability of many techniques which excel in each of a particular scenario, choosing an appropriate model has become challenging. When a huge dataset is considered it is obvious that machine learning is the best way to perform predictive analysis or pattern recognition tasks on the data. Before machine learning can be used, the time series forecasting problems should be reframed into supervised learning problems. The purpose of machine learning in this field is also to tackle the different challenges like data pre-processing, data modelling, training and any other refinement required with respect to the actual data. This paper deals with the predictive analysis and various visualization applications for time series forecasting of COVID- 19 patients throughout the world.

Download Full-text