Explanations for Data Repair Through Shapley Values

2021 ◽  
Author(s):  
Daniel Deutch ◽  
Nave Frost ◽  
Amir Gilad ◽  
Oren Sheffer
2021 ◽  
Vol 40 (1) ◽  
pp. 235-250
Author(s):  
Liuxin Chen ◽  
Nanfang Luo ◽  
Xiaoling Gou

In real multi-criteria group decision making (MCGDM) problems, the decision makers (DMs) interact with one another. To capture each DM’s overall influence, we define the DM’s weight as its Shapley value. Entropy is better suited than similarity measures for characterizing a group decision making problem, so we propose a relative entropy to measure the difference between two systems, which improves the accuracy of the distance measure. In this paper, an MCGDM approach based on TODIM is presented under q-rung orthopair fuzzy information. The proposed TODIM approach is developed for correlative MCGDM problems, in which the weights of the DMs are calculated in terms of Shapley values and the dominance matrices are evaluated based on the relative entropy measure with q-rung orthopair fuzzy information. Furthermore, the efficacy of the proposed Gq-ROFWA operator and the novel TODIM approach is demonstrated through a risk investment selection problem for modern enterprises. A comparative analysis with existing methods is presented to validate the efficiency of the approach.


2012 ◽  
Vol 7 (2) ◽  
pp. 169-180 ◽  
Author(s):  
Victor Ginsburgh ◽  
Israël Zang

Abstract: We suggest a new game-theory-based ranking method for wines, in which the Shapley Value of each wine is computed, and wines are ranked according to their Shapley Values. Judges should find it simpler to use, since they are not required to rank-order or grade all the wines, but merely to choose the group of those that they find meritorious. Our ranking method is based on the set of reasonable axioms that determine the Shapley Value as the unique solution of an underlying cooperative game. Unlike in the general case, where computing the Shapley Value could be complex, here the Shapley Value, and hence the final ranking, is straightforward to compute. (JEL Classification: C71, D71, D78)
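The abstract does not spell out the underlying cooperative game, so the following is only a minimal sketch under an assumed one: a coalition of wines is worth the number of judges who approved at least one wine in it. The ballots, wine labels, and function names are hypothetical. The Shapley values are computed exactly by averaging marginal contributions over all orderings, which is feasible only for a handful of wines.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical ballots: each judge names the wines they find meritorious.
ballots = [{"A", "B"}, {"A"}, {"B", "C"}]


def approval_value(coalition):
    # Assumed game (not taken from the abstract): count the judges who
    # approved at least one wine in the coalition.
    return sum(1 for ballot in ballots if ballot & coalition)


values = shapley_values(["A", "B", "C"], approval_value)
ranking = sorted(values, key=values.get, reverse=True)
print(values, ranking)
```

Under this assumed game each judge's single vote is split equally among the wines on their ballot, so wine A scores 1/2 + 1, wine B scores 1/2 + 1/2, and wine C scores 1/2. That closed form is one way the computation could become straightforward, as the abstract says it does.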


2021 ◽  
Author(s):  
Louise Bloch ◽  
Christoph M. Friedrich

Abstract Background: Predicting whether subjects with Mild Cognitive Impairment (MCI) will go on to develop Alzheimer's Disease (AD) is important for the recruitment and monitoring of subjects for therapy studies. Machine Learning (ML) is suitable to improve early AD prediction. The etiology of AD is heterogeneous, which leads to noisy data sets. Additional noise is introduced by multicentric study designs and varying acquisition protocols. This article examines whether an automatic and fair data valuation method based on Shapley values can identify subjects with noisy data. Methods: An ML workflow was developed and trained on a subset of the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Validation was performed on an independent ADNI test data set and on the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) cohort. The workflow included volumetric Magnetic Resonance Imaging (MRI) feature extraction, subject sample selection using data Shapley, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost) for model training, and Kernel SHapley Additive exPlanations (SHAP) values for model interpretation. This model interpretation enables clinically relevant explanation of individual predictions. Results: On the independent ADNI test data set, the models trained on the entire training data set reached a mean classification accuracy of 58.54 %. The XGBoost models that excluded 116 of the 467 subjects from the training data set based on their Logistic Regression (LR) data Shapley values outperformed them by 14.13 % (8.27 percentage points). On the AIBL data set, the XGBoost models trained on the entire training data set reached a mean accuracy of 60.35 %. An improvement of 24.86 % (15.00 percentage points) was reached for the XGBoost models when the 72 subjects with the smallest RF data Shapley values were excluded from the training data set.
Conclusion: The data Shapley method was able to improve the classification accuracies for the test data sets. Noisy data was associated with the number of ApoE ε4 alleles and volumetric MRI measurements. Kernel SHAP showed that the black-box models learned biologically plausible associations.
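The Results mix relative improvements with percentage points. The short sketch below shows how the two relate, using only the figures quoted in the abstract. The function name is hypothetical, and the improved accuracies it derives are implied by those figures rather than stated by the authors.

```python
def improved_accuracy(baseline_pct, relative_gain_pct):
    """Return (gain in percentage points, improved accuracy in percent)."""
    gain_points = baseline_pct * relative_gain_pct / 100
    return gain_points, baseline_pct + gain_points


# Baseline accuracy and relative improvement as quoted in the abstract.
adni_gain, adni_improved = improved_accuracy(58.54, 14.13)
aibl_gain, aibl_improved = improved_accuracy(60.35, 24.86)

print(f"ADNI: +{adni_gain:.2f} points")  # matches the quoted 8.27 points
print(f"AIBL: +{aibl_gain:.2f} points")  # matches the quoted 15.00 points
```

A relative gain of 14.13 % on a 58.54 % baseline is 58.54 × 0.1413 ≈ 8.27 percentage points, which is why the two numbers in each pair differ.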


2021 ◽  
Vol 4 ◽  
Author(s):  
Mustafa Y. Topaloglu ◽  
Elisabeth M. Morrell ◽  
Suraj Rajendran ◽  
Umit Topaloglu

Artificial Intelligence and its subdomain, Machine Learning (ML), have shown the potential to make an unprecedented impact in healthcare. Federated Learning (FL) has been introduced to alleviate some of the limitations of ML, particularly the difficulty of training on larger datasets for improved performance, which is usually cumbersome in inter-institutional collaborations because of existing patient protection laws and regulations. Moreover, FL may also play a crucial role in circumventing ML’s exigent bias problem by accessing underrepresented groups’ data spanning geographically distributed locations. In this paper, we discuss three FL challenges, namely: privacy of the model exchange, ethical perspectives, and legal considerations. Lastly, we propose a model that could aid in assessing the data contributions of an FL implementation. In light of the expediency and adaptability of the Sørensen–Dice Coefficient compared with the more limited (e.g., horizontal FL) and computationally expensive Shapley Values, we sought to demonstrate a new paradigm that we hope will become invaluable for sharing any profit and responsibilities that may accompany an FL endeavor.
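For reference, the Sørensen–Dice coefficient of two sets A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for two hypothetical sets of record identifiers. The abstract does not say how the authors apply the coefficient to FL contributions, so the inputs and names here are illustrative assumptions only.

```python
def dice_coefficient(a, b):
    """Sørensen–Dice coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical record identifiers held by two participating sites.
site_a = {1, 2, 3, 4}
site_b = {3, 4, 5, 6}

overlap = dice_coefficient(site_a, site_b)
print(overlap)
```

The coefficient needs only one set intersection per pair, whereas exact Shapley values require evaluating a number of coalitions that grows exponentially with the number of participants. That difference is consistent with the cost comparison drawn in the abstract.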


2021 ◽  
Author(s):  
Takayuki Scmitsu ◽  
Mitsuki Nakamura ◽  
Shotaro Ishigami ◽  
Toru Aoki ◽  
Teng-Yok Lee ◽  
...  
