hierarchical mixture of experts
Recently Published Documents


Total documents: 22 (five years: 5)
H-index: 6 (five years: 1)

Author(s):  
Mengqi Xue ◽  
Jie Song ◽  
Xinchao Wang ◽  
Ying Chen ◽  
Xingen Wang ◽  
...  

Knowledge distillation (KD) has recently emerged as an effective scheme for learning compact deep neural networks (DNNs). Despite the promising results achieved, the rationale behind the behavior of KD remains largely understudied. In this paper, we introduce a novel task-oriented attention model, termed KDExplainer, to shed light on the working mechanism underlying vanilla KD. At the heart of KDExplainer is a Hierarchical Mixture of Experts (HME), in which multi-class classification is reformulated as a multi-task binary one. By distilling knowledge from a free-form pre-trained DNN to KDExplainer, we observe that KD implicitly modulates the knowledge conflicts between different subtasks, and in reality has much more to offer than label smoothing. Based on these findings, we further introduce a portable tool, dubbed the virtual attention module (VAM), that can be seamlessly integrated with various DNNs to enhance their performance under KD. Experimental results demonstrate that, at negligible additional cost, student models equipped with VAM consistently outperform their non-VAM counterparts across different benchmarks. Furthermore, when combined with other KD methods, VAM still improves results, even though it is motivated only by vanilla KD. The code is available at https://github.com/zju-vipa/KDExplainer.
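For readers unfamiliar with the HME named in the abstract above, the following is a minimal sketch of a generic two-level hierarchical mixture of experts for a single binary subtask. It is not the KDExplainer architecture, which the abstract does not specify. The function names (`hme_predict`, `random_weights`), the use of linear gates and logistic experts, and the random weights are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hme_predict(x, top_gate, branch_gates, experts):
    """Return P(y = 1 | x) from a two-level HME (illustrative only).

    top_gate:     one weight vector per branch (softmax gate at the root)
    branch_gates: per branch, one weight vector per leaf (nested softmax gate)
    experts:      per branch and leaf, weights of a logistic expert
    """
    g_top = softmax([dot(w, x) for w in top_gate])
    prob = 0.0
    for i, g_i in enumerate(g_top):
        g_leaf = softmax([dot(w, x) for w in branch_gates[i]])
        for j, g_ij in enumerate(g_leaf):
            mu_ij = sigmoid(dot(experts[i][j], x))  # expert's own prediction
            prob += g_i * g_ij * mu_ij              # weight by both gates
    return prob


def random_weights(rng, dim):
    # Hypothetical initialisation; a real model would learn these weights.
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
dim, n_branches, n_leaves = 3, 2, 2
top_gate = [random_weights(rng, dim) for _ in range(n_branches)]
branch_gates = [[random_weights(rng, dim) for _ in range(n_leaves)]
                for _ in range(n_branches)]
experts = [[random_weights(rng, dim) for _ in range(n_leaves)]
           for _ in range(n_branches)]

x = [0.5, -1.2, 1.0]
p = hme_predict(x, top_gate, branch_gates, experts)
print(p)  # a probability strictly between 0 and 1
```

Because the gate weights at each level sum to one, the output is a convex combination of the expert predictions. Under the multi-task binary reformulation mentioned in the abstract, one such binary output per class would be a plausible reading, but that mapping is an assumption here.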


2021 ◽  
Vol 419 ◽  
pp. 148-156 ◽  
Author(s):  
Ozan İrsoy ◽  
Ethem Alpaydın

2019 ◽  
Vol 5 (1) ◽  
Author(s):  
Yuma Iwasaki ◽  
Ryohto Sawada ◽  
Valentin Stanev ◽  
Masahiko Ishida ◽  
Akihiro Kirihara ◽  
...  

Machine learning is becoming a valuable tool for scientific discovery. Particularly attractive is the application of machine learning methods to the field of materials development, which enables innovations by discovering new and better functional materials. To apply machine learning to actual materials development, close collaboration between scientists and machine learning tools is necessary. However, such collaboration has so far been impeded by the black-box nature of many machine learning algorithms. It is often difficult for scientists to interpret the data-driven models from the viewpoint of materials science and physics. Here, we demonstrate the development of spin-driven thermoelectric materials with anomalous Nernst effect by using an interpretable machine learning method called factorized asymptotic Bayesian inference hierarchical mixture of experts (FAB/HMEs). Based on prior knowledge of materials science and physics, we were able to extract from the interpretable machine learning model some surprising correlations and new knowledge about spin-driven thermoelectric materials. Guided by this, we carried out an actual material synthesis that led to the identification of a novel spin-driven thermoelectric material. This material shows the largest thermopower to date.

