Wasserstein-Distance-Based Multi-Source Adversarial Domain Adaptation for Emotion Recognition and Vigilance Estimation

Speech emotion recognition (SER) is a natural method of recognizing individual emotions in everyday life. To deploy SER models in real-world applications, some key challenges must be overcome, such as the lack of datasets tagged with emotion labels and the weak generalization of SER models to unseen target domains. This study proposes a multi-path and group-loss-based network (MPGLN) for SER to support multi-domain adaptation. The proposed model includes a bidirectional long short-term memory-based temporal feature generator and a feature extractor transferred from the pre-trained VGG-like audio classification model (VGGish), and it learns from multiple losses simultaneously according to the association of emotion labels in the discrete and dimensional models. To evaluate the MPGLN SER on multi-cultural domain datasets, the Korean Emotional Speech Database (KESD), including KESDy18 and KESDy19, is constructed, and the English-speaking Interactive Emotional Dyadic Motion Capture database (IEMOCAP) is used. In the evaluation of multi-domain adaptation and domain generalization, the MPGLN SER improved the F1 score by 3.7% and 3.5%, respectively, over a baseline SER model that uses only a temporal feature generator. We show that the MPGLN SER efficiently supports multi-domain adaptation and reinforces model generalization.

Spatial-aware Network using Wasserstein Distance for Unsupervised Domain Adaptation

2020 Chinese Automation Congress (CAC) ◽

10.1109/cac51589.2020.9326987 ◽

2020 ◽

Author(s):

Liu Long ◽

Luo Bin ◽

Fan Jiang

Keyword(s):

Domain Adaptation ◽

Wasserstein Distance ◽

Unsupervised Domain Adaptation

Cross-corpus Speech Emotion Recognition based on Few-shot Learning and Domain Adaptation

IEEE Signal Processing Letters ◽

10.1109/lsp.2021.3086395 ◽

2021 ◽

pp. 1-1

Author(s):

Youngdo Ahn ◽

Sung Joo Lee ◽

Jong Won Shin

Keyword(s):

Emotion Recognition ◽

Domain Adaptation ◽

Speech Emotion Recognition

Universum Autoencoder-Based Domain Adaptation for Speech Emotion Recognition

IEEE Signal Processing Letters ◽

10.1109/lsp.2017.2672753 ◽

2017 ◽

Vol 24 (4) ◽

pp. 500-504 ◽

Cited By ~ 37

Author(s):

Jun Deng ◽

Xinzhou Xu ◽

Zixing Zhang ◽

Sascha Fruhholz ◽

Bjorn Schuller

Keyword(s):

Emotion Recognition ◽

Domain Adaptation ◽

Speech Emotion Recognition

Acted vs. Improvised: Domain Adaptation for Elicitation Approaches in Audio-Visual Emotion Recognition

10.21437/interspeech.2021-666 ◽

2021 ◽

Author(s):

Haoqi Li ◽

Yelin Kim ◽

Cheng-Hao Kuo ◽

Shrikanth S. Narayanan

Keyword(s):

Emotion Recognition ◽

Domain Adaptation

Wasserstein Distance-Based Domain Adaptation and Its Application to Road Segmentation

10.1109/ijcnn52387.2021.9534121 ◽

2021 ◽

Author(s):

Seita Kono ◽

Takaya Ueda ◽

Enrique Arriaga-Varela ◽

Ikuko Nishikawa

Keyword(s):

Domain Adaptation ◽

Wasserstein Distance ◽

Road Segmentation

Wasserstein distance based Asymmetric Adversarial Domain Adaptation in intelligent bearing fault diagnosis

Measurement Science and Technology ◽

10.1088/1361-6501/ac0a0c ◽

2021 ◽

Author(s):

Yu Ying ◽

Zhao Jun ◽

Tang Tang ◽

Wang Jingwei ◽

Chen Ming ◽

...

Keyword(s):

Fault Diagnosis ◽

Domain Adaptation ◽

Wasserstein Distance ◽

Bearing Fault ◽

Bearing Fault Diagnosis

Optimal Transport with Dimensionality Reduction for Domain Adaptation

Symmetry ◽

10.3390/sym12121994 ◽

2020 ◽

Vol 12 (12) ◽

pp. 1994

Author(s):

Ping Li ◽

Zhiwei Ni ◽

Xuhui Zhu ◽

Juan Song ◽

Wenying Wu

Keyword(s):

Dimensionality Reduction ◽

Optimal Transport ◽

Domain Adaptation ◽

Wasserstein Distance ◽

Local Information ◽

Target Domain ◽

Source Domain ◽

Second Stage ◽

Cross Domain ◽

Feature Based

Domain adaptation aims to learn a robust classifier for the target domain using the source domain, but the two domains often follow different distributions. To bridge the distribution shift between the two domains, most previous works align their feature distributions through feature transformation. Among these, optimal transport for domain adaptation has attracted researchers' interest, as it can exploit the local information of the two domains while mapping the source instances to the target ones by minimizing the Wasserstein distance between their feature distributions. However, it may weaken the feature discriminability of the source domain and thus degrade domain adaptation performance. To address this problem, this paper proposes a two-stage feature-based adaptation approach, referred to as optimal transport with dimensionality reduction (OTDR). In the first stage, we apply dimensionality reduction with intradomain variance maximization and source intraclass compactness minimization, to separate data samples as much as possible and enhance the feature discriminability of the source domain. In the second stage, we leverage an optimal transport-based technique to preserve the local information of the two domains. Notably, the desirable properties obtained in the first stage can mitigate the degradation of feature discriminability of the source domain in the second stage. Extensive experiments on several cross-domain image datasets validate that OTDR is superior to its competitors in classification accuracy.
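To make the quantity being minimized concrete, the following is a minimal sketch of an empirical Wasserstein-1 distance between source and target features. It is not the OTDR method: it assumes one-dimensional features and equal sample sizes, in which case the optimal transport plan simply pairs the sorted samples. The function name `wasserstein_1d` and the example values are hypothetical.

```python
def wasserstein_1d(source, target):
    """Empirical Wasserstein-1 distance between two equal-sized 1-D samples.

    Assumption for this sketch: both samples have the same size, so each
    point carries the same mass and the optimal plan is a one-to-one match.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    if not source:
        raise ValueError("samples must be non-empty")
    # In one dimension, the optimal plan matches the i-th smallest source
    # value to the i-th smallest target value.
    pairs = zip(sorted(source), sorted(target))
    return sum(abs(s - t) for s, t in pairs) / len(source)


# Hypothetical 1-D features: the target is the source shifted by 1.5.
source_features = [0.0, 1.0, 2.0]
target_features = [1.5, 2.5, 3.5]
print(wasserstein_1d(source_features, target_features))  # 1.5
```

For multi-dimensional features the sorted matching no longer applies, and a general optimal transport solver would be needed instead.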

Multi-Source Distilling Domain Adaptation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6997 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12975-12983

Author(s):

Sicheng Zhao ◽

Guangzhi Wang ◽

Shanghang Zhang ◽

Yang Gu ◽

Yaxian Li ◽

...

Keyword(s):

Domain Adaptation ◽

Feature Space ◽

Wasserstein Distance ◽

Training Data ◽

Source Distribution ◽

Single Source ◽

Multiple Sources ◽

Target Domain ◽

Target Feature ◽

Training Samples

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and the unlabeled target domain, which motivates research on domain adaptation (DA). Conventional DA methods usually assume that the labeled data is sampled from a single source distribution. However, in practice, labeled data may be collected from multiple sources, and naive application of single-source DA algorithms may lead to suboptimal solutions. In this paper, we propose a novel multi-source distilling domain adaptation (MDDA) network, which not only considers the different distances among multiple sources and the target, but also investigates the different similarities of the source samples to the target ones. Specifically, the proposed MDDA includes four stages: (1) pre-train the source classifiers separately using the training data from each source; (2) adversarially map the target into the feature space of each source respectively by minimizing the empirical Wasserstein distance between source and target; (3) select the source training samples that are closer to the target to fine-tune the source classifiers; and (4) classify each encoded target feature with the corresponding source classifier, and aggregate the different predictions using the respective domain weights, which correspond to the discrepancy between each source and the target. Extensive experiments are conducted on public DA benchmarks, and the results demonstrate that the proposed MDDA significantly outperforms the state-of-the-art approaches. Our source code is released at: https://github.com/daoyuan98/MDDA.
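Stage (4) can be illustrated with a short sketch of discrepancy-weighted aggregation. The abstract only states that each domain weight corresponds to the discrepancy between a source and the target; the exponential weighting below, and the names `domain_weights` and `aggregate_predictions`, are assumptions made for illustration and are not taken from the MDDA paper or its repository.

```python
import math


def domain_weights(distances):
    """Map per-source discrepancies to weights that sum to 1.

    Assumption: a smaller source-target distance should yield a larger
    weight; exp(-d) is one simple choice, not necessarily the paper's.
    """
    scores = [math.exp(-d) for d in distances]
    total = sum(scores)
    return [s / total for s in scores]


def aggregate_predictions(probs_per_source, distances):
    """Weighted average of the class probabilities from each source classifier."""
    weights = domain_weights(distances)
    n_classes = len(probs_per_source[0])
    combined = [0.0] * n_classes
    for weight, probs in zip(weights, probs_per_source):
        for k, p in enumerate(probs):
            combined[k] += weight * p
    return combined


# Hypothetical example: source 0 is much closer to the target than source 1,
# so its prediction (favoring class 0) dominates the aggregate.
probs = [[0.9, 0.1], [0.2, 0.8]]
dists = [0.1, 2.0]
combined = aggregate_predictions(probs, dists)
print(combined.index(max(combined)))  # 0
```

With equal distances the weights are uniform, and the aggregate reduces to a plain average of the source predictions.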

Domain Adaptation for EEG Emotion Recognition Based on Latent Representation Similarity

IEEE Transactions on Cognitive and Developmental Systems ◽

10.1109/tcds.2019.2949306 ◽

2020 ◽

Vol 12 (2) ◽

pp. 344-353 ◽

Cited By ~ 6

Author(s):

Jinpeng Li ◽

Shuang Qiu ◽

Changde Du ◽

Yixin Wang ◽

Huiguang He

Keyword(s):

Emotion Recognition ◽

Domain Adaptation
