A Supervised Learning Method for Improving the Generalization of Speaker Verification Systems by Learning Metrics from a Mean Teacher

2021 ◽  
Vol 12 (1) ◽  
pp. 76
Author(s):  
Ju-Ho Kim ◽  
Hye-Jin Shim ◽  
Jee-Weon Jung ◽  
Ha-Jin Yu

The majority of recent speaker verification tasks are studied under open-set evaluation scenarios that reflect real-world conditions. The characteristics of these tasks imply that generalization to unseen speakers is a critical capability. This study therefore aims to improve speaker verification performance by improving the generalization of the system. To achieve this goal, we propose a novel supervised learning method for speaker verification that uses the mean teacher framework. The mean teacher network is a temporal average of deep neural network parameters; it can produce more accurate and stable representations than the fixed weights obtained at the end of training, and it is conventionally used for semi-supervised learning. Leveraging the success of the mean teacher framework in many studies, the proposed supervised learning method exploits the mean teacher network as an auxiliary model for better training of the main model, the student network. By learning the reliable intermediate representations derived from the mean teacher network as well as one-hot speaker labels, the student network is encouraged to explore more discriminative embedding spaces. The experimental results demonstrate that the proposed method reduces the equal error rate by a relative 11.61% compared to the baseline system.
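The temporal averaging described in this abstract is commonly implemented as an exponential moving average (EMA) of the student weights. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the decay value or the distance used between student and teacher representations, so `decay`, `ema_update`, and the mean-squared `consistency_loss` are illustrative choices, and weights are plain lists of floats.

```python
def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved toward the student weights.

    Each teacher weight becomes decay * teacher + (1 - decay) * student.
    The decay value is an assumption; the abstract does not state one.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def consistency_loss(student_repr, teacher_repr):
    """Mean squared difference between two representations.

    Stand-in for the metric the student learns from the teacher;
    the actual distance used in the paper is not given in the abstract.
    """
    n = len(student_repr)
    return sum((s - t) ** 2 for s, t in zip(student_repr, teacher_repr)) / n


# Toy run: the teacher drifts toward a fixed student over three steps.
teacher = [0.0, 0.0]
student = [1.0, -1.0]
for _ in range(3):
    teacher = ema_update(teacher, student, decay=0.5)
```

In a supervised setup of the kind the abstract describes, the student would be trained on the sum of a classification loss over the one-hot speaker labels and a term like `consistency_loss`, while the teacher is updated only through `ema_update` after each training step.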

Author(s):  
Eko Widoyo Putro ◽  
Berlin Sibarani

This study aimed to improve second-grade students’ speaking achievement by using the Community Language Learning (CLL) method. The research was conducted as classroom action research. The subjects were the second-grade students of the Private Senior High School (Sekolah Menengah Atas Swasta) Dwi Tunggal Tanjung Morawa, a class of 31 students. The data were collected with a primary instrument (a speaking test) and secondary instruments (interview sheet, observation sheet, field notes). The improvement can be seen in the scores of Tests I, II and III: the mean score was 64.77 in Test I, 71.35 in Test II, and 80.90 in Test III. The interviews and observation sheets show that the students’ expression and excitement improved as well. It was found that teaching speaking with Community Language Learning could significantly improve students’ speaking achievement.
Key Words: Community Language Learning, Method, Improvement, Speaking Achievement


2020 ◽  
Author(s):  
Galina Lavrentyeva ◽  
Marina Volkova ◽  
Anastasia Avdeeva ◽  
Sergey Novoselov ◽  
Artem Gorlanov ◽  
...  

2020 ◽  
Author(s):  
Xu Zheng ◽  
Yan Song ◽  
Jie Yan ◽  
Li-Rong Dai ◽  
Ian McLoughlin ◽  
...  

Author(s):  
Khamis A. Al-Karawi

Background & Objective: Speaker Recognition (SR) techniques have reached a relatively mature state over the past few decades. Existing methods typically use robust features extracted from clean speech signals and can therefore achieve very high recognition accuracy in idealized conditions. For critical applications, such as security and forensics, robustness and reliability of the system are crucial. Methods: Background noise and reverberation, which often occur in real-world applications, are known to compromise recognition performance. To improve the performance of speaker verification systems, an effective and robust feature extraction technique is proposed that can operate in both clean and noisy conditions. Mel Frequency Cepstral Coefficients (MFCCs) and Gammatone Frequency Cepstral Coefficients (GFCCs) are mature techniques and the most common features used for speaker recognition. MFCCs are calculated from the log energies in frequency bands distributed over a mel scale, whereas GFCCs are obtained from a bank of Gammatone filters, which was originally suggested to model human cochlear filtering. This paper investigates the performance of GFCC and conventional MFCC features in clean and noisy conditions. The effects of the Signal-to-Noise Ratio (SNR) and language mismatch on system performance are taken into account. Conclusion: Experimental results show a significant improvement in system performance in terms of reduced equal error rate and detection error trade-off. Recognition rates under various types of noise and various SNRs were quantified via simulation. Results of the study are also presented and discussed.
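To make the phrase "frequency bands distributed over a mel scale" concrete, here is a minimal sketch of how band edges can be spaced evenly in mel and mapped back to Hz. It assumes the widely used formula mel = 2595 log10(1 + f/700); other mel variants exist and the abstract does not say which one the paper uses. The function names and the `floor` parameter are illustrative.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel with the 2595 * log10(1 + f / 700) variant (assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min, f_max, n_bands):
    """Return n_bands + 2 edge frequencies (Hz), equally spaced in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


def log_band_energies(band_energies, floor=1e-10):
    """Natural log of per-band energies, floored to avoid log(0)."""
    return [math.log(max(e, floor)) for e in band_energies]


edges = mel_band_edges(0.0, 8000.0, 4)
```

Because the spacing is uniform in mel, the bands are narrow at low frequencies and widen toward high frequencies. An MFCC pipeline would typically follow the log energies with a discrete cosine transform, which is omitted here.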


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 403
Author(s):  
Xun Zhang ◽  
Lanyan Yang ◽  
Bin Zhang ◽  
Ying Liu ◽  
Dong Jiang ◽  
...  

The problem of extracting meaningful data through graph analysis spans a range of different fields, such as social networks, knowledge graphs, citation networks, the World Wide Web, and so on. As increasingly structured data become available, the importance of being able to effectively mine and learn from such data continues to grow. In this paper, we propose the multi-scale aggregation graph neural network based on feature similarity (MAGN), a novel graph neural network defined in the vertex domain. Our model provides a simple and general semi-supervised learning method for graph-structured data, in which only a very small part of the data is labeled as the training set. We first construct a similarity matrix by calculating the similarity of original features between all adjacent node pairs, and then generate a set of feature extractors utilizing the similarity matrix to perform multi-scale feature propagation on graphs. The output of multi-scale feature propagation is finally aggregated by using the mean-pooling operation. Our method aims to improve the model representation ability via multi-scale neighborhood aggregation based on feature similarity. Extensive experimental evaluation on various open benchmarks shows the competitive performance of our method compared to a variety of popular architectures.
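The pipeline outlined in this abstract (similarity matrix over adjacent node pairs, multi-scale propagation, mean-pooling) can be sketched as follows. This is a rough illustration under several assumptions that the abstract does not confirm: cosine similarity as the feature similarity, negative similarities clipped to zero, self-loops with weight 1, row normalization, and repeated application of the normalized matrix as the "scales". All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def similarity_matrix(features, edges):
    """Row-normalized similarity matrix restricted to adjacent node pairs."""
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        sim[i][i] = 1.0  # self-loop (assumption)
    for i, j in edges:
        s = max(0.0, cosine(features[i], features[j]))  # clip (assumption)
        sim[i][j] = sim[j][i] = s
    return [[x / sum(row) for x in row] for row in sim]


def propagate(sim, feats):
    """One propagation step: matrix product sim @ feats."""
    n, d = len(feats), len(feats[0])
    return [[sum(sim[i][k] * feats[k][c] for k in range(n)) for c in range(d)]
            for i in range(n)]


def multi_scale_mean_pool(features, edges, num_scales=3):
    """Propagate 1..num_scales times and average the results per node."""
    sim = similarity_matrix(features, edges)
    current, scales = features, []
    for _ in range(num_scales):
        current = propagate(sim, current)
        scales.append(current)
    n, d = len(features), len(features[0])
    return [[sum(s[i][c] for s in scales) / num_scales for c in range(d)]
            for i in range(n)]


features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
edges = [(0, 1), (1, 2)]
pooled = multi_scale_mean_pool(features, edges)
```

In this toy graph, node 2 is adjacent to node 1 but has orthogonal features, so the similarity weighting keeps its representation separate from the other two. In the full model the pooled output would feed a trainable classifier, which is outside the scope of this sketch.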


Mathematics ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 779
Author(s):  
Ruriko Yoshida

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. Then we discuss its application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
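As an illustration of KNN under the tropical metric, here is a minimal sketch. It uses the standard definition of the tropical distance on the tropical projective torus, d(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i); the majority-vote classifier and the names `tropical_distance` and `tropical_knn` are illustrative and not taken from the paper.

```python
from collections import Counter


def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_knn(query, points, labels, k=3):
    """Label the query by majority vote among its k tropically nearest points.

    Ties are resolved by Counter.most_common ordering (an arbitrary choice).
    """
    ranked = sorted(range(len(points)),
                    key=lambda i: tropical_distance(query, points[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


points = [[0, 0, 0], [0, 1, 0], [0, 5, 5], [0, 6, 5]]
labels = ["a", "a", "b", "b"]
predicted = tropical_knn([2, 2, 3], points, labels, k=3)
```

Adding the same constant to every coordinate of a vector leaves the distance unchanged, which reflects that points of the tropical projective torus are defined up to such shifts. The k nearest points are exactly those contained in the smallest tropical ball around the query that holds k vectors.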

