Nonparametric Speaker Recognition Method Using Earth Mover's Distance

Abstract: With rapid developments in Internet of Things techniques, smart service applications such as voice-command-based speech recognition and smart care applications such as context-aware emotion recognition will gain much attention and may become a requirement in smart home or office environments. In such intelligent applications, recognizing the identity of a specific member in an indoor space is a crucial issue. In this study, a combined audio-visual identity recognition approach was developed. In this approach, visual information obtained from face detection was incorporated into acoustic Gaussian likelihood calculations for constructing speaker classification trees to significantly enhance the Gaussian mixture model (GMM)-based speaker recognition method. This study considered the privacy of the monitored person and reduced the degree of surveillance. Moreover, the popular Kinect sensor device containing a microphone array was adopted to obtain acoustic voice data from the person. The proposed audio-visual identity recognition approach deploys only two cameras in a specific indoor space for conveniently performing face detection and quickly determining the total number of people in the space. This head count was used to regulate the design of the GMM speaker classification tree. Two face-detection-regulated speaker classification tree schemes are presented for the GMM speaker recognition method: the binary speaker classification tree (GMM-BT) and the non-binary speaker classification tree (GMM-NBT). The proposed GMM-BT and GMM-NBT methods achieve identity recognition rates of 84.28% and 83%, respectively; both values are higher than the rate of the conventional GMM approach (80.5%). Moreover, because the extremely complex calculations of face recognition used in general audio-visual speaker recognition tasks are not required, the proposed approach is rapid and efficient, with only a slight increase of 0.051 s in the average recognition time.

The Earth Mover’s Distance as a Metric for the Space of Inorganic Compositions

10.26434/chemrxiv.12777566.v1 ◽ 2020

Author(s): Cameron Hargreaves ◽ Matthew Dyer ◽ Michael Gaultois ◽ Vitaliy Kurlin ◽ Matthew J Rosseinsky

Keyword(s): Euclidean Distance ◽ Nearest Neighbor ◽ Nearest Neighbor Search ◽ Inorganic Crystal Structure Database ◽ Earth Mover's Distance ◽ Chemical Similarity ◽ Neighbor Search ◽ The Earth ◽ Binary Compounds

It is a core problem in any field to reliably tell how close two objects are to being the same, and once this relation has been established we can use this information to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented by assigning the ratio of each element in the material to a vector. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately, the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover’s Distance (EMD) for inorganic compositions, a well-defined metric which enables the measure of chemical similarity in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances have greater alignment with chemical understanding than the Euclidean distance, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that with no supervision the use of this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical property, with future applications for nearest neighbor search queries in chemical database retrieval systems and supervised ML techniques.
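As a rough illustration of the idea in this abstract, the sketch below computes a one-dimensional Earth Mover's Distance between two compositions from their element ratios and the absolute distance between elements on a linear scale. The function name `composition_emd` and the positions in `HYPOTHETICAL_SCALE` are assumptions made for this example only; they are not the modified Pettifor values and not the authors' implementation.

```python
# Placeholder element positions on a 1D scale. These numbers are made up
# for illustration and are NOT the modified Pettifor scale.
HYPOTHETICAL_SCALE = {"Na": 1, "K": 2, "Cl": 10, "Br": 11}


def composition_emd(comp_a, comp_b, scale):
    """1D Earth Mover's Distance between two compositions.

    comp_a, comp_b: dicts mapping element symbol -> amount (any units).
    scale: dict mapping element symbol -> position on a 1D element scale.
    """
    def normalise(comp):
        # Convert raw amounts to ratios that sum to 1.
        total = sum(comp.values())
        return {el: amount / total for el, amount in comp.items()}

    a, b = normalise(comp_a), normalise(comp_b)
    # Walk along the scale from left to right.
    elements = sorted(set(a) | set(b), key=lambda el: scale[el])

    emd = 0.0
    carried = 0.0  # signed surplus of mass still to be moved rightwards
    for left, right in zip(elements, elements[1:]):
        carried += a.get(left, 0.0) - b.get(left, 0.0)
        # Moving the surplus across this gap costs |surplus| * gap width.
        emd += abs(carried) * (scale[right] - scale[left])
    return emd


nacl = {"Na": 1, "Cl": 1}
kcl = {"K": 1, "Cl": 1}
kbr = {"K": 1, "Br": 1}
print(composition_emd(nacl, kcl, HYPOTHETICAL_SCALE))
print(composition_emd(nacl, kbr, HYPOTHETICAL_SCALE))
```

In one dimension the optimal transport plan reduces to this running-surplus sum, so no linear-programming solver is needed. With the placeholder positions, NaCl comes out closer to KCl than to KBr, because only half of the mass has to move in the first case.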

Local earth mover's distance and face warping [multimedia object distance measure]

2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763) ◽ 10.1109/icme.2004.1394443 ◽ 2005

Author(s): S.H. Srinivasan

Keyword(s): Distance Measure ◽ Earth Mover's Distance ◽ Object Distance ◽ Multimedia Object

Adaptive Video News Story Tracking based on Earth Mover's Distance

2006 IEEE International Conference on Multimedia and Expo ◽ 10.1109/icme.2006.262709 ◽ 2006 ◽ Cited By ~ 2

Author(s): Mats Uddenfeldt ◽ Keiichiro Hoashi ◽ Kazunori Matsumoto ◽ Fumiaki Sugaya

Keyword(s): News Story ◽ Earth Mover's Distance ◽ Adaptive Video

The Earth Mover's Distance under transformation sets

Proceedings of the Seventh IEEE International Conference on Computer Vision ◽ 10.1109/iccv.1999.790393 ◽ 1999 ◽ Cited By ~ 53

Author(s): S. Cohen ◽ L. Guibas

Keyword(s): Earth Mover's Distance ◽ The Earth

Template selection based superpixel earth mover's distance algorithm for hand gesture recognition

2016 IEEE 13th International Conference on Signal Processing (ICSP) ◽ 10.1109/icsp.2016.7877980 ◽ 2016 ◽ Cited By ~ 1

Author(s): Yongjie Zhang ◽ Chong Wang ◽ Jieyu Zhao ◽ Li Zhang ◽ Shing-Chow Chan

Keyword(s): Gesture Recognition ◽ Hand Gesture Recognition ◽ Hand Gesture ◽ Earth Mover's Distance ◽ Template Selection

A Proposed Speaker Recognition Method Based on Long-Term Voice Features and Fuzzy Logic

Engineering and Technology Journal ◽ 10.30684/etj.v39i1b.343 ◽ 2021 ◽ Vol 39 (1B) ◽ pp. 1-10

Author(s): Iman H. Hadi ◽ Alia K. Abdul-Hassan

Keyword(s): Fuzzy Logic ◽ Speaker Recognition ◽ Recognition Accuracy ◽ Inner Product ◽ Maximum Frequency ◽ Recognition Method ◽ Data Set ◽ Zero Crossing ◽ Zero Crossing Rate

Speaker recognition depends on specific predefined steps, the most important of which are feature extraction and feature matching. In addition, the category of speaker voice features has an impact on the recognition process. The proposed speaker recognition method uses biometric (voice) attributes to recognize the identity of the speaker. The long-term features used were maximum frequency, pitch, and zero crossing rate (ZCR). In the feature matching step, the fuzzy inner product between feature vectors was used to compute the matching value between a claimed speaker's voice utterance and test voice utterances. The experiments were carried out using the ELSDSR data set. They showed a recognition accuracy of 100% for text-dependent speaker recognition.
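For readers unfamiliar with the zero crossing rate named among the features, a minimal sketch of one common definition follows: the fraction of adjacent sample pairs whose signs differ. The function name `zero_crossing_rate` and the convention of treating zero as non-negative are assumptions for this example; the paper's exact definition may differ.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ.

    Zero is treated as non-negative here, which is one common convention.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(samples, samples[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(samples) - 1)


# A signal that flips sign on every sample has the maximum rate.
print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))
```

In practice the rate is usually computed per short frame and then summarised over the utterance; this sketch handles a single frame only.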

Keyword Search over Web Documents Based on Earth Mover’s Distance

Web Information Systems Engineering – WISE 2014 - Lecture Notes in Computer Science ◽ 10.1007/978-3-319-11749-2_20 ◽ 2014 ◽ pp. 256-265

Author(s): Jiangang Ma ◽ Quan Z. Sheng ◽ Lina Yao ◽ Yong Xu ◽ Ali Shemshadi

Keyword(s): Keyword Search ◽ Earth Mover's Distance ◽ Web Documents
