A study of term weighting in phonotactic approach to spoken language recognition


10.21437/interspeech.2010-719 ◽

2010 ◽

Author(s):

Sirinoot Boonsuk ◽

Donglai Zhu ◽

Bin Ma ◽

Atiwong Suchato ◽

Proadpran Punyabukkana ◽

...

Keyword(s):

Spoken Language ◽

Language Recognition


Sequence Summarizing Neural Networks for Spoken Language Recognition

10.21437/interspeech.2016-764 ◽

2016 ◽

Author(s):

Jan Pešán ◽

Lukáš Burget ◽

Jan Černocký

Keyword(s):

Neural Networks ◽

Spoken Language ◽

Language Recognition


Stacked Long-Term TDNN for Spoken Language Recognition

10.21437/interspeech.2016-1334 ◽

2016 ◽

Author(s):

Daniel Garcia-Romero ◽

Alan McCree

Keyword(s):

Spoken Language ◽

Language Recognition


Spoken Language Recognition With Prosodic Features

IEEE Transactions on Audio Speech and Language Processing ◽

10.1109/tasl.2013.2260157 ◽

2013 ◽

Vol 21 (9) ◽

pp. 1841-1853 ◽

Author(s):

Raymond W. M. Ng ◽

Tan Lee ◽

Cheung-Chi Leung ◽

Bin Ma ◽

Haizhou Li

Keyword(s):

Spoken Language ◽

Prosodic Features ◽

Language Recognition


Soft margin estimation of Gaussian mixture model parameters for spoken language recognition

2010 IEEE International Conference on Acoustics, Speech and Signal Processing ◽

10.1109/icassp.2010.5495079 ◽

2010 ◽

Author(s):

Donglai Zhu ◽

Bin Ma ◽

Haizhou Li

Keyword(s):

Gaussian Mixture Model ◽

Mixture Model ◽

Gaussian Mixture ◽

Spoken Language ◽

Model Parameters ◽

Language Recognition


Optimizing the Performance of Spoken Language Recognition With Discriminative Training

IEEE Transactions on Audio Speech and Language Processing ◽

10.1109/tasl.2008.2005319 ◽

2008 ◽

Vol 16 (8) ◽

pp. 1642-1653 ◽

Author(s):

Donglai Zhu ◽

Haizhou Li ◽

Bin Ma ◽

Chin-Hui Lee

Keyword(s):

Spoken Language ◽

Discriminative Training ◽

Language Recognition


Dimensionality reduction of phone log-likelihood ratio features for spoken language recognition

10.21437/interspeech.2013-39 ◽

2013 ◽

Author(s):

Mireia Diez ◽

Amparo Varona ◽

Mikel Penagarikano ◽

Luis Javier Rodríguez-Fuentes ◽

Germán Bordel

Keyword(s):

Dimensionality Reduction ◽

Likelihood Ratio ◽

Spoken Language ◽

Language Recognition ◽

Log Likelihood ◽

Log Likelihood Ratio


Investigation of Spoken-Language Detection and Classification in Broadcasted Audio Content

Information ◽

10.3390/info11040211 ◽

2020 ◽

Vol 11 (4) ◽

pp. 211 ◽

Author(s):

Rigas Kotsakis ◽

Maria Matsiola ◽

George Kalliris ◽

Charalampos Dimoulas

Keyword(s):

Data Augmentation ◽

Spoken Language ◽

Language Recognition ◽

Adaptive Classification ◽

Audio Recordings ◽

Generic Language ◽

Media Monitoring ◽

Language Detection ◽

Specific Radio ◽

Language Classification

This paper investigates spoken-language classification in audio broadcasting content. The approach reflects a real-world scenario, encountered in modern media monitoring organizations, where semi-automated indexing and documentation is deployed and could be facilitated by the proposed language detection preprocessing. Multilingual audio recordings of specific radio streams are formed into a small dataset, which is used for the adaptive classification experiments, without seeking, at this step, a generic language recognition model. Specifically, hierarchical discrimination schemes are followed to separate voice signals before classifying the spoken languages. Supervised and unsupervised machine learning is utilized at various windowing configurations to test the validity of the hypothesis. Besides the analysis of the achieved recognition scores (partial and overall), late integration models are proposed for semi-automatic annotation of new audio recordings. Hence, data augmentation mechanisms are offered, aiming at gradually formulating a Generic Audio Language Classification Repository. This database constitutes a program-adaptive collection that, besides the self-indexing metadata mechanisms, could facilitate generic language classification models in the future through state-of-the-art techniques such as deep learning. This approach matches the investigatory inception of the project, which seeks indicators that could be applied in a second step with a larger dataset and/or an already pre-trained model, with the purpose of delivering overall results.


Improved Conditional Generative Adversarial Net Classification For Spoken Language Recognition

2018 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt.2018.8639522 ◽

2018 ◽

Author(s):

Xiaoxiao Miao ◽

Ian McLoughlin ◽

Shengyu Yao ◽

Yonghong Yan

Keyword(s):

Spoken Language ◽

Language Recognition


Spoken language processing techniques for sign language recognition and translation

Technology and Disability ◽

10.3233/tad-2008-20207 ◽

2008 ◽

Vol 20 (2) ◽

pp. 121-133 ◽

Author(s):

Philippe Dreuw ◽

Daniel Stein ◽

Thomas Deselaers ◽

David Rybach ◽

Morteza Zahedi ◽

...

Keyword(s):

Sign Language ◽

Language Processing ◽

Spoken Language ◽

Language Recognition ◽

Sign Language Recognition ◽

Spoken Language Processing ◽

Processing Techniques


Maximal Figure-of-Merit Framework to Detect Multi-Label Phonetic Features for Spoken Language Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing ◽

10.1109/taslp.2020.2964953 ◽

2020 ◽

Vol 28 ◽

pp. 682-695

Author(s):

Ivan Kukanov ◽

Trung Ngo Trong ◽

Ville Hautamaki ◽

Sabato Marco Siniscalchi ◽

Valerio Mario Salerno ◽

...

Keyword(s):

Figure Of Merit ◽

Spoken Language ◽

Language Recognition ◽

Phonetic Features
