Beam search pruning in speech recognition using a posterior probability-based confidence measure

2004 ◽  
Vol 42 (3-4) ◽  
pp. 409-428 ◽  
Author(s):  
Sherif Abdou ◽  
Michael S. Scordilis

Author(s):  
JOSEPH RAZIK ◽  
ODILE MELLA ◽  
DOMINIQUE FOHR ◽  
JEAN-PAUL HATON

In this paper, we introduce two new confidence measures for large vocabulary speech recognition systems. The major feature of these measures is that they can be computed without waiting for the end of the audio stream. We propose two kinds of confidence measures: frame-synchronous and local. The frame-synchronous measures can be computed as soon as a frame is processed by the recognition engine and are based on a likelihood ratio. The local measures estimate a local posterior probability in the vicinity of the word being analyzed. We evaluated our confidence measures within the framework of the automatic transcription of French broadcast news using the equal error rate (EER) criterion. Our local measures achieved results very close to the best state-of-the-art measure (EER of 23% compared to 22.0%). We then conducted a preliminary experiment to assess the contribution of our confidence measure to improving the comprehension of an automatic transcription for the hearing impaired. We introduced several modalities to highlight words of low confidence in the transcription and showed that these modalities, used with our local confidence measure, improved the comprehension of the automatic transcription.
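The frame-synchronous measure described above can be illustrated with a minimal sketch. This assumes hypothesis scores are available as log-likelihoods for each frame; the function name and interface are hypothetical, not taken from the paper:

```python
import math

def frame_confidence(hyp_loglik, competitor_logliks):
    """Frame-synchronous confidence as a likelihood ratio (illustrative sketch).

    Confidence = p(best hypothesis) / sum over all active hypotheses,
    computed in the log domain via log-sum-exp for numerical stability.
    `hyp_loglik` is the frame log-likelihood under the current best
    hypothesis; `competitor_logliks` holds the frame log-likelihoods of
    the other hypotheses still active in the beam.
    """
    all_logs = [hyp_loglik] + list(competitor_logliks)
    m = max(all_logs)  # shift by the max to avoid overflow in exp()
    log_denominator = m + math.log(sum(math.exp(x - m) for x in all_logs))
    return math.exp(hyp_loglik - log_denominator)
```

Because it depends only on scores already available when the frame is decoded, such a ratio can be emitted immediately, which is the property the abstract highlights for streaming use.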


2005 ◽  
Author(s):  
Abdulvohid Bozarov ◽  
Yoshinori Sagisaka ◽  
Ruiqiang Zhang ◽  
Genichiro Kikui

2011 ◽  
Vol 19 (1) ◽  
pp. 33-60 ◽  
Author(s):  
CHRISTOPH TILLMANN ◽  
SANJIKA HEWAVITHARANA

Abstract: The paper presents a novel unified algorithm for aligning sentences with their translations in bilingual data. With the help of ideas from a stack-based dynamic programming decoder for speech recognition (Ney 1984), the search is parametrized in a novel way such that the unified algorithm can be used on various types of data that have been previously handled by separate implementations: the extracted text chunk pairs can be either sub-sentential pairs, one-to-one, or many-to-many sentence-level pairs. The one-stage search algorithm is carried out in a single run over the data. Its memory requirements are independent of the length of the source document, and it is applicable to sentence-level parallel as well as comparable data. With the help of a unified beam-search candidate pruning, the algorithm is very efficient: it avoids any document-level pre-filtering and uses less restrictive sentence-level filtering. Results are presented on a Russian–English, a Spanish–English, and an Arabic–English extraction task. Based on simple word-based scoring features, text chunk pairs are extracted out of several trillion candidates, where the search is carried out on 300 processors in parallel.
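The beam-search candidate pruning mentioned above can be sketched in its simplest form: at each search step, keep only candidates whose score falls within a fixed margin of the current best. This is a generic sketch under that assumption; the helper name and score representation are hypothetical, not the paper's actual implementation:

```python
def prune_beam(candidates, beam_width):
    """Beam pruning sketch (hypothetical helper).

    `candidates` is a list of (state, score) pairs, where higher scores
    are better (e.g. log-probabilities). Any candidate scoring more than
    `beam_width` below the best is discarded, bounding the number of
    hypotheses carried forward at each step of the search.
    """
    if not candidates:
        return []
    best = max(score for _, score in candidates)
    return [(state, score) for state, score in candidates
            if score >= best - beam_width]
```

Bounding the active hypothesis set this way is what keeps memory independent of document length: only candidates near the current best survive from one step to the next, regardless of how many were generated.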

