Beam search pruning in speech recognition using a posterior probability-based confidence measure

2004 ◽  
Vol 42 (3-4) ◽  
pp. 409-428 ◽  
Author(s):  
Sherif Abdou ◽  
Michael S. Scordilis

Author(s):  
JOSEPH RAZIK ◽  
ODILE MELLA ◽  
DOMINIQUE FOHR ◽  
JEAN-PAUL HATON

In this paper, we introduce two new confidence measures for large vocabulary speech recognition systems. The major feature of these measures is that they can be computed without waiting for the end of the audio stream. We propose two kinds of confidence measures: frame-synchronous and local. The frame-synchronous measures can be computed as soon as a frame is processed by the recognition engine and are based on a likelihood ratio. The local measures estimate a local posterior probability in the vicinity of the word being analyzed. We evaluated our confidence measures within the framework of the automatic transcription of French broadcast news using the equal error rate (EER) criterion. Our local measures achieved results very close to the best state-of-the-art measure (EER of 23% compared to 22.0%). We then conducted a preliminary experiment to assess the contribution of our confidence measure to improving the comprehension of an automatic transcription for the hearing impaired. We introduced several modalities to highlight words of low confidence in the transcription and showed that these modalities, used with our local confidence measure, improved the comprehension of the automatic transcription.
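The frame-synchronous measure described above can be illustrated with a minimal sketch. This assumes hypothesis scores are available as log-likelihoods for each frame; the function name and interface are hypothetical, not taken from the paper:

```python
import math

def frame_confidence(hyp_loglik, competitor_logliks):
    """Frame-synchronous confidence as a likelihood ratio (illustrative sketch).

    Confidence = p(best hypothesis) / sum over all active hypotheses,
    computed in the log domain via log-sum-exp for numerical stability.
    `hyp_loglik` is the frame log-likelihood under the current best
    hypothesis; `competitor_logliks` holds the frame log-likelihoods of
    the other hypotheses still active in the beam.
    """
    all_logs = [hyp_loglik] + list(competitor_logliks)
    m = max(all_logs)  # shift by the max to avoid overflow in exp()
    log_denominator = m + math.log(sum(math.exp(x - m) for x in all_logs))
    return math.exp(hyp_loglik - log_denominator)
```

Because it depends only on scores already available when the frame is decoded, such a ratio can be emitted immediately, which is the property the abstract highlights for streaming use.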


2005 ◽  
Author(s):  
Abdulvohid Bozarov ◽  
Yoshinori Sagisaka ◽  
Ruiqiang Zhang ◽  
Genichiro Kikui

2011 ◽  
Vol 19 (1) ◽  
pp. 33-60 ◽  
Author(s):  
CHRISTOPH TILLMANN ◽  
SANJIKA HEWAVITHARANA

Abstract: The paper presents a novel unified algorithm for aligning sentences with their translations in bilingual data. With the help of ideas from a stack-based dynamic programming decoder for speech recognition (Ney 1984), the search is parametrized in a novel way such that the unified algorithm can be used on various types of data that have been previously handled by separate implementations: the extracted text chunk pairs can be either sub-sentential pairs, one-to-one, or many-to-many sentence-level pairs. The one-stage search algorithm is carried out in a single run over the data. Its memory requirements are independent of the length of the source document, and it is applicable to sentence-level parallel as well as comparable data. With the help of a unified beam-search candidate pruning, the algorithm is very efficient: it avoids any document-level pre-filtering and uses less restrictive sentence-level filtering. Results are presented on a Russian–English, a Spanish–English, and an Arabic–English extraction task. Based on simple word-based scoring features, text chunk pairs are extracted out of several trillion candidates, where the search is carried out on 300 processors in parallel.
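The beam-search candidate pruning mentioned above can be sketched in its simplest form: at each search step, keep only candidates whose score falls within a fixed margin of the current best. This is a generic sketch under that assumption; the helper name and score representation are hypothetical, not the paper's actual implementation:

```python
def prune_beam(candidates, beam_width):
    """Beam pruning sketch (hypothetical helper).

    `candidates` is a list of (state, score) pairs, where higher scores
    are better (e.g. log-probabilities). Any candidate scoring more than
    `beam_width` below the best is discarded, bounding the number of
    hypotheses carried forward at each step of the search.
    """
    if not candidates:
        return []
    best = max(score for _, score in candidates)
    return [(state, score) for state, score in candidates
            if score >= best - beam_width]
```

Bounding the active hypothesis set this way is what keeps memory independent of document length: only candidates near the current best survive from one step to the next, regardless of how many were generated.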

