A study of large vocabulary speech recognition decoding using finite-state graphs

Finitely subsequential transducers are efficient finite-state transducers with a finite number of final outputs and are used in a variety of applications. However, not all transducers admit equivalent finitely subsequential transducers. We briefly describe an existing generalized determinization algorithm for finitely subsequential transducers and give the first characterization of finitely subsequentiable transducers, that is, transducers that admit equivalent finitely subsequential transducers. Our characterization shows that an efficient algorithm for testing finite subsequentiability exists. We have fully implemented both the generalized determinization algorithm and the algorithm for testing finite subsequentiability, and we report experimental results showing that these algorithms are practical in large-vocabulary speech recognition applications. In theoretical terms, our results establish the equivalence of the following three properties for finite-state transducers: determinizability in the sense of the generalized algorithm, finite subsequentiability, and the twins property.
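As a rough illustration of the object the abstract describes, the sketch below models a finitely subsequential transducer as a machine that is deterministic on its input and attaches a finite list of final output strings to each final state. The class name, field names, and the homophone example are all hypothetical and chosen for illustration only; this is not the generalized determinization algorithm or the subsequentiability test from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class SubsequentialTransducer:
    """Hypothetical container for a finitely subsequential transducer."""

    start: int
    # (state, input symbol) -> (next state, output string).
    # A dict key holds a single entry, so the machine is deterministic on input.
    arcs: dict = field(default_factory=dict)
    # final state -> finite list of final output strings
    final_outputs: dict = field(default_factory=dict)

    def transduce(self, symbols):
        """Return every output string for the input, or [] if it is rejected."""
        state, out = self.start, ""
        for sym in symbols:
            if (state, sym) not in self.arcs:
                return []  # no transition: input rejected
            state, piece = self.arcs[(state, sym)]
            out += piece
        # One result per final output string; non-final states yield nothing.
        return [out + tail for tail in self.final_outputs.get(state, [])]


# Toy example (assumed, not from the paper): one phone sequence, two words.
toy = SubsequentialTransducer(
    start=0,
    arcs={(0, "r"): (1, ""), (1, "eh"): (2, ""), (2, "d"): (3, "")},
    final_outputs={3: ["red", "read"]},
)
results = toy.transduce(["r", "eh", "d"])
```

Here the ambiguity between the two words is carried entirely by the two final outputs at state 3, so the transitions themselves stay deterministic.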

A synchronized pruning composition algorithm of weighted finite state transducers for large vocabulary speech recognition

2012 8th International Symposium on Chinese Spoken Language Processing, 2012
DOI: 10.1109/iscslp.2012.6423474

Author(s): Zhiyang He, Ping Lv, Wei Li, Ji Wu

Keyword(s): Speech Recognition, Large Vocabulary, Finite State Transducers, Finite State, Large Vocabulary Speech Recognition, Weighted Finite State Transducers

A Generalized Dynamic Composition Algorithm of Weighted Finite State Transducers for Large Vocabulary Speech Recognition

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007
DOI: 10.1109/icassp.2007.366920
Cited By ~ 9

Author(s): Octavian Cheng, John Dines, Mathew Magimai Doss

Keyword(s): Speech Recognition, Large Vocabulary, Dynamic Composition, Finite State Transducers, Finite State, Large Vocabulary Speech Recognition, Weighted Finite State Transducers

FPGA Implementation of a Pipelined Gaussian Calculation for HMM-Based Large Vocabulary Speech Recognition

International Journal of Reconfigurable Computing, 2011, Vol 2011, pp. 1-10
DOI: 10.1155/2011/697080
Cited By ~ 5

Author(s): Richard Veitch, Louis-Marie Aubert, Roger Woods, Scott Fischaber

Keyword(s): Speech Recognition, Markov Models, Recognition System, Acoustic Modeling, Hardware Accelerator, Clock Frequency, Large Vocabulary, Finite State, Finite State Transducer, Large Vocabulary Speech Recognition

A scalable, large-vocabulary, speaker-independent speech recognition system is being developed using Hidden Markov Models (HMMs) for acoustic modeling and a Weighted Finite State Transducer (WFST) to compile sentence, word, and phoneme models. The system comprises a software backend search and an FPGA-based Gaussian calculation, both of which are covered here. In this paper, we present an efficient pipelined design implemented both as an embedded peripheral and as a scalable, parallel hardware accelerator. Both architectures have been implemented on an Alpha Data XRC-5T1 reconfigurable computer housing a Virtex 5 SX95T FPGA. The core has been tested and is capable of calculating a full set of Gaussian results from 3825 acoustic models in 9.03 ms, which, coupled with a backend search of 5000 words, has provided an accuracy of over 80%. Parallel implementations have been designed with up to 32 cores and have been successfully implemented with a clock frequency of 133 MHz.
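To make the "Gaussian calculation" concrete, here is a minimal software sketch of scoring one feature vector against a set of Gaussians. It assumes diagonal-covariance Gaussians, a common choice in HMM acoustic models that the abstract does not confirm; the function names and the tiny two-dimensional models are hypothetical, and nothing here reflects the paper's pipelined FPGA design.

```python
import math


def diag_gaussian_log_likelihood(x, mean, var):
    """Log-density of x under a Gaussian with diagonal covariance (assumed form)."""
    # Normalization term: -0.5 * sum(log(2 * pi * var_d)) over dimensions d.
    log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
    # Variance-weighted squared distance between x and the mean.
    dist = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    return log_norm - 0.5 * dist


def score_all(x, models):
    """Score one feature vector against every (mean, var) pair in models."""
    return [diag_gaussian_log_likelihood(x, mean, var) for mean, var in models]


# Hypothetical two-dimensional models, for illustration only.
models = [([0.0, 0.0], [1.0, 1.0]), ([1.0, 0.0], [1.0, 1.0])]
scores = score_all([0.0, 0.0], models)
```

For scale, the figures quoted in the abstract (3825 acoustic models in 9.03 ms) work out to roughly 2.36 µs per model.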
