A unified alignment algorithm for bilingual data

2011 ◽  
Vol 19 (1) ◽  
pp. 33-60 ◽  
Author(s):  
CHRISTOPH TILLMANN ◽  
SANJIKA HEWAVITHARANA

AbstractThe paper presents a novel unified algorithm for aligning sentences with their translations in bilingual data. With the help of ideas from a stack-based dynamic programming decoder for speech recognition (Ney 1984), the search is parametrized in a novel way such that the unified algorithm can be used on various types of data that have been previously handled by separate implementations: the extracted text chunk pairs can be either sub-sentential pairs, one-to-one, or many-to-many sentence-level pairs. The one-stage search algorithm is carried out in a single run over the data. Its memory requirements are independent of the length of the source document, and it is applicable to sentence-level parallel as well as comparable data. With the help of a unified beam-search candidate pruning, the algorithm is very efficient: it avoids any document-level pre-filtering and uses less restrictive sentence-level filtering. Results are presented on a Russian–English, a Spanish–English, and an Arabic–English extraction task. Based on simple word-based scoring features, text chunk pairs are extracted out of several trillion candidates, where the search is carried out on 300 processors in parallel.

2010 ◽  
Vol 7 (1) ◽  
pp. 75-84 ◽  
Author(s):  
Ye Tao ◽  
Li Xueqing ◽  
Wu Bian

This paper presents a novel alignment approach for imperfect speech and the corresponding transcription. The algorithm gets started with multi-stage sentence boundary detection in audio, followed by a dynamic programming based search, to find the optimal alignment and detect the mismatches at sentence level. Experiments show promising performance, compared to the traditional forced alignment approach. The proposed algorithm has already been applied in preparing multimedia content for an online English training platform.


2003 ◽  
Vol 29 (1) ◽  
pp. 97-133 ◽  
Author(s):  
Christoph Tillmann ◽  
Hermann Ney

In this article, we describe an efficient beam search algorithm for statistical machine translation based on dynamic programming (DP). The search algorithm uses the translation model presented in Brown et al. (1993). Starting from a DP-based solution to the traveling-salesman problem, we present a novel technique to restrict the possible word reorderings between source and target language in order to achieve an efficient search algorithm. Word reordering restrictions especially useful for the translation direction German to English are presented. The restrictions are generalized, and a set of four parameters to control the word reordering is introduced, which then can easily be adopted to new translation directions. The beam search procedure has been successfully tested on the Verbmobil task (German to English, 8,000-word vocabulary) and on the Canadian Hansards task (French to English, 100,000-word vocabulary). For the medium-sized Verbmobil task, a sentence can be translated in a few seconds, only a small number of search errors occur, and there is no performance degradation as measured by the word error criterion used in this article.


1989 ◽  
Vol 21 (8-9) ◽  
pp. 1057-1064 ◽  
Author(s):  
Vijay Joshi ◽  
Prasad Modak

Waste load allocation for rivers has been a topic of growing interest. Dynamic programming based algorithms are particularly attractive in this context and are widely reported in the literature. Codes developed for dynamic programming are however complex, require substantial computer resources and importantly do not allow interactions of the user. Further, there is always resistance to utilizing mathematical programming based algorithms for practical applications. There has been therefore always a gap between theory and practice in systems analysis in water quality management. This paper presents various heuristic algorithms to bridge this gap with supporting comparisons with dynamic programming based algorithms. These heuristics make a good use of the insight gained in the system's behaviour through experience, a process akin to the one adopted by field personnel and therefore can readily be understood by a user familiar with the system. Also they allow user preferences in decision making via on-line interaction. Experience has shown that these heuristics are indeed well founded and compare very favourably with the sophisticated dynamic programming algorithms. Two examples have been included which demonstrate such a success of the heuristic algorithms.


Sign in / Sign up

Export Citation Format

Share Document