A unified alignment algorithm for bilingual data

Abstract: The paper presents a novel unified algorithm for aligning sentences with their translations in bilingual data. With the help of ideas from a stack-based dynamic programming decoder for speech recognition (Ney 1984), the search is parametrized in a novel way such that the unified algorithm can be used on various types of data that have been previously handled by separate implementations: the extracted text chunk pairs can be either sub-sentential pairs, one-to-one, or many-to-many sentence-level pairs. The one-stage search algorithm is carried out in a single run over the data. Its memory requirements are independent of the length of the source document, and it is applicable to sentence-level parallel as well as comparable data. With the help of a unified beam-search candidate pruning, the algorithm is very efficient: it avoids any document-level pre-filtering and uses less restrictive sentence-level filtering. Results are presented on a Russian–English, a Spanish–English, and an Arabic–English extraction task. Based on simple word-based scoring features, text chunk pairs are extracted out of several trillion candidates, where the search is carried out on 300 processors in parallel.
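To make the idea of a beam-pruned dynamic programming alignment concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the move set `MOVES`, the word-count cost in `chunk_cost`, the skip penalty, and the `beam` width are all assumptions chosen for illustration, and the sketch keeps every layer in memory for an easy backtrace, whereas the paper reports memory use independent of the source document length.

```python
from typing import Dict, List, Optional, Tuple

State = Tuple[int, int]

# Hypothetical chunk shapes: (number of source sentences, number of target sentences).
MOVES = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]
SKIP_PENALTY = 3.0  # assumed cost of leaving a sentence unaligned


def chunk_cost(src: List[str], tgt: List[str]) -> float:
    """Toy cost: relative word-count mismatch between the two chunks."""
    if not src or not tgt:
        return SKIP_PENALTY
    s = sum(len(x.split()) for x in src)
    t = sum(len(x.split()) for x in tgt)
    return abs(s - t) / max(s, t)


def align(src: List[str], tgt: List[str], beam: float = 2.0):
    """Return aligned (source chunk, target chunk) pairs in document order."""
    n, m = len(src), len(tgt)
    # layers[d] holds all states (i, j) with i + j == d as (cost, backpointer).
    layers: List[Dict[State, Tuple[float, Optional[State]]]] = [
        {} for _ in range(n + m + 1)
    ]
    layers[0][(0, 0)] = (0.0, None)
    for d in range(n + m):
        layer = layers[d]
        threshold = min(cost for cost, _ in layer.values()) + beam
        for (i, j), (cost, _) in layer.items():
            if cost > threshold:
                continue  # beam pruning: drop states far worse than the layer's best
            for di, dj in MOVES:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                new_cost = cost + chunk_cost(src[i:ni], tgt[j:nj])
                cell = layers[ni + nj]
                if (ni, nj) not in cell or new_cost < cell[(ni, nj)][0]:
                    cell[(ni, nj)] = (new_cost, (i, j))
    # Follow backpointers from the final state to recover the chunk pairs.
    pairs = []
    state = (n, m)
    while state != (0, 0):
        _, prev = layers[sum(state)][state]
        pairs.append((src[prev[0]:state[0]], tgt[prev[1]:state[1]]))
        state = prev
    return pairs[::-1]
```

Calling `align(source_sentences, target_sentences)` returns a list of chunk pairs; an empty list on one side of a pair marks a sentence left unaligned. Swapping `chunk_cost` for a lexical score would bring the sketch closer to the word-based features the abstract mentions.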

A dynamic alignment algorithm for imperfect speech and transcript

Computer Science and Information Systems ◽

10.2298/csis1001075t ◽

2010 ◽

Vol 7 (1) ◽

pp. 75-84 ◽

Cited By ~ 4

Author(s):

Ye Tao ◽

Li Xueqing ◽

Wu Bian

Keyword(s):

Dynamic Programming ◽

Boundary Detection ◽

Multimedia Content ◽

Alignment Algorithm ◽

Optimal Alignment ◽

Multi Stage ◽

Sentence Level ◽

Sentence Boundary ◽

English Training ◽

Dynamic Alignment

This paper presents a novel alignment approach for imperfect speech and the corresponding transcription. The algorithm begins with multi-stage sentence boundary detection in the audio, followed by a dynamic programming based search that finds the optimal alignment and detects mismatches at the sentence level. Experiments show promising performance compared with the traditional forced alignment approach. The proposed algorithm has already been applied in preparing multimedia content for an online English training platform.
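The dynamic programming search stage can be illustrated with a small sentence-level aligner. This is only a sketch under stated assumptions: it presumes the audio has already been segmented and recognized into text (the boundary detection stage is not modeled), and the Jaccard word-overlap similarity and the `gap` penalty are hypothetical choices, not the scoring used in the paper.

```python
from typing import List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two sentences (assumed scoring)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 1.0


def align_transcript(
    transcript: List[str], recognized: List[str], gap: float = 0.3
) -> List[Tuple[Optional[int], Optional[int]]]:
    """Align transcript sentences to recognized segments.

    Returns (transcript_index, recognized_index) pairs; None on either
    side marks a sentence with no counterpart, i.e. a detected mismatch.
    """
    n, m = len(transcript), len(recognized)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], back[i][0] = score[i - 1][0] - gap, "skip_transcript"
    for j in range(1, m + 1):
        score[0][j], back[0][j] = score[0][j - 1] - gap, "skip_recognized"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sim = jaccard(transcript[i - 1], recognized[j - 1])
            options = [
                (score[i - 1][j - 1] + sim, "match"),
                (score[i - 1][j] - gap, "skip_transcript"),
                (score[i][j - 1] - gap, "skip_recognized"),
            ]
            score[i][j], back[i][j] = max(options)
    # Backtrace from the bottom-right cell to recover the alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        op = back[i][j]
        if op == "match":
            ops.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif op == "skip_transcript":
            ops.append((i - 1, None))
            i -= 1
        else:
            ops.append((None, j - 1))
            j -= 1
    return ops[::-1]
```

A pair such as `(1, None)` flags a transcript sentence that was never spoken, and `(None, k)` flags speech that is missing from the transcript.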

Word Reordering and a Dynamic Programming Beam Search Algorithm for Statistical Machine Translation

Computational Linguistics ◽

10.1162/089120103321337458 ◽

2003 ◽

Vol 29 (1) ◽

pp. 97-133 ◽

Cited By ~ 50

Author(s):

Christoph Tillmann ◽

Hermann Ney

Keyword(s):

Dynamic Programming ◽

Machine Translation ◽

Search Algorithm ◽

Statistical Machine Translation ◽

Target Language ◽

Search Procedure ◽

Beam Search ◽

Translation Model ◽

Novel Technique ◽

The Traveling Salesman Problem

In this article, we describe an efficient beam search algorithm for statistical machine translation based on dynamic programming (DP). The search algorithm uses the translation model presented in Brown et al. (1993). Starting from a DP-based solution to the traveling-salesman problem, we present a novel technique to restrict the possible word reorderings between source and target language in order to achieve an efficient search algorithm. Word reordering restrictions especially useful for the translation direction German to English are presented. The restrictions are generalized, and a set of four parameters to control the word reordering is introduced, which then can easily be adapted to new translation directions. The beam search procedure has been successfully tested on the Verbmobil task (German to English, 8,000-word vocabulary) and on the Canadian Hansards task (French to English, 100,000-word vocabulary). For the medium-sized Verbmobil task, a sentence can be translated in a few seconds, only a small number of search errors occur, and there is no performance degradation as measured by the word error criterion used in this article.
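For readers unfamiliar with the DP-based solution to the traveling-salesman problem that the abstract takes as its starting point, the following Python sketch shows the textbook Held-Karp recursion over subsets of visited cities. It is the generic algorithm only, not the authors' decoder: the function name `held_karp` and the distance-matrix input are illustrative assumptions, and the reordering restrictions and beam pruning described in the article are not included.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def held_karp(dist: List[List[float]]) -> float:
    """Cost of the cheapest tour that starts and ends at city 0.

    dist[i][j] is the cost of travelling from city i to city j.
    Runs in O(n^2 * 2^n) time, so it is only practical for small n.
    """
    n = len(dist)
    if n <= 1:
        return 0.0
    # best[(mask, j)]: cheapest path from city 0 through exactly the
    # cities in bitmask `mask`, ending at city j (city 0 is not in mask).
    best: Dict[Tuple[int, int], float] = {
        (1 << j, j): dist[0][j] for j in range(1, n)
    }
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask & ~(1 << j)  # same subset without the end city j
                best[(mask, j)] = min(
                    best[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every city except city 0
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))
```

The state is a pair of a visited set and a last city, which is why the table grows exponentially with the number of cities; restricting which subsets are reachable is one way to keep such a search tractable.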

A data-driven organization of the dynamic programming beam search for continuous speech recognition

10.1109/icassp.1987.1169844 ◽

2005 ◽

Cited By ~ 28

Author(s):

H. Ney ◽

D. Mergel ◽

A. Noll ◽

A. Paeseler

Keyword(s):

Dynamic Programming ◽

Speech Recognition ◽

Data Driven ◽

Beam Search ◽

Continuous Speech ◽

Continuous Speech Recognition

A Recovering Beam Search algorithm for the one-machine dynamic total completion time scheduling problem

Journal of the Operational Research Society ◽

10.1057/palgrave.jors.2601389 ◽

2002 ◽

Vol 53 (11) ◽

pp. 1275-1280 ◽

Cited By ~ 53

Author(s):

F Della Croce ◽

V T'kindt

Keyword(s):

Completion Time ◽

Search Algorithm ◽

Total Completion Time ◽

Scheduling Problem ◽

Beam Search ◽

Time Scheduling ◽

The One ◽

One Machine

Heuristic Algorithms for Waste Load Allocation in a River Basin

Water Science & Technology ◽

10.2166/wst.1989.0307 ◽

1989 ◽

Vol 21 (8-9) ◽

pp. 1057-1064 ◽

Cited By ~ 3

Author(s):

Vijay Joshi ◽

Prasad Modak

Keyword(s):

Dynamic Programming ◽

Heuristic Algorithms ◽

Water Quality Management ◽

User Preferences ◽

Theory And Practice ◽

Waste Load Allocation ◽

Practical Applications ◽

Waste Load ◽

On Line ◽

The One

Waste load allocation for rivers has been a topic of growing interest. Dynamic programming based algorithms are particularly attractive in this context and are widely reported in the literature. Codes developed for dynamic programming are, however, complex, require substantial computer resources and, importantly, do not allow user interaction. Further, there is always resistance to utilizing mathematical programming based algorithms for practical applications. There has therefore always been a gap between theory and practice in systems analysis in water quality management. This paper presents various heuristic algorithms to bridge this gap, with supporting comparisons against dynamic programming based algorithms. These heuristics make good use of the insight into the system's behaviour gained through experience, a process akin to the one adopted by field personnel, and can therefore readily be understood by a user familiar with the system. They also allow user preferences in decision making via on-line interaction. Experience has shown that these heuristics are indeed well founded and compare very favourably with the sophisticated dynamic programming algorithms. Two examples are included which demonstrate the success of the heuristic algorithms.