Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems

2020 ◽  
Vol 8 ◽  
pp. 393-408
Author(s):  
Xuan Zhang ◽  
Kevin Duh

Hyperparameter selection is a crucial part of building neural machine translation (NMT) systems across both academia and industry. Fine-grained adjustments to a model’s architecture or training recipe can mean the difference between a positive and a negative research result, or between a state-of-the-art and an underperforming system. While recent literature has proposed methods for automatic hyperparameter optimization (HPO), there has been limited work on applying these methods to NMT, due in part to the high costs associated with experiments that train large numbers of model variants. To facilitate research in this space, we introduce a lookup-based approach that uses a library of pre-trained models for fast, low-cost HPO experimentation. Our contributions include (1) the release of a large collection of trained NMT models covering a wide range of hyperparameters, (2) the proposal of targeted metrics for evaluating HPO methods on NMT, and (3) a reproducible benchmark of several HPO methods against our model library, including novel graph-based and multiobjective methods.
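To make the lookup-based setup concrete, here is a minimal sketch in which an HPO method "trains" a model by querying a table of pre-computed scores; the configurations, BLEU values, and function names are invented for illustration and are not taken from the released model library.

```python
# Minimal sketch of a lookup-based HPO benchmark (hypothetical data and names).
# Instead of training each NMT variant, an HPO method queries a table of
# pre-trained configurations and their recorded metrics.
import random

# Hypothetical pre-computed library: hyperparameter config -> dev BLEU.
MODEL_LIBRARY = {
    ("layers=2", "emb=256", "bpe=10k"): 27.4,
    ("layers=4", "emb=512", "bpe=10k"): 29.1,
    ("layers=6", "emb=512", "bpe=30k"): 30.2,
    ("layers=6", "emb=1024", "bpe=30k"): 29.8,
}

def lookup_bleu(config):
    """Simulate 'training' a model by looking up its stored score."""
    return MODEL_LIBRARY[config]

def random_search(budget, seed=0):
    """Baseline HPO method: sample configs without replacement, keep the best."""
    rng = random.Random(seed)
    configs = rng.sample(list(MODEL_LIBRARY), k=min(budget, len(MODEL_LIBRARY)))
    return max(configs, key=lookup_bleu)

best = random_search(budget=3)
print(best, lookup_bleu(best))
```

Because every query is a dictionary lookup, even expensive-looking HPO strategies can be evaluated in seconds, which is the efficiency argument behind such a benchmark.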

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Teja Kattenborn ◽  
Jana Eichel ◽  
Fabian Ewald Fassnacht

Recent technological advances in remote sensing sensors and platforms, such as high-resolution satellite imagers or unmanned aerial vehicles (UAV), facilitate the availability of fine-grained earth observation data. Such data reveal vegetation canopies in high spatial detail. Efficient methods are needed to fully harness this unprecedented source of information for vegetation mapping. Deep learning algorithms such as Convolutional Neural Networks (CNN) are currently paving new avenues in the field of image analysis and computer vision. Using multiple datasets, we test a CNN-based segmentation approach (U-net) in combination with training data derived directly from visual interpretation of UAV-based high-resolution RGB imagery for fine-grained mapping of vegetation species and communities. We demonstrate that this approach accurately segments and maps vegetation species and communities (at least 84% accuracy). The fact that we only used RGB imagery suggests that plant identification at very high spatial resolutions is driven by spatial patterns rather than spectral information. Accordingly, the presented approach is compatible with low-cost UAV systems that are easy to operate and thus accessible to a wide range of users.
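As a rough illustration of the segmentation setup, the sketch below defines a small U-Net-style encoder-decoder that maps RGB input to per-pixel class logits; the depth, channel widths, and number of classes are assumptions made for the example, not the configuration used in the study.

```python
# Illustrative PyTorch sketch of a small U-Net-style encoder-decoder for
# RGB-only semantic segmentation; architecture details here are assumptions.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        self.enc1 = conv_block(3, 32)        # RGB input only
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)       # 64 = 32 skip + 32 upsampled
        self.head = nn.Conv2d(32, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                    # full-resolution features
        e2 = self.enc2(self.pool(e1))        # downsampled features
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d1)                 # per-pixel class logits

logits = TinyUNet(n_classes=5)(torch.randn(1, 3, 128, 128))
print(logits.shape)  # torch.Size([1, 5, 128, 128])
```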


2018 ◽  
Author(s):  
Spencer A. Arbuckle ◽  
Atsushi Yokoi ◽  
J. Andrew Pruszynski ◽  
Jörn Diedrichsen

Fine-grained activity patterns, as measured with functional magnetic resonance imaging (fMRI), are thought to reflect underlying neural representations. Multivariate analysis techniques, such as representational similarity analysis (RSA), can be used to test models of brain representation by quantifying the representational geometry (the collection of pair-wise dissimilarities between activity patterns). One important caveat, however, is that non-linearities in the coupling between neural activity and the fMRI signal may lead to significant distortions in the representational geometry estimated from fMRI activity patterns. Here we tested the stability of representational dissimilarity measures in primary sensory-motor (S1 and M1) and early visual regions (V1/V2) across a large range of activation levels. Subjects were visually cued with different letters to perform single finger presses with one of the 5 fingers at a rate of 0.3-2.6 Hz. For each stimulation frequency, we quantified the difference between the 5 activity patterns in M1, S1, and V1/V2. We found that the representational geometry remained stable, even though the average activity increased over a large dynamic range. These results indicate that the representational geometry of fMRI activity patterns can be reliably assessed, largely independent of the average activity in the region. This has important methodological implications for RSA and other multivariate analysis approaches that use the representational geometry to make inferences about brain representations.
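A small sketch of how the representational geometry can be summarized as a representational dissimilarity matrix (RDM); the simulated data and the correlation-distance measure below are illustrative assumptions rather than the specific estimator used in the study.

```python
# Sketch of a representational dissimilarity matrix (RDM): pairwise
# dissimilarities between condition-specific activity patterns.
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
# Hypothetical ROI data: 5 finger conditions x 200 voxels of activity estimates.
patterns = rng.normal(size=(5, 200))

# Pairwise dissimilarities between the 5 finger activity patterns
# (correlation distance chosen here for illustration).
rdm_vec = pdist(patterns, metric="correlation")   # 10 pairwise values
rdm = squareform(rdm_vec)                         # 5 x 5 symmetric matrix
print(rdm.round(2))
```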


2018 ◽  
Vol 284 ◽  
pp. 171-176 ◽  
Author(s):  
Heeyoul Choi ◽  
Kyunghyun Cho ◽  
Yoshua Bengio

2020 ◽  
Vol 10 (4) ◽  
pp. 43
Author(s):  
Linda Alkhawaja ◽  
Hanan Ibrahim ◽  
Fida’ Ghnaim ◽  
Sirine Awwad

The neural machine translation (NMT) revolution is upon us. Since 2016, an increasing number of scientific publications have examined the improvements in the quality of machine translation (MT) systems. However, much remains to be done for specific language pairs, such as Arabic and English. This raises the question of whether NMT is a useful tool for translating text from English to Arabic. For this purpose, 100 English passages were obtained from different broadcasting websites and translated into Arabic using NMT in Google Translate. The NMT output was reviewed by three professional bilingual evaluators specializing in linguistics and translation, who scored the translations using a translation quality assessment (QA) model. First, the evaluators identified the most common errors that appeared in the translated text. Next, they rated the adequacy and fluency of the MT output on a 5-point scale. Our results indicate that mistranslation is the most common type of error, followed by corruption of the overall meaning of the sentence and orthographic errors. Nevertheless, the adequacy and fluency of the translated text are of acceptable quality. The results of our research can be used to improve the quality of Google NMT output.
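For concreteness, here is a toy sketch of how such ratings could be aggregated; the scores, error labels, and data structure are invented for illustration and are not the study's data.

```python
# Toy aggregation of a human MT evaluation: evaluators rate each output for
# adequacy and fluency on a 1-5 scale and tag error types (all values made up).
from collections import Counter
from statistics import mean

ratings = [
    {"adequacy": 4, "fluency": 3, "errors": ["mistranslation"]},
    {"adequacy": 5, "fluency": 4, "errors": []},
    {"adequacy": 3, "fluency": 3, "errors": ["mistranslation", "orthography"]},
]

print("mean adequacy:", mean(r["adequacy"] for r in ratings))
print("mean fluency:", mean(r["fluency"] for r in ratings))
print("error counts:", Counter(e for r in ratings for e in r["errors"]))
```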


Informatics ◽  
2019 ◽  
Vol 6 (3) ◽  
pp. 41 ◽  
Author(s):  
Jennifer Vardaro ◽  
Moritz Schaeffer ◽  
Silvia Hansen-Schirra

This study analyses how translation experts from the German department of the European Commission’s Directorate-General for Translation (DGT) identify and correct different error categories in neural machine translated texts (NMT) and their post-edited versions (NMTPE). The term translation expert encompasses translators, post-editors, and revisors. Even though we focus on neural machine-translated segments, translator and post-editor are used synonymously because of the combined workflow using CAT tools as well as machine translation. Only the distinction between post-editor, which refers to a DGT translation expert correcting the neural machine translation output, and revisor, which refers to a DGT translation expert correcting the post-edited version of the neural machine translation output, is important and is made explicit whenever relevant. Using an automatic error annotation tool and a more fine-grained manual error annotation framework to identify characteristic error categories in the DGT texts, a corpus analysis revealed that quality assurance measures by post-editors and revisors of the DGT are most often necessary for lexical errors. More specifically, the corpus analysis showed that, if post-editors correct mistranslations, terminology errors, or stylistic errors in an NMT sentence, revisors are likely to correct the same error type in the same post-edited sentence, suggesting that the DGT experts were primed by the NMT output. Subsequently, we designed a controlled eye-tracking and key-logging experiment to compare participants’ eye movements for test sentences containing the three identified error categories (mistranslations, terminology errors, or stylistic errors) and for control sentences without errors. We examined the three error types’ effects on early (first fixation durations, first pass durations) and late eye movement measures (e.g., total reading time and regression path durations). Linear mixed-effects regression models predict which kinds of behaviour of the DGT experts are associated with the correction of different error types during the post-editing process.
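A hedged sketch of the kind of linear mixed-effects model described above, fitted on synthetic data: an eye-movement measure is predicted from error category, with random intercepts per participant. Column names, condition labels, and effect sizes are assumptions, not the study's design matrix.

```python
# Synthetic-data sketch of a linear mixed-effects model for reading times.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
error_types = ["control", "mistranslation", "terminology", "style"]
rows = []
for participant in range(10):
    for error in error_types:
        for _ in range(20):  # 20 sentences per condition (assumed)
            base = {"control": 2000, "mistranslation": 2600,
                    "terminology": 2400, "style": 2300}[error]
            rows.append({"participant": participant,
                         "error_type": error,
                         "total_reading_time": base + rng.normal(0, 300)})
df = pd.DataFrame(rows)

# Fixed effect of error type, random intercept per participant.
model = smf.mixedlm("total_reading_time ~ error_type", df,
                    groups=df["participant"])
print(model.fit().summary())
```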


2021 ◽  
Vol 1 (2) ◽  
Author(s):  
Ling Liu ◽  
Zach Ryan ◽  
Mans Hulden

Bibles are available in a wide range of languages, which provides valuable parallel text since verses can be aligned accurately across all the different translations. How well can such data be utilized to train good neural machine translation (NMT) models? We are particularly interested in low-resource languages of high morphological complexity, and we attempt to answer this question by training and evaluating Basque-English and Navajo-English MT models with the Transformer architecture. Different tokenization methods are applied, among which syllabification turns out to be the most effective for Navajo and also performs well for Basque. Another data resource that can potentially be available for endangered languages is a dictionary of word or phrase translations, thanks to linguists’ work on language documentation. Could this data be leveraged to augment the Bible data for better performance? We experiment with different ways to utilize dictionary data and find that word-to-word mapping translation with a word-pair dictionary is more effective than low-resource techniques such as backtranslation or adding the dictionary data directly to the training set, though neither backtranslation nor word-to-word mapping translation produces improvements over using Bible data alone in our experiments.
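A rough sketch of the word-to-word mapping idea: monolingual source sentences are translated token by token through a bilingual dictionary to create additional synthetic parallel data. The dictionary entries below are invented placeholders, not real Basque or Navajo.

```python
# Word-to-word mapping translation with a word-pair dictionary (toy example).
word_dict = {"etxe": "house", "handi": "big", "da": "is", "hau": "this"}

def word_to_word_translate(sentence, dictionary):
    """Map each source token to its dictionary translation; keep unknowns as-is."""
    return " ".join(dictionary.get(tok, tok) for tok in sentence.split())

monolingual_source = ["hau etxe handi da"]
synthetic_parallel = [(src, word_to_word_translate(src, word_dict))
                      for src in monolingual_source]
print(synthetic_parallel)  # extra (source, target) pairs to add to training data
```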


2019 ◽  
Vol 50 (4) ◽  
pp. 693-702 ◽  
Author(s):  
Christine Holyfield ◽  
Sydney Brooks ◽  
Allison Schluterman

Purpose: Augmentative and alternative communication (AAC) is an intervention approach that can promote communication and language in children with multiple disabilities who are beginning communicators. While a wide range of AAC technologies are available, little is known about the comparative effects of specific technology options. Given that engagement can be low for beginning communicators with multiple disabilities, the current study provides initial information about the comparative effects on engagement of 2 AAC technology options: high-tech visual scene displays (VSDs) and low-tech isolated picture symbols.
Method: Three elementary-age beginning communicators with multiple disabilities participated. The study used a single-subject, alternating treatment design with each technology serving as a condition. Participants interacted with their school speech-language pathologists using each of the 2 technologies across 5 sessions in a block randomized order.
Results: According to visual analysis and nonoverlap of all pairs calculations, all 3 participants demonstrated more engagement with the high-tech VSDs than the low-tech isolated picture symbols, as measured by their seconds of gaze toward each technology option. Despite the difference in engagement observed, there was no clear difference across the 2 conditions in engagement toward the communication partner or use of the AAC.
Conclusions: Clinicians can consider measuring engagement when evaluating AAC technology options for children with multiple disabilities and should consider evaluating high-tech VSDs as 1 technology option for them. Future research must explore the extent to which differences in engagement with particular AAC technologies result in differences in communication and language learning over time, as might be expected.
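For reference, a minimal sketch of the nonoverlap of all pairs (NAP) statistic used alongside visual analysis; the gaze-duration values below are invented for illustration.

```python
# Nonoverlap of all pairs (NAP) between two conditions of an alternating
# treatment design: proportion of cross-condition pairs in which the
# treatment value exceeds the comparison value (ties count as half).
def nap(baseline, treatment):
    pairs = [(b, t) for b in baseline for t in treatment]
    score = sum(1.0 if t > b else 0.5 if t == b else 0.0 for b, t in pairs)
    return score / len(pairs)

low_tech_gaze = [42, 55, 48, 60, 51]    # seconds of gaze, low-tech symbols (made up)
high_tech_gaze = [80, 95, 77, 102, 88]  # seconds of gaze, high-tech VSDs (made up)
print(nap(low_tech_gaze, high_tech_gaze))  # 1.0 = complete nonoverlap
```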


2020 ◽  
Vol 7 (2) ◽  
pp. 34-41
Author(s):  
Vladimir Nikonov ◽  
Anton Zobov

The construction and selection of a suitable bijective function, that is, a substitution, is becoming an important applied task, particularly for building block encryption systems. Many articles have suggested different approaches to determining the quality of a substitution, but most of them are computationally expensive. Solving this problem would significantly expand the range of methods for constructing and analyzing schemes in information protection systems. The purpose of this research is to find easily measurable characteristics of substitutions that allow their quality to be evaluated, as well as measures of how close a particular substitution is to a random one, or how far it is from it. To this end, two characteristics are proposed in this work, a difference characteristic and a polynomial characteristic; their mathematical expectations are derived, as well as the variance of the difference characteristic. This makes it possible to judge the quality of a particular substitution by comparing its computed characteristic with the expected value. From a computational point of view, the results of the article are of particular interest due to the simplicity of the algorithm for quantifying the quality of bijective substitutions. By its nature, computing the difference characteristic amounts to a simple summation of integers within a small, fixed range. Such an operation, on both current and prospective hardware, is built into the logic of a wide range of functional elements, particularly when computations are implemented optically or on other carriers related to the field of nanotechnology.
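The abstract does not give the exact formula for the difference characteristic, so the sketch below assumes one plausible form (summing the per-position displacements of the substitution) purely to illustrate that the computation reduces to adding small integers in a fixed range.

```python
# Assumed illustrative definition of a "difference characteristic" of a
# substitution: the sum of per-position displacements |perm[i] - i|.
# The actual characteristic proposed in the article may differ.
import random

def difference_characteristic(perm):
    return sum(abs(p - i) for i, p in enumerate(perm))

n = 16
identity = list(range(n))
random_sub = random.sample(range(n), n)  # a random substitution on n symbols

print(difference_characteristic(identity))    # 0: far from a typical random substitution
print(difference_characteristic(random_sub))  # compare against the expected value
```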

