Non-Autoregressive Machine Translation with Auxiliary Regularization

Author(s):  
Yiren Wang ◽  
Fei Tian ◽  
Di He ◽  
Tao Qin ◽  
ChengXiang Zhai ◽  
...  

As a new neural machine translation approach, Non-Autoregressive machine Translation (NAT) has attracted attention recently due to its high efficiency in inference. However, the high efficiency has come at the cost of not capturing the sequential dependency on the target side of translation, which causes NAT to suffer from two kinds of translation errors: 1) repeated translations (due to indistinguishable adjacent decoder hidden states), and 2) incomplete translations (due to incomplete transfer of source side information via the decoder hidden states). In this paper, we propose to address these two problems by improving the quality of decoder hidden representations via two auxiliary regularization terms in the training process of an NAT model. First, to make the hidden states more distinguishable, we regularize the similarity between consecutive hidden states based on the corresponding target tokens. Second, to force the hidden states to contain all the information in the source sentence, we leverage the dual nature of translation tasks (e.g., English to German and German to English) and minimize a backward reconstruction error to ensure that the hidden states of the NAT decoder are able to recover the source side sentence. Extensive experiments conducted on several benchmark datasets show that both regularization strategies are effective and can alleviate the issues of repeated translations and incomplete translations in NAT models. The accuracy of NAT models is therefore improved significantly over the state-of-the-art NAT models with even better efficiency for inference.
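
The first regularizer described above can be illustrated with a small sketch. The abstract does not give the loss formula, so the rule used here is an assumption: when two adjacent target tokens are identical, the adjacent hidden states are pulled together (penalty 1 - cos), and otherwise they are pushed apart (penalty 1 + cos). The function names are hypothetical, and plain Python lists stand in for decoder hidden states.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_regularizer(hidden_states, target_tokens):
    """Average penalty over adjacent decoder positions.

    Assumed rule (not stated in the abstract):
      same adjacent target tokens      -> penalty 1 - cos (pull states together)
      different adjacent target tokens -> penalty 1 + cos (push states apart)
    """
    penalties = []
    for t in range(len(target_tokens) - 1):
        sim = cosine(hidden_states[t], hidden_states[t + 1])
        if target_tokens[t] == target_tokens[t + 1]:
            penalties.append(1.0 - sim)
        else:
            penalties.append(1.0 + sim)
    return sum(penalties) / len(penalties) if penalties else 0.0
```

Under this assumed rule, identical hidden states at positions with different target tokens receive the largest penalty, which is the situation the abstract associates with repeated translations.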

Author(s):  
Xiangpeng Wei ◽  
Yue Hu ◽  
Luxi Xing ◽  
Yipeng Wang ◽  
Li Gao

The dominant neural machine translation (NMT) models that are based on the encoder-decoder architecture have recently achieved state-of-the-art performance. Traditionally, NMT models depend only on the representations learned during training for mapping a source sentence into the target domain. However, the learned representations often suffer from implicit and inadequately informed properties. In this paper, we propose a novel bilingual topic enhanced NMT (BLT-NMT) model to improve translation performance by incorporating bilingual topic knowledge into NMT. Specifically, the bilingual topic knowledge is incorporated into the hidden states of both the encoder and the decoder, as well as the attention mechanism. With this new setting, the proposed BLT-NMT has access to the background knowledge implied in bilingual topics, which is beyond the sequential context, and enables the attention mechanism to attend to topic-level attentions for generating accurate target words during translation. Experimental results show that the proposed model consistently outperforms the traditional RNNsearch and the previous topic-informed NMT on Chinese-English and English-German translation tasks. We also introduce the bilingual topic knowledge into the newly emerged Transformer base model on English-German translation and achieve a notable improvement.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Muhammad Ashar Naveed ◽  
Muhammad Afnan Ansari ◽  
Inki Kim ◽  
Trevon Badloe ◽  
Joohoon Kim ◽  
...  

Helicity-multiplexed metasurfaces based on symmetric spin–orbit interactions (SOIs) have practical limits because they cannot provide central-symmetric holographic imaging. Asymmetric SOIs can effectively address such limitations, with several exciting applications in various fields ranging from asymmetric data inscription in communications to dual side displays in smart mobile devices. Low-loss dielectric materials provide an excellent platform for realizing such exotic phenomena efficiently. In this paper, we demonstrate an asymmetric SOI-dependent transmission-type metasurface in the visible domain using hydrogenated amorphous silicon (a-Si:H) nanoresonators. The proposed design approach is equipped with an additional degree of freedom in designing bi-directional helicity-multiplexed metasurfaces by breaking the conventional limit imposed by the symmetric SOI in half employment of metasurfaces for one circular handedness. Two on-axis, distinct wavefronts are produced with high transmission efficiencies, demonstrating the concept of asymmetric wavefront generation in two antiparallel directions. Additionally, the CMOS compatibility of a-Si:H makes it a cost-effective alternative to gallium nitride (GaN) and titanium dioxide (TiO2) for visible light. The cost-effective fabrication and simplicity of the proposed design technique provide an excellent candidate for high-efficiency, multifunctional, and chip-integrated demonstration of various phenomena.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Xujun Zhang ◽  
Chao Shen ◽  
Xueying Guo ◽  
Zhe Wang ◽  
Gaoqi Weng ◽  
...  

Virtual screening (VS) based on molecular docking has emerged as one of the mainstream technologies of drug discovery due to its low cost and high efficiency. However, the scoring functions (SFs) implemented in most docking programs are not always accurate enough and how to improve their prediction accuracy is still a big challenge. Here, we propose an integrated platform called ASFP, a web server for the development of customized SFs for structure-based VS. There are three main modules in ASFP: (1) the descriptor generation module that can generate up to 3437 descriptors for the modelling of protein–ligand interactions; (2) the AI-based SF construction module that can establish target-specific SFs based on the pre-generated descriptors through three machine learning (ML) techniques; (3) the online prediction module that provides some well-constructed target-specific SFs for VS and an additional generic SF for binding affinity prediction. Our methodology has been validated on several benchmark datasets. The target-specific SFs can achieve an average ROC AUC of 0.973 towards 32 targets and the generic SF can achieve the Pearson correlation coefficient of 0.81 on the PDBbind version 2016 core set. To sum up, the ASFP server is a powerful tool for structure-based VS.
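
The two evaluation metrics named above, ROC AUC and the Pearson correlation coefficient, can be computed as in the following minimal sketch. It uses the textbook definitions only and is not code from the ASFP server; the function names are hypothetical, and the inputs are assumed to be non-constant and to contain at least one example of each class.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Labels are 1 for actives and 0 for inactives.
    """
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    correct = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(pos) * len(neg))
```

The pairwise AUC loop is quadratic in the number of compounds, which is fine for illustration; a rank-based formulation would be the usual choice for large screening libraries.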


2012 ◽  
Vol 239-240 ◽  
pp. 1522-1527
Author(s):  
Wen Bo Wu ◽  
Yu Fu Jia ◽  
Hong Xing Sun

The bottleneck assignment (BA) and generalized assignment (GA) problems and their exact solutions are explored in this paper. First, a determinant elimination (DE) method is proposed based on a discussion of the time and space complexity of the enumeration method for both BA and GA problems. The optimization algorithm for the pre-assignment problem is then discussed, and adjustment and transformation of the cost matrix are adopted to reduce the computational complexity of the DE method. Finally, a synthesis method for both BA and GA problems is presented. Numerical experiments are carried out, and the results indicate that the proposed method is feasible and highly efficient.
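
For orientation, the enumeration baseline for the bottleneck assignment problem can be sketched as below. This is the brute-force approach whose complexity the abstract discusses, not the proposed DE method, whose details the abstract does not give. The function name is hypothetical, and the cost matrix is assumed to be square, with `cost[i][j]` the cost of assigning agent `i` to task `j`.

```python
from itertools import permutations


def bottleneck_assignment_bruteforce(cost):
    """Minimize the largest single cost over all one-to-one assignments.

    Enumerates all n! permutations, so it is practical only for small n.
    Returns (bottleneck_value, assignment), where assignment[i] is the task
    given to agent i.
    """
    n = len(cost)
    best_value, best_perm = None, None
    for perm in permutations(range(n)):
        # The bottleneck of an assignment is its most expensive pairing.
        value = max(cost[i][perm[i]] for i in range(n))
        if best_value is None or value < best_value:
            best_value, best_perm = value, perm
    return best_value, best_perm
```

The factorial growth of this loop is the reason an exact method with lower complexity is of interest.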


2011 ◽  
Vol 204-210 ◽  
pp. 1415-1418
Author(s):  
De Jiang Zhang ◽  
Na Na Dong ◽  
Xiao Mei Lin

By studying conventional contour extraction algorithms, a new method for extracting the contours of cerebral blood vessels is proposed based on the maximum optimization cost (MOC). First, the method computes the gray-level differential of the image by a conventional differential method to build the cost space. Then, using dynamic programming, the maximum optimization cost curve in the space is extracted to serve as the cerebrovascular profile. Experiments show that this method extracts the cerebrovascular contour with high efficiency and positions it with high accuracy, and that it reduces the ambiguity in the target image caused by noise, improving the anti-interference ability of contour extraction.
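
The dynamic programming step can be illustrated with a generic maximum-cost path search over a 2D cost grid. The abstract does not specify the connectivity or the exact cost definition, so the setup here is an assumption: the curve visits one cell per column and may move at most one row up or down between neighboring columns. The function name is hypothetical.

```python
def max_cost_path(cost):
    """Find the left-to-right path with the largest accumulated cost.

    cost is a list of rows. The path takes one cell per column and may shift
    by at most one row between adjacent columns (an assumed constraint).
    Returns (total_cost, rows), where rows[c] is the row chosen in column c.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    best = [[0.0] * n_cols for _ in range(n_rows)]
    back = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        best[r][0] = cost[r][0]
    for c in range(1, n_cols):
        for r in range(n_rows):
            # Only neighboring rows in the previous column are reachable.
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < n_rows]
            prev = max(candidates, key=lambda p: best[p][c - 1])
            best[r][c] = best[prev][c - 1] + cost[r][c]
            back[r][c] = prev
    end = max(range(n_rows), key=lambda r: best[r][n_cols - 1])
    total = best[end][n_cols - 1]
    # Walk the back-pointers from the last column to the first.
    path = [end]
    for c in range(n_cols - 1, 0, -1):
        path.append(back[path[-1]][c])
    path.reverse()
    return total, path
```

In an image setting, `cost` would hold the gray-level differential at each pixel, so the extracted path follows the strongest edge response across the image.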


2015 ◽  
Vol 1094 ◽  
pp. 445-450 ◽  
Author(s):  
Wei Chen ◽  
Hong Hao Ma ◽  
Zhao Wu Shen ◽  
De Bao Wang

To address the inefficiency of cut blasting in rock excavation and rock breaking, a shell radial shaped charge device was proposed based on the idea of ‘cutting to slotting’ and validated through experiments. In this device, the shell material serves as the shaped material, and multiple shaped rings are designed on the circular tube. This not only reduces the charge quantity but also raises the utilization ratio of explosive energy. After the explosion, multiple radial shaped charge jets form in sequence along the axial line and crack the surrounding rock mass. A crack network then forms as the fractures extend further under the quasi-static loading of the detonation gas. The device was tested in a cut blasting model experiment. Experimental results show that the utilization ratio of the blasting hole approaches 98% with this device. The blasting efficiency and cyclical footage can be improved effectively, and the cost of drifting can also be reduced.


2021 ◽  
Vol 15 (3) ◽  
pp. 1-33
Author(s):  
Jingjing Wang ◽  
Wenjun Jiang ◽  
Kenli Li ◽  
Keqin Li

CANDECOMP/PARAFAC (CP) decomposition is widely used in various online social network (OSN) applications. However, it is inefficient when dealing with massive and incremental data. Some incremental CP decomposition (ICP) methods have been proposed to improve the efficiency and process evolving data, by updating decomposition results according to the newly added data. The ICP methods are efficient, but inaccurate because of serious error accumulation caused by approximation in the incremental updating. To promote the wide use of ICP, we strive to reduce its cumulative errors while keeping high efficiency. We first differentiate all possible errors in ICP into two types: the cumulative reconstruction error and the prediction error. Next, we formulate two optimization problems for reducing the two errors. Then, we propose several restarting strategies to address the two problems. Finally, we test the effectiveness in three typical dynamic OSN applications. To the best of our knowledge, this is the first work on reducing the cumulative errors of the ICP methods in dynamic OSNs.
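
To make the reconstruction error concrete, the sketch below rebuilds a 3-way tensor from CP factor matrices and measures the relative error against the observed tensor. The threshold-based restart check is a hypothetical illustration of what a restarting strategy could look like; the abstract does not describe the actual strategies, and all function names and the default threshold are assumptions. The observed tensor is assumed to be non-zero.

```python
import math


def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factors.

    A, B, C are factor matrices (lists of rows) with the same number of
    columns (the rank). Entry (i, j, k) is sum_r A[i][r] * B[j][r] * C[k][r].
    """
    rank = len(A[0])
    return [
        [
            [sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
             for k in range(len(C))]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


def relative_error(tensor, A, B, C):
    """Frobenius-norm error of the CP approximation, relative to the tensor."""
    approx = cp_reconstruct(A, B, C)
    num = 0.0
    den = 0.0
    for slab_x, slab_a in zip(tensor, approx):
        for row_x, row_a in zip(slab_x, slab_a):
            for x, a in zip(row_x, row_a):
                num += (x - a) ** 2
                den += x ** 2
    return math.sqrt(num / den)


def should_restart(tensor, A, B, C, threshold=0.1):
    """Hypothetical rule: recompute from scratch once the error drifts too far."""
    return relative_error(tensor, A, B, C) > threshold
```

A check of this kind trades the cost of a full decomposition against the accuracy lost to accumulated approximation in the incremental updates.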


2020 ◽  
Vol 34 (05) ◽  
pp. 7839-7846
Author(s):  
Junliang Guo ◽  
Xu Tan ◽  
Linli Xu ◽  
Tao Qin ◽  
Enhong Chen ◽  
...  

Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and both of them share the same model configurations, a natural idea to improve the accuracy of NAT models is to transfer a well-trained AT model to an NAT model through fine-tuning. However, since AT and NAT models differ greatly in training strategy, straightforward fine-tuning does not work well. In this work, we introduce curriculum learning into fine-tuning for NAT. Specifically, we design a curriculum in the fine-tuning process to progressively switch the training from autoregressive generation to non-autoregressive generation. Experiments on four benchmark translation datasets show that the proposed method achieves a good improvement (more than 1 BLEU score) over previous NAT baselines in terms of translation accuracy, and greatly speeds up (more than 10 times) the inference process over AT baselines.
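
The idea of progressively switching from autoregressive to non-autoregressive training can be sketched with a pacing function. The abstract does not specify the schedule, so the linear pacing below is an assumption, and the function names are hypothetical: at each fine-tuning step, a training example is handled in NAT style with a probability that grows from 0 to 1.

```python
import random


def nat_probability(step, total_steps):
    """Assumed linear pacing: 0.0 at the start of fine-tuning, 1.0 at the end."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    return min(1.0, max(0.0, step / total_steps))


def choose_mode(step, total_steps, rng):
    """Pick the training mode for one example at the given step.

    rng is a random.Random instance, passed in so runs are reproducible.
    """
    if rng.random() < nat_probability(step, total_steps):
        return "NAT"
    return "AT"
```

With this schedule, early steps are almost entirely autoregressive, and the final steps are entirely non-autoregressive, so the model is never asked to change training regime abruptly.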


2019 ◽  
Vol 22 (64) ◽  
pp. 63-84
Author(s):  
JanapatyI Naga Muneiah ◽  
Ch D V SubbaRao

Enterprises often classify their customers by degree of profitability in decreasing order, as C1, C2, ..., Cn. Generally, customers in class Cn yield zero profit since they migrate to a competitor. They are called attritors (or churners) and are the prime reason for the huge losses of enterprises. Customers of the intermediate classes, meanwhile, are reluctant, offer insignificant profits in varying degrees, and lead to uncertainty. Various data mining models, such as decision trees, which are built from customers’ profiles, are limited to classifying customers as attritors or non-attritors and do not provide profitable actionable knowledge. In this paper, we present an efficient algorithm for the automatic extraction of profit-maximizing knowledge for business applications with multi-class customers by postprocessing the probability estimation decision tree (PET). When the PET predicts that a customer belongs to one of the less profitable classes, our algorithm suggests cost-sensitive actions to move her/him to the highest possible profitable status. In the proposed novel approach, the PET is represented in compressed form as a bit-pattern matrix, and the postprocessing task is performed on the bit patterns by applying bitwise AND operations. The computational performance of the proposed method is strong due to the employment of effective data structures. Substantial experiments conducted on UCI datasets, real mobile phone service data and other benchmark datasets demonstrate that the proposed method remarkably outperforms the state-of-the-art methods.
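
One way to picture the bit-pattern representation is sketched below. The encoding is an assumption, since the abstract does not define the matrix layout: each root-to-leaf path of the tree is stored as an integer bitmask with one bit per attribute test, and a bitwise AND between the customer's current leaf and a more profitable leaf shows which conditions already hold. The condition names and function names are hypothetical.

```python
# Hypothetical attribute tests; bit i of a mask corresponds to CONDITIONS[i].
CONDITIONS = ["plan=premium", "contract=long", "support_calls=low", "autopay=yes"]


def encode(conditions):
    """Encode a set of satisfied conditions (a leaf's path) as a bitmask."""
    mask = 0
    for cond in conditions:
        mask |= 1 << CONDITIONS.index(cond)
    return mask


def decode(mask):
    """List the conditions whose bits are set in the mask."""
    return [c for i, c in enumerate(CONDITIONS) if mask & (1 << i)]


def suggest_changes(current_leaf, target_leaf):
    """Compare two leaf bitmasks.

    Returns (shared, missing): conditions both leaves require, and conditions
    the target leaf requires that the current leaf does not satisfy.
    """
    shared = current_leaf & target_leaf
    missing = target_leaf & ~shared
    return decode(shared), decode(missing)
```

In this picture, the `missing` conditions are the candidate actions, and a cost-sensitive step would then weigh the cost of each change against the expected profit gain.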


2019 ◽  
Vol 22 (4) ◽  
pp. 329-334
Author(s):  
Noora Saad Faraj Al-Dulaimi ◽  
Samara Saad Faraj Al-Dulaimi

Providing clean, high-quality drinking water to both rural and urban areas is a great challenge in itself. The large volumes required in highly populated areas add a very high cost, mainly because of the expensive commercially available adsorbents used in the treatment process. This has led inhabitants of remote and rural areas to use lower-quality water, with all its risks and health challenges. In this study, locally collected rice husk is tested as an alternative to the expensive common commercial adsorbents. Adsorbent dosage, initial turbidity, and pH level were tested to investigate their effects on the process. Synthetic turbid water was treated while varying these parameters one at a time to measure the effect of each, and the results identified a set of parameters that achieves high turbidity removal efficiency. The study concluded that rice husk can be used as a cheap alternative adsorbent to reduce river water turbidity, owing to its availability and low cost, with a removal efficiency approaching 95%.
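
Removal efficiency is conventionally computed as the percentage drop from the initial to the final turbidity. The sketch below uses that standard definition; the abstract does not state the formula, the function name is hypothetical, and any numbers passed to it here are illustrative rather than measurements from the study.

```python
def removal_efficiency(initial_turbidity, final_turbidity):
    """Percentage of turbidity removed, with both values in the same unit."""
    if initial_turbidity <= 0:
        raise ValueError("initial_turbidity must be positive")
    return (initial_turbidity - final_turbidity) / initial_turbidity * 100.0
```

For example, a sample that drops from 100 to 5 turbidity units corresponds to a removal efficiency of about 95%.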

