Distilling the Knowledge of Large-scale Generative Models into Retrieval Models for Efficient Open-domain Conversation

Author(s):  
Beomsu Kim ◽  
Seokjun Seo ◽  
Seungju Han ◽  
Enkhbayar Erdenee ◽  
Buru Chang
Author(s):  
Yu Wu ◽  
Furu Wei ◽  
Shaohan Huang ◽  
Yunli Wang ◽  
Zhoujun Li ◽  
...  

Open-domain response generation has achieved remarkable progress in recent years, but sometimes yields short and uninformative responses. We propose a new paradigm for response generation, prototype-then-edit: first retrieve a prototype response from a pre-defined index, then edit the prototype response according to the differences between the prototype context and the current context. Our motivation is that the retrieved prototype provides a good starting point for generation because it is grammatical and informative, and the post-editing process further improves the relevance and coherence of the prototype. In practice, we design a context-aware editing model built upon an encoder-decoder framework augmented with an edit vector. We first generate an edit vector by considering lexical differences between the prototype context and the current context. After that, the edit vector and the prototype response representation are fed to a decoder to generate a new response. Experimental results on a large-scale dataset demonstrate that our new paradigm significantly increases the relevance, diversity, and originality of generation results compared to traditional generative models. Furthermore, our model outperforms retrieval-based methods in terms of relevance and originality.
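As a rough illustration of the edit-vector mechanism described above, the following sketch (in PyTorch, with toy dimensions and randomly initialised weights; the paper's actual encoder-decoder and attention details differ) builds an edit vector from the words inserted into and deleted from the prototype context, then conditions a decoder on it at every step:

```python
# A minimal sketch of the prototype-then-edit idea; the id tensors and
# layer sizes are toy stand-ins, not the paper's configuration.
import torch
import torch.nn as nn

class EditVector(nn.Module):
    """Builds an edit vector from words inserted into / deleted from
    the current context relative to the prototype context."""
    def __init__(self, vocab_size, emb_dim, edit_dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.proj = nn.Linear(2 * emb_dim, edit_dim)

    def forward(self, inserted_ids, deleted_ids):
        ins = self.emb(inserted_ids).mean(dim=1)    # bag of inserted words
        dele = self.emb(deleted_ids).mean(dim=1)    # bag of deleted words
        return torch.tanh(self.proj(torch.cat([ins, dele], dim=-1)))

class PrototypeEditor(nn.Module):
    """GRU decoder that rewrites the prototype response, conditioned
    on the edit vector at every decoding step."""
    def __init__(self, vocab_size, emb_dim=64, hid=128, edit_dim=32):
        super().__init__()
        self.edit = EditVector(vocab_size, emb_dim, edit_dim)
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid, batch_first=True)
        self.decoder = nn.GRU(emb_dim + edit_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, prototype_ids, inserted_ids, deleted_ids, target_ids):
        _, h = self.encoder(self.emb(prototype_ids))  # encode prototype response
        e = self.edit(inserted_ids, deleted_ids)      # edit vector
        tgt = self.emb(target_ids)
        e_steps = e.unsqueeze(1).expand(-1, tgt.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([tgt, e_steps], dim=-1), h)
        return self.out(dec_out)                      # next-token logits

model = PrototypeEditor(vocab_size=1000)
logits = model(torch.randint(0, 1000, (2, 12)),   # prototype response
               torch.randint(0, 1000, (2, 3)),    # inserted context words
               torch.randint(0, 1000, (2, 3)),    # deleted context words
               torch.randint(0, 1000, (2, 10)))   # shifted target tokens
print(logits.shape)                               # torch.Size([2, 10, 1000])
```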


2022 ◽  
Vol 40 (1) ◽  
pp. 1-44
Author(s):  
Longxuan Ma ◽  
Mingda Li ◽  
Wei-Nan Zhang ◽  
Jiapeng Li ◽  
Ting Liu

Incorporating external knowledge into dialogue generation has been proven to benefit the performance of an open-domain Dialogue System (DS), such as generating informative or stylized responses and controlling conversation topics. In this article, we study the open-domain DS that uses unstructured text as its external knowledge source (Unstructured Text Enhanced Dialogue System (UTEDS)). The use of unstructured text entails distinctions between UTEDS and traditional data-driven DS, and we aim to analyze these differences. We first define UTEDS-related concepts, then summarize the recently released datasets and models. We categorize UTEDS into Retrieval and Generative models and introduce them from the perspective of model components. The retrieval models consist of Fusion, Matching, and Ranking modules, while the generative models comprise Dialogue and Knowledge Encoding, Knowledge Selection (KS), and Response Generation modules. We further summarize the evaluation methods utilized in UTEDS and analyze the current models' performance. Finally, we discuss the future development trends of UTEDS, hoping to inspire new research in this field.
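To make the Knowledge Selection (KS) module concrete, here is a minimal, illustrative sketch of dot-product matching between a dialogue context and candidate knowledge sentences; the mean-of-random-word-vectors encoder is a stand-in for the trained encoders surveyed in the article:

```python
# Toy knowledge selection: score candidate knowledge sentences against
# the dialogue context and keep the best one for response generation.
import numpy as np

rng = np.random.default_rng(0)
emb = {w: rng.normal(size=16) for w in
       "where is the eiffel tower it stands in paris france".split()}

def encode(sentence):
    # Stand-in sentence encoder: mean of (random) word vectors.
    vecs = [emb[w] for w in sentence.split() if w in emb]
    return np.mean(vecs, axis=0)

context = "where is the eiffel tower"
knowledge = ["it stands in paris france", "the tower is iron"]

c = encode(context)
scores = [float(encode(k) @ c) for k in knowledge]   # dot-product matching
selected = knowledge[int(np.argmax(scores))]
print(selected)  # the sentence most similar to the dialogue context
```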


Author(s):  
Cao Liu ◽  
Shizhu He ◽  
Kang Liu ◽  
Jun Zhao

Because they provide natural language responses, natural answers are favored in real-world Question Answering (QA) systems. Generative models learn to automatically generate natural answers from large-scale question-answer pairs (QA-pairs). However, they suffer from the uncontrollable and uneven quality of QA-pairs crawled from the Internet. To address this problem, we propose a curriculum learning based framework for natural answer generation (CL-NAG), which is able to take full advantage of the valuable learning data in a noisy and uneven-quality corpus. Specifically, we employ two practical measures to automatically assess the quality (complexity) of QA-pairs. Based on these measurements, CL-NAG first uses simple and low-quality QA-pairs to learn a basic model, and then gradually learns to produce better answers with richer content and more complete syntax from more complex and higher-quality QA-pairs. In this way, all valuable information in the noisy and uneven-quality corpus can be fully exploited. Experiments demonstrate that CL-NAG outperforms the state of the art, improving accuracy by 6.8% and 8.7% on simple and complex questions, respectively.
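The curriculum idea can be sketched in a few lines. In the sketch below, answer length stands in for the paper's two quality/complexity measures (which are not reproduced here), and each stage re-exposes all data seen so far, easiest first:

```python
# Hedged sketch of a curriculum schedule over QA-pairs.
def complexity(qa_pair):
    _, answer = qa_pair
    return len(answer.split())          # stand-in difficulty score

def curriculum_batches(qa_pairs, stages=3):
    ordered = sorted(qa_pairs, key=complexity)
    step = max(1, len(ordered) // stages)
    for s in range(stages):
        # each stage re-exposes all data seen so far, easy examples first
        yield ordered[: step * (s + 1)]

qa_pairs = [("capital of france?", "paris"),
            ("why is the sky blue?", "sunlight scatters off molecules, "
             "and blue light scatters most because of its short wavelength")]
for stage, batch in enumerate(curriculum_batches(qa_pairs)):
    print(stage, len(batch))            # model.train(batch) would go here
```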


Author(s):  
Alfio Massimiliano Gliozzo ◽  
Aditya Kalyanpur

Automatic open-domain Question Answering has been a long-standing research challenge in the AI community. IBM Research undertook this challenge with the design of the DeepQA architecture and the implementation of Watson. This paper addresses a specific subtask of DeepQA: predicting the Lexical Answer Type (LAT) of a question. Our approach is completely unsupervised and is based on PRISMATIC, a large-scale lexical knowledge base automatically extracted from a Web corpus. Experiments on Jeopardy! data show that it is possible to correctly predict the LAT in a substantial number of questions. This approach can be used for general-purpose knowledge acquisition tasks such as frame induction from text.
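For illustration only, one way an unsupervised predictor could exploit a frequency-based resource like PRISMATIC is to score candidate types for a question's focus word by corpus frame frequency; the counts and the scoring rule below are invented for the sketch and are not the paper's actual method:

```python
# Toy unsupervised LAT scoring from (focus word, type) frame counts.
frame_counts = {  # (focus word, candidate type) -> invented corpus frequency
    ("president", "person"): 9120,
    ("president", "country"): 310,
    ("river", "location"): 4800,
    ("river", "person"): 12,
}

def predict_lat(focus, candidate_types):
    scored = [(frame_counts.get((focus, t), 0), t) for t in candidate_types]
    return max(scored)[1]   # highest-frequency type wins

print(predict_lat("president", ["person", "country"]))  # -> person
```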


2019 ◽  
Vol 3 (4) ◽  
pp. 902-904
Author(s):  
Alexander Peyser ◽  
Sandra Diaz Pier ◽  
Wouter Klijn ◽  
Abigail Morrison ◽  
Jochen Triesch

Large-scale in silico experimentation depends on the generation of connectomes beyond available anatomical structure. We suggest that linking research across the fields of experimental connectomics, theoretical neuroscience, and high-performance computing can enable a new generation of models bridging the gap between biophysical detail and global function. This Focus Feature on "Linking Experimental and Computational Connectomics" aims to bring together some examples from these domains as a step toward the development of more comprehensive generative models of multiscale connectomes.
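As a minimal illustration of what a generative connectome model can look like, the sketch below places neurons at random positions and connects pairs with a probability that decays exponentially with distance; the parameters are arbitrary, and real models add cell types, layers, and plasticity:

```python
# Toy distance-dependent generative connectome rule.
import numpy as np

rng = np.random.default_rng(42)
n = 200
pos = rng.uniform(0.0, 1.0, size=(n, 3))            # neuron positions (mm)

d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
p_connect = 0.8 * np.exp(-d / 0.15)                 # exponential decay
adj = rng.random((n, n)) < p_connect
np.fill_diagonal(adj, False)                        # no self-connections

print(f"{adj.sum()} synapses among {n} neurons")
```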


2021 ◽  
Author(s):  
Fergus Imrie ◽  
Thomas E. Hadfield ◽  
Anthony R. Bradley ◽  
Charlotte M. Deane

Generative models have increasingly been proposed as a solution to the molecular design problem. However, it has proved challenging to control the design process or incorporate prior knowledge, limiting their practical use in drug discovery. In particular, generative methods have made limited use of three-dimensional (3D) structural information even though this is critical to binding. This work describes a method to incorporate such information and demonstrates the benefit of doing so. We combine an existing graph-based deep generative model, DeLinker, with a convolutional neural network to utilise physically meaningful 3D representations of molecules and target pharmacophores. We apply our model, DEVELOP, to both linker and R-group design, demonstrating its suitability for both hit-to-lead and lead optimisation. The 3D pharmacophoric information results in improved generation and allows greater control of the design process. In multiple large-scale evaluations, we show that including 3D pharmacophoric constraints results in substantial improvements in the quality of generated molecules. On a challenging test set derived from PDBbind, our model improves the proportion of generated molecules with high 3D similarity to the original molecule by over 300%. In addition, DEVELOP recovers 10× more of the original molecules than the baseline DeLinker method. Our approach is general-purpose, readily modifiable to alternate 3D representations, and can be incorporated into other generative frameworks. Code is available at https://github.com/oxpig/DEVELOP.
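A hedged sketch of the conditioning idea: a small 3D CNN (in PyTorch, with illustrative shapes and layer sizes that do not match DEVELOP's actual architecture) encodes a voxelised pharmacophore grid into a vector that could condition each step of a graph-based generator:

```python
# Toy 3D pharmacophore encoder producing a conditioning vector.
import torch
import torch.nn as nn

class PharmacophoreEncoder(nn.Module):
    def __init__(self, channels=4, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(32, out_dim),
        )

    def forward(self, grid):             # grid: (batch, channels, D, H, W)
        return self.net(grid)

# Conditioning a (stand-in) node-expansion step of a graph generator:
grid = torch.randn(1, 4, 24, 24, 24)     # voxelised pharmacophore features
cond = PharmacophoreEncoder()(grid)      # (1, 64) conditioning vector
node_state = torch.randn(1, 128)         # hidden state of a partial graph
step_input = torch.cat([node_state, cond], dim=-1)  # steers generation
print(step_input.shape)                  # torch.Size([1, 192])
```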


2020 ◽  
Author(s):  
Jonathan Frazer ◽  
Pascal Notin ◽  
Mafalda Dias ◽  
Aidan Gomez ◽  
Kelly Brock ◽  
...  

Quantifying the pathogenicity of protein variants in human disease-related genes would have a profound impact on clinical decisions, yet the overwhelming majority (over 98%) of these variants still have unknown consequences [1–3]. In principle, computational methods could support the large-scale interpretation of genetic variants. However, prior methods [4–7] have relied on training machine learning models on available clinical labels. Since these labels are sparse, biased, and of variable quality, the resulting models have been considered insufficiently reliable [8]. By contrast, our approach leverages deep generative models to predict the clinical significance of protein variants without relying on labels. The natural distribution of protein sequences we observe across organisms is the result of billions of evolutionary experiments [9,10]. By modeling that distribution, we implicitly capture constraints on the protein sequences that maintain fitness. Our model EVE (Evolutionary model of Variant Effect) not only outperforms computational approaches that rely on labelled data, but also performs on par with, if not better than, high-throughput assays which are increasingly used as strong evidence for variant classification [11–23]. After thorough validation on clinical labels, we predict the pathogenicity of 11 million variants across 1,081 disease genes, and assign high-confidence reclassification for 72,000 Variants of Unknown Significance [8]. Our work suggests that models of evolutionary information can provide a strong source of independent evidence for variant interpretation and that the approach will be widely useful in research and clinical settings.
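The label-free scoring principle can be illustrated with a much simpler generative model than EVE's Bayesian VAE: fit a model of homologous sequences and score a variant by its log-likelihood relative to wild type. In the sketch below, an independent-site profile model with add-one smoothing stands in for the VAE, and the toy alignment is invented:

```python
# Toy label-free variant scoring via a log-likelihood ratio.
import numpy as np

alignment = ["MKV", "MKL", "MRV", "MKV"]         # toy homolog alignment
aas = sorted({a for s in alignment for a in s})
idx = {a: i for i, a in enumerate(aas)}

counts = np.ones((len(alignment[0]), len(aas)))  # add-one smoothing
for seq in alignment:
    for pos, aa in enumerate(seq):
        counts[pos, idx[aa]] += 1
log_p = np.log(counts / counts.sum(axis=1, keepdims=True))

def score_variant(wild, variant):
    """log p(variant) - log p(wild); more negative = more deleterious."""
    return sum(log_p[i, idx[v]] - log_p[i, idx[w]]
               for i, (w, v) in enumerate(zip(wild, variant)) if w != v)

print(score_variant("MKV", "MRV"))   # substitution seen among homologs
```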


2019 ◽  
Vol 53 (2) ◽  
pp. 97-97
Author(s):  
Qingyao Ai

Information Retrieval (IR) concerns the structure, analysis, organization, storage, and retrieval of information. Among the retrieval models proposed in past decades, generative retrieval models, especially those under the statistical probabilistic framework, are among the most popular techniques and have been widely applied to Information Retrieval problems. While they are known for their well-grounded theory and good empirical performance in text retrieval, their applications in IR are often limited by their complexity and limited extensibility in the modeling of high-dimensional information. Recently, advances in deep learning techniques provide new opportunities for representation learning and generative models for information retrieval. In contrast to statistical models, neural models have much more flexibility because they model information and data correlation in latent spaces without explicitly relying on any prior knowledge. Previous studies on pattern recognition and natural language processing have shown that semantically meaningful representations of text, images, and many other types of information can be acquired with neural models through supervised or unsupervised training. Nonetheless, the effectiveness of neural models for information retrieval is mostly unexplored. In this thesis, we study how to develop new generative models and representation learning frameworks with neural models for information retrieval. Specifically, our contributions include three main components: (1) Theoretical Analysis: we present the first theoretical analysis and adaptation of existing neural embedding models for ad-hoc retrieval tasks; (2) Design Practice: based on our experience and knowledge, we show how to design an embedding-based neural generative model for practical information retrieval tasks such as personalized product search; and (3) Generic Framework: we further generalize our proposed neural generative framework for complicated heterogeneous information retrieval scenarios that concern text, images, knowledge entities, and their relationships. Empirical results show that the proposed neural generative framework can effectively learn information representations and construct retrieval models that outperform state-of-the-art systems in a variety of IR tasks.
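As a minimal illustration of the embedding-based generative retrieval idea (for example, in personalized product search), the sketch below models p(item | user, query) as a softmax over inner products of latent vectors; the embeddings are random stand-ins for vectors that would in practice be trained on search logs, and the user-query combination rule is a simplification:

```python
# Toy embedding-based generative retrieval scoring.
import numpy as np

rng = np.random.default_rng(1)
dim, n_items = 8, 5
item_emb = rng.normal(size=(n_items, dim))
user_emb = rng.normal(size=dim)
query_emb = rng.normal(size=dim)

# Personalised query model: a simple combination of user and query.
q = 0.5 * user_emb + 0.5 * query_emb

logits = item_emb @ q
p_item = np.exp(logits - logits.max())
p_item /= p_item.sum()                  # p(item | user, query)
ranking = np.argsort(-p_item)
print(ranking, np.round(p_item[ranking], 3))
```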

