Controlling hallucinations at word level in data-to-text generation

Author(s):  
Clement Rebuffel ◽  
Marco Roberti ◽  
Laure Soulier ◽  
Geoffrey Scoutheeten ◽  
Rossella Cancelliere ◽  
...  

Abstract: Data-to-Text Generation (DTG) is a subfield of Natural Language Generation that aims to transcribe structured data into natural language descriptions. The field has recently been boosted by neural generators which, on the one hand, exhibit great syntactic skill without the need for hand-crafted pipelines; on the other hand, the quality of the generated text reflects the quality of the training data, which in realistic settings offer only imperfectly aligned structure-text pairs. Consequently, state-of-the-art neural models include misleading statements, usually called hallucinations, in their outputs. Controlling this phenomenon is today a major challenge for DTG, and is the problem addressed in this paper. Previous work deals with this issue at the instance level, using an alignment score for each table-reference pair. In contrast, we propose a finer-grained approach, arguing that hallucinations should rather be treated at the word level. Specifically, we propose a Multi-Branch Decoder which is able to leverage word-level labels to learn the relevant parts of each training instance. These labels are obtained following a simple and efficient scoring procedure based on co-occurrence analysis and dependency parsing. Extensive evaluations, via automated metrics and human judgment on the standard WikiBio benchmark, show the accuracy of our alignment labels and the effectiveness of the proposed Multi-Branch Decoder. Our model is able to reduce and control hallucinations, while keeping fluency and coherence in generated texts. Further experiments on a degraded version of ToTTo show that our model could be successfully used in very noisy settings.
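To make the idea of word-level labels concrete, the following is a toy sketch of co-occurrence scoring. It is not the authors' procedure: the abstract does not give the scoring formula, and the dependency-parsing step is omitted entirely. The function names, the `threshold` value, and the three-instance corpus are all hypothetical. A word is treated as supported if it copies a table value, or if it frequently appears in texts whose tables contain one of the same fields.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (table, reference text) pairs.
CORPUS = [
    ({"name": "john doe", "birth_date": "1950"}, "john doe was born in 1950"),
    ({"name": "jane roe", "birth_date": "1962"}, "jane roe was born in 1962 in paris"),
    ({"name": "ann lee", "birth_date": "1971"}, "ann lee was born in 1971"),
]

def field_word_stats(corpus):
    """Count, per field name, how many instances contain each text word."""
    field_freq = Counter()
    joint = defaultdict(Counter)
    for table, text in corpus:
        words = set(text.split())  # count each word once per instance
        for field in table:
            field_freq[field] += 1
            joint[field].update(words)
    return field_freq, joint

def score_words(table, text, field_freq, joint):
    """Score each word in [0, 1]; higher means better supported by the table."""
    value_tokens = {tok for value in table.values() for tok in value.split()}
    scores = []
    for word in text.split():
        if word in value_tokens:
            scores.append(1.0)  # word is copied directly from a table value
        else:
            # Best estimate of P(word appears | field present) over the table's fields.
            scores.append(max(
                (joint[f][word] / field_freq[f] for f in table if field_freq[f]),
                default=0.0,
            ))
    return scores

def flag_hallucinations(table, text, field_freq, joint, threshold=0.5):
    """Return the words whose support score falls below the (assumed) threshold."""
    scores = score_words(table, text, field_freq, joint)
    return [w for w, s in zip(text.split(), scores) if s < threshold]

if __name__ == "__main__":
    field_freq, joint = field_word_stats(CORPUS)
    table, text = CORPUS[1]
    print(flag_hallucinations(table, text, field_freq, joint))
```

In this toy corpus, "paris" is flagged in the second reference because no field of its table supports it, while function words such as "was" and "born" are kept because they co-occur with the `birth_date` field in every instance.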

Author(s):  
Ziran Li ◽  
Zibo Lin ◽  
Ning Ding ◽  
Hai-Tao Zheng ◽  
Ying Shen

Generating a textual description from a set of RDF triples is a challenging task in natural language generation. Neural methods, which often generate sentences from scratch, have recently become the mainstream approach to this task. However, due to the huge gap between the structured input and the unstructured output, the input triples alone are insufficient to determine an expressive and specific description. In this paper, we propose a novel anchor-to-prototype framework to bridge the gap between structured RDF triples and natural text. The model retrieves a set of prototype descriptions from the training data and extracts writing patterns from them to guide the generation process. Furthermore, to make more precise use of the retrieved prototypes, we employ a triple anchor that aligns the input triples into groups so as to better match the prototypes. Experimental results on both English and Chinese datasets show that our method significantly outperforms the state-of-the-art baselines in terms of both automatic and manual evaluation, demonstrating the benefit of learning guidance from retrieved prototypes to facilitate triple-to-text generation.


Author(s):  
Ke Wang ◽  
Xiaojun Wan

Generating texts with different sentiment labels is attracting increasing attention in the area of natural language generation. Recently, Generative Adversarial Nets (GANs) have shown promising results in text generation. However, the texts generated by GANs usually suffer from poor quality, lack of diversity, and mode collapse. In this paper, we propose a novel framework, SentiGAN, which has multiple generators and one multi-class discriminator, to address the above problems. In our framework, multiple generators are trained simultaneously, aiming at generating texts of different sentiment labels without supervision. We propose a penalty-based objective in the generators to force each of them to generate diversified examples of a specific sentiment label. Moreover, the use of multiple generators and one multi-class discriminator can make each generator focus on accurately generating its own examples of a specific sentiment label. Experimental results on four datasets demonstrate that our model consistently outperforms several state-of-the-art text generation methods in sentiment accuracy and quality of generated texts.


Information ◽  
2020 ◽  
Vol 11 (11) ◽  
pp. 511
Author(s):  
Karlo Babić ◽  
Sanda Martinčić-Ipšić ◽  
Ana Meštrović

In natural language processing, text needs to be transformed into a machine-readable representation before any processing. The quality of further natural language processing tasks greatly depends on the quality of those representations. In this survey, we systematize and analyze 50 neural models from the last decade. The models described are grouped by the architecture of neural networks as shallow, recurrent, recursive, convolutional, and attention models. Furthermore, we categorize these models by representation level, input level, model type, and model supervision. We focus on task-independent representation models, discuss their advantages and drawbacks, and subsequently identify the promising directions for future neural text representation models. We describe the evaluation datasets and tasks used in the papers that introduced the models and compare the models based on relevant evaluations. The quality of a representation model can be evaluated as its capability to generalize to multiple unrelated tasks. Benchmark standardization is visible amongst recent models and the number of different tasks models are evaluated on is increasing.


Author(s):  
Russell L. Steere ◽  
Eric F. Erbe ◽  
J. Michael Moseley

We have designed and built an electronic device which compares the resistance of a defined area of vacuum-evaporated material with a variable resistor. When the two resistances are matched, the device automatically disconnects the primary side of the substrate transformer and stops further evaporation. This approach to controlled evaporation, in conjunction with the modified guns and evaporation source, permits reliably reproducible multiple Pt shadow films from a single Pt-wrapped carbon point source. The reproducibility from consecutive C point sources is also reliable. Furthermore, the device we have developed permits us to select a predetermined resistance so that low-contrast high-resolution shadows, heavy high-contrast shadows, or any grade in between can be selected at will. The reproducibility and quality of results are demonstrated in Figures 1-4, which represent evaporations at various settings of the variable resistor.


Author(s):  
Margaret Jane Radin

Boilerplate—the fine-print terms and conditions that we become subject to when we click “I agree” online, rent an apartment, or enter an employment contract, for example—pervades all aspects of our modern lives. On a daily basis, most of us accept boilerplate provisions without realizing that should a dispute arise about a purchased good or service, the nonnegotiable boilerplate terms can deprive us of our right to jury trial and relieve providers of responsibility for harm. Boilerplate is the first comprehensive treatment of the problems posed by the increasing use of these terms, demonstrating how their use has degraded traditional notions of consent, agreement, and contract, and sacrificed core rights whose loss threatens the democratic order. This book examines attempts to justify the use of boilerplate provisions by claiming either that recipients freely consent to them or that economic efficiency demands them, and it finds these justifications wanting. It argues that our courts, legislatures, and regulatory agencies have fallen short in their evaluation and oversight of the use of boilerplate clauses. To improve legal evaluation of boilerplate, the book offers a new analytical framework, one that takes into account the nature of the rights affected, the quality of the recipient's consent, and the extent of the use of these terms. It goes on to offer possibilities for new methods of boilerplate evaluation and control, and concludes by discussing positive steps that NGOs, legislators, regulators, courts, and scholars could take to bring about better practices.


Author(s):  
V. V. Agafonov ◽  
V. Yu. Zalyadinov ◽  
M. E. Yusupov ◽  
N. S. Bikteeva

Sustainability of mining companies is of high concern. The problem is especially acute at companies that are monotown- or monosettlement-forming. Sustainability of a mine depends in many ways on product quality and production resource-intensity. This article discusses the formation of mineral quality indexes for an open pit chrysotile mine. The studies took into account specific features of the operation procedures implemented by each structural division of the mine. The analysis found managerial and technological inconsistencies which affect quality and marketable product output, as well as the efficiency of the mine as a whole. The basis for efficiency enhancement at a company is, in the authors' opinion, consolidation of personnel around a single development strategy, namely: improvement of production and control efficiency, as well as use of available reserves and resources by means of better organization of production. The proposed approaches to planning mining operations and forming mineral quality allow a higher quality of processing stock. In addition, a new model proposed for interaction between structural divisions of a mining company ensures improvement of general production indexes.


2020 ◽  
Vol 9 (2) ◽  
Author(s):  
Vũ Xuân Hùng

In the teaching process, technical teaching facilities are both content and a means of conveying information. They help the lecturer organize and control students' cognitive activities, and they also help students take an interest in learning and practice practical skills, from which active and creative learning methods are formed. Teaching technology is one of the necessary conditions that help teachers carry out their work of educating, teaching, and nurturing students, and of developing and awakening students' inherent intellectual qualities. Currently, the management of technical teaching facilities at the Central Kindergartens College is carried out on a regular basis and has achieved certain results, but many inadequacies remain. Identifying limitations in the management of technical teaching facilities, and thereby proposing solutions to overcome them, improve the efficiency of investment in, preservation of, and use of these facilities in the context of Industry Revolution 4.0, and improve the quality of teaching at Central Kindergarten Pedagogy colleges, is a very important and urgent task in the current period.


2019 ◽  
Author(s):  
Nur Tsalits Fahman Mughni

Teaching materials that integrate local culture make it easier for students to understand the subject matter in the learning process. The aim of the study is to measure the effectiveness of teaching materials based on the local wisdom of agriculture in Binjai in improving students' problem-solving abilities. The research method was quasi-experimental, using a non-equivalent control group in a pretest-posttest design. The sample consisted of grade X senior high school students in Binjai, divided into an experimental group, which used teaching materials based on the local wisdom of agriculture in Binjai, and a control group, which used student handbooks. The teaching materials were tested by material experts and technology experts to ensure their quality. Data were collected through tests. The results showed that the teaching materials based on the local wisdom of agriculture in Binjai were effective in improving the problem-solving abilities of students in the experimental group, with an N-gain value of 0.67, which falls in the medium category. This means that teaching materials based on the local wisdom of agriculture in Binjai can be used as one of the teaching materials in learning activities.
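The abstract does not say how the N-gain value was computed. A minimal sketch, assuming the commonly used normalized gain (post minus pre, divided by maximum score minus pre) and the usual cut-offs of 0.3 and 0.7 for the low, medium, and high categories, is given below. The function names and the pretest and posttest scores are hypothetical and are chosen only so that the result lands near 0.67; they are not data from the study.

```python
def normalized_gain(pre, post, max_score=100.0):
    """Normalized gain: fraction of the possible improvement actually achieved."""
    if pre >= max_score:
        raise ValueError("pretest score must be below the maximum score")
    return (post - pre) / (max_score - pre)

def gain_category(g):
    """Classify a normalized gain using the assumed 0.3 / 0.7 cut-offs."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"

if __name__ == "__main__":
    # Hypothetical class averages, for illustration only.
    g = normalized_gain(pre=40.0, post=80.2)
    print(round(g, 2), gain_category(g))
```

Under these assumptions, a value of 0.67 sits just below the 0.7 boundary, which is consistent with the medium category reported in the abstract.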

