Erratum: Measuring and Improving Consistency in Pretrained Language Models

2021 ◽  
Vol 9 ◽  
pp. 1407-1407
Author(s):  
Yanai Elazar ◽  
Nora Kassner ◽  
Shauli Ravfogel ◽  
Abhilasha Ravichander ◽  
Eduard Hovy ◽  
...  

Abstract During production of this paper, an error was introduced to the formula on the bottom of the right column of page 1020. In the last two terms of the formula, the n and m subscripts were swapped. The correct formula is:

$$\mathcal{L}_c \;=\; \sum_{n=1}^{k}\sum_{m=n+1}^{k} D_{\mathrm{KL}}\!\left(Q_{n}^{r_i}\,\big\|\,Q_{m}^{r_i}\right) \;+\; D_{\mathrm{KL}}\!\left(Q_{m}^{r_i}\,\big\|\,Q_{n}^{r_i}\right)$$

The paper has been updated.
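The loss sums a symmetric KL divergence over all pairs of answer distributions for paraphrases of the same relation. A minimal sketch, using plain-Python probability lists rather than the paper's model outputs:

```python
import math

def kl(p, q):
    # D_KL(p || q) for discrete distributions (assumes q > 0 wherever p > 0)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def consistency_loss(distributions):
    # Pairwise symmetric-KL consistency loss over k distributions
    # Q_1 .. Q_k, one per paraphrase of the same relation
    k = len(distributions)
    return sum(kl(distributions[n], distributions[m]) +
               kl(distributions[m], distributions[n])
               for n in range(k) for m in range(n + 1, k))
```

The loss is zero exactly when all k distributions agree, and grows as the paraphrases' answer distributions diverge.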

2020 ◽  
Vol 8 ◽  
pp. 621-633
Author(s):  
Lifu Tu ◽  
Garima Lalwani ◽  
Spandana Gella ◽  
He He

Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset. Intrigued by these results, we find that the key to their success is generalization from a small amount of counterexamples where the spurious correlations do not hold. When such minority examples are scarce, pre-trained models perform as poorly as models trained from scratch. In the case of extreme minority, we propose to use multi-task learning (MTL) to improve generalization. Our experiments on natural language inference and paraphrase identification show that MTL with the right auxiliary tasks significantly improves performance on challenging examples without hurting the in-distribution performance. Further, we show that the gain from MTL mainly comes from improved generalization from the minority examples. Our results highlight the importance of data diversity for overcoming spurious correlations.


2014 ◽  
Vol 40 (1) ◽  
pp. 85-120 ◽  
Author(s):  
Fei Huang ◽  
Arun Ahuja ◽  
Doug Downey ◽  
Yi Yang ◽  
Yuhong Guo ◽  
...  

Finding the right representations for words is critical for building accurate NLP systems when domain-specific labeled data for the task is scarce. This article investigates novel techniques for extracting features from n-gram models, Hidden Markov Models, and other statistical language models, including a novel Partial Lattice Markov Random Field model. Experiments on part-of-speech tagging and information extraction, among other tasks, indicate that features taken from statistical language models, in combination with more traditional features, outperform traditional representations alone, and that graphical model representations outperform n-gram models, especially on sparse and polysemous words.


Author(s):  
Arne Zeschel

Abstract Construction-based language models assume that grammar is meaningful and learnable from experience. Focusing on five of the most elementary argument structure constructions of English, a large-scale corpus study of child-directed speech (CDS) investigates exactly which meanings/functions are associated with these patterns in CDS, and whether they are indeed specially indicated to children by their caretakers (as suggested by previous research, cf. Goldberg, Casenhiser and Sethuraman 2004). Collostructional analysis (Stefanowitsch and Gries 2003) is employed to uncover significantly attracted verb-construction combinations, and attracted pairs are classified semantically in order to systematise the attested usage patterns of the target constructions. The results indicate that the structure of the input may aid learners in making the right generalisations about constructional usage patterns, but such scaffolding is not strictly necessary for construction learning: not all argument structure constructions are coherently semanticised to the same extent (in the sense that they designate a single schematic event type of the kind envisioned in Goldberg’s [1995] ‘scene encoding hypothesis’), and they also differ in the extent to which individual semantic subtypes predominate in learners’ input.
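Collostructional analysis scores verb–construction attraction with a Fisher–Yates exact test on a 2×2 frequency table. A minimal sketch of the right-tail (attraction) p-value under the hypergeometric null; all counts and the function name are hypothetical, not taken from the study:

```python
from math import comb

def attraction_p(k, verb_total, cx_total, corpus_total):
    # Right-tail Fisher exact p-value: probability of observing k or more
    # tokens of the verb inside the construction under the hypergeometric
    # null, given verb_total occurrences of the verb, cx_total construction
    # slots, and corpus_total verb tokens overall.
    upper = min(verb_total, cx_total)
    return sum(comb(verb_total, x) * comb(corpus_total - verb_total, cx_total - x)
               for x in range(k, upper + 1)) / comb(corpus_total, cx_total)
```

Smaller p-values indicate stronger attraction of the verb to the construction; ranking verbs by this value yields the significantly attracted pairs the study classifies semantically.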


2021 ◽  
Vol 12 (1) ◽  
pp. 111
Author(s):  
Sia Gholami ◽  
Mehdi Noori

Open-book question answering is a subset of question answering (QA) tasks where the system aims to find answers in a given set of documents (open-book) and common knowledge about a topic. This article proposes a solution for answering natural language questions from a corpus of Amazon Web Services (AWS) technical documents with no domain-specific labeled data (zero-shot). These questions have a yes–no–none answer and a text answer which can be short (a few words) or long (a few sentences). We present a two-step, retriever–extractor architecture in which a retriever finds the right documents and an extractor finds the answers in the retrieved documents. To test our solution, we introduce a new dataset for open-book QA based on real customer questions on AWS technical documentation. In this paper, we conducted experiments on several information retrieval systems and extractive language models, attempting to find the yes–no–none answers and text answers in the same pass. Our custom-built extractor model is created from a pretrained language model and fine-tuned on the Stanford Question Answering Dataset (SQuAD) and Natural Questions datasets. We were able to achieve 42% F1 and 39% exact match score (EM) end-to-end with no domain-specific training.
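The two-step retriever–extractor flow can be sketched minimally. Everything below (lexical-overlap scoring, a sentence-level extractor) is an illustrative stand-in, not the authors' system, which uses real IR components and a fine-tuned extractive model:

```python
def retrieve(question, documents, top_k=3):
    # Toy lexical-overlap retriever: rank documents by shared question words
    # (a stand-in for BM25 or a dense retriever).
    q_words = set(question.lower().split())
    return sorted(documents,
                  key=lambda d: len(q_words & set(d.lower().split())),
                  reverse=True)[:top_k]

def extract(question, documents):
    # Toy extractor: return the sentence with the most question-word overlap.
    # A real system would run a fine-tuned extractive QA model here.
    q_words = set(question.lower().split())
    sentences = [s.strip() for d in documents for s in d.split(".") if s.strip()]
    return max(sentences, key=lambda s: len(q_words & set(s.lower().split())))

def answer(question, corpus):
    # Two-step pipeline: retrieve candidate documents, then extract a span.
    return extract(question, retrieve(question, corpus))
```

The split matters because the retriever narrows a large corpus to a handful of candidates, so the (expensive) extractor only reads the retrieved documents.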


Author(s):  
J. Anthony VanDuzer

Summary Recently, there has been a proliferation of international agreements imposing minimum standards on states in respect of their treatment of foreign investors and allowing investors to initiate dispute settlement proceedings where a state violates these standards. Of greatest significance to Canada is Chapter 11 of the North American Free Trade Agreement, which provides both standards for state behaviour and the right to initiate binding arbitration. Since 1996, four cases have been brought under Chapter 11. This note describes the Chapter 11 process and suggests some of the issues that may arise as it is increasingly resorted to by investors.


2019 ◽  
Vol 42 ◽  
Author(s):  
Guido Gainotti

Abstract The target article carefully describes the memory system, centered on the temporal lobe that builds specific memory traces. It does not, however, mention the laterality effects that exist within this system. This commentary briefly surveys evidence showing that clear asymmetries exist within the temporal lobe structures subserving the core system and that the right temporal structures mainly underpin face familiarity feelings.


Author(s):  
J. Taftø

It is well known that for reflections corresponding to large interplanar spacings (i.e., sin θ/λ small), the electron scattering amplitude, f, is sensitive to the ionicity and to the charge distribution around the atoms. We have used this in order to obtain information about the charge distribution in FeTi, which is a candidate for storage of hydrogen. Our goal is to study the changes in electron distribution in the presence of hydrogen, and also the ionicity of hydrogen in metals, but so far our study has been limited to pure FeTi. FeTi has the CsCl structure and thus Fe and Ti scatter with a phase difference of π into the 100-reflections. Because Fe (Z = 26) is higher in the periodic system than Ti (Z = 22), an immediate “guess” would be that Fe has a larger scattering amplitude than Ti. However, relativistic Hartree-Fock calculations show that the opposite is the case for the 100-reflection. An explanation for this may be sought in the stronger localization of the d-electrons of the first row transition elements when moving to the right in the periodic table. The tabulated difference between fTi(100) and fFe(100) is small, however, and based on the values of the scattering amplitude for isolated atoms, the kinematical intensity of the 100-reflection is only 5×10⁻⁴ of the intensity of the 200-reflection.
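The quoted intensity ratio follows from the kinematical structure factors of the CsCl lattice (one atom type at (0,0,0), the other at (1/2,1/2,1/2)); as a consistency check on the numbers in the text:

```latex
F(hkl) = f_{\mathrm{Fe}} + f_{\mathrm{Ti}}\, e^{i\pi(h+k+l)}
\quad\Rightarrow\quad
F(100) = f_{\mathrm{Fe}} - f_{\mathrm{Ti}}, \qquad
F(200) = f_{\mathrm{Fe}} + f_{\mathrm{Ti}}

\frac{I(100)}{I(200)}
= \left(\frac{f_{\mathrm{Ti}} - f_{\mathrm{Fe}}}{f_{\mathrm{Ti}} + f_{\mathrm{Fe}}}\right)^{2}
\approx 5\times 10^{-4}
\quad\Longrightarrow\quad
\frac{\lvert f_{\mathrm{Ti}} - f_{\mathrm{Fe}}\rvert}{f_{\mathrm{Ti}} + f_{\mathrm{Fe}}}
\approx 0.022
```

So a roughly 2% relative difference between the two scattering amplitudes is enough to account for the very weak 100-reflection, which is why it is such a sensitive probe of the charge distribution.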


Author(s):  
Russell L. Steere ◽  
Michael Moseley

A redesigned specimen holder and cap have made possible the freeze-etching of both fracture surfaces of a frozen fractured specimen. In principle, the procedure involves freezing a specimen between two specimen holders (as shown in A, Fig. 1, and the left side of Fig. 2). The aluminum specimen holders and brass cap are constructed so that the upper specimen holder can be forced loose, turned over, and pressed down firmly against the specimen stage to a position represented by B, Fig. 1, and the right side of Fig. 2.

