An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models

2020 ◽  
Vol 8 ◽  
pp. 621-633
Author(s):  
Lifu Tu ◽  
Garima Lalwani ◽  
Spandana Gella ◽  
He He

Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset. Intrigued by these results, we find that the key to their success is generalization from a small number of counterexamples where the spurious correlations do not hold. When such minority examples are scarce, pre-trained models perform as poorly as models trained from scratch. In the case of extreme minority, we propose to use multi-task learning (MTL) to improve generalization. Our experiments on natural language inference and paraphrase identification show that MTL with the right auxiliary tasks significantly improves performance on challenging examples without hurting the in-distribution performance. Further, we show that the gain from MTL mainly comes from improved generalization from the minority examples. Our results highlight the importance of data diversity for overcoming spurious correlations.
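
The distinction between majority examples (where the spurious correlation holds) and minority examples (where it does not) can be made concrete with a small evaluation sketch. The field names `label` and `spurious_label`, the toy data, and the `shortcut_model` predictor are assumptions made for illustration; they are not taken from the paper.

```python
def group_accuracy(examples, predict):
    """Report accuracy separately for majority and minority examples.

    An example counts as 'majority' when the label suggested by the
    spurious feature agrees with the true label, and 'minority' otherwise.
    """
    totals = {"majority": [0, 0], "minority": [0, 0]}  # [correct, seen]
    for ex in examples:
        group = "majority" if ex["spurious_label"] == ex["label"] else "minority"
        totals[group][0] += int(predict(ex) == ex["label"])
        totals[group][1] += 1
    return {g: (c / n if n else None) for g, (c, n) in totals.items()}


def shortcut_model(ex):
    # Hypothetical model that relies only on the spurious feature.
    return ex["spurious_label"]


# Toy data (invented): three majority examples and one minority example.
toy_examples = [
    {"label": 1, "spurious_label": 1},
    {"label": 0, "spurious_label": 0},
    {"label": 1, "spurious_label": 1},
    {"label": 0, "spurious_label": 1},  # counterexample to the shortcut
]

scores = group_accuracy(toy_examples, shortcut_model)
print(scores)
```

A model that only follows the shortcut looks perfect on the majority group and fails on the minority group, which is why overall accuracy alone can hide the problem.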

2020 ◽  
Vol 46 (2) ◽  
pp. 487-497 ◽  
Author(s):  
Malvina Nissim ◽  
Rik van Noord ◽  
Rob van der Goot

Analogies such as man is to king as woman is to X are often used to illustrate the amazing power of word embeddings. Concurrently, they have also been used to expose how strongly human biases are encoded in vector spaces trained on natural language, with examples like man is to computer programmer as woman is to homemaker. Recent work has shown that analogies are in fact not an accurate diagnostic for bias, but this does not mean that they are not used anymore, or that their legacy is fading. Instead of focusing on the intrinsic problems of the analogy task as a bias detection tool, we discuss a series of issues involving implementation as well as subjective choices that might have yielded a distorted picture of bias in word embeddings. We stand by the truth that human biases are present in word embeddings, and, of course, the need to address them. But analogies are not an accurate tool to do so, and the way they have been most often used has exacerbated some possibly non-existing biases and perhaps hidden others. Because they are still widely popular, and some of them have become classics within and outside the NLP community, we deem it important to provide a series of clarifications that should put well-known, and potentially new analogies, into the right perspective.
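
To illustrate how an implementation choice can shape an analogy result, the sketch below answers "a is to b as c is to X" with the common vector-offset method (b - a + c, then nearest neighbour by cosine similarity). The three-dimensional vectors are invented for this example, and the `exclude_inputs` switch is one plausible implementation choice assumed here, not necessarily the specific issue the authors analyse.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(emb, a, b, c, exclude_inputs=True):
    """Return the word closest to emb[b] - emb[a] + emb[c]."""
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = [w for w in emb if not (exclude_inputs and w in (a, b, c))]
    return max(candidates, key=lambda w: cosine(emb[w], target))


# Toy embeddings (invented values, not from any trained model).
toy_emb = {
    "man": [0.2, 1.0, 0.0],
    "woman": [-0.2, 1.0, 0.0],
    "doctor": [0.1, 0.0, 1.0],
    "nurse": [-0.9, 0.0, 1.0],
    "teacher": [0.0, 0.5, 0.5],
}

unconstrained = analogy(toy_emb, "man", "doctor", "woman", exclude_inputs=False)
constrained = analogy(toy_emb, "man", "doctor", "woman", exclude_inputs=True)
print(unconstrained, constrained)
```

With these toy vectors the unconstrained query returns one of its own input words, while the constrained query is forced to return a different word. The same embedding space can therefore yield a very different headline analogy depending on that one switch.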


Electronics ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 845
Author(s):  
Danbi Cho ◽  
Hyunyoung Lee ◽  
Seungshik Kang

How the token unit is defined in a sentence matters in natural language processing tasks such as text classification, machine translation, and generation. Many recent studies have used subword tokenization in language models such as BERT, KoBERT, and ALBERT. Although these language models achieved state-of-the-art results in various NLP tasks, it is not clear whether the subword is the best token unit for Korean sentence embedding. Thus, we carried out sentence embedding based on word, morpheme, subword, and submorpheme units, respectively, on Korean sentiment analysis. We explored two sentence representation methods for sentence embedding: one that considers the order of tokens in a sentence and one that does not. By feeding a sentence, decomposed by token unit, to the two sentence representation methods, we construct sentence embeddings with various tokenizations to find the most effective token unit for Korean sentence embedding. In our work, we confirmed the robustness of the subword unit to out-of-vocabulary (OOV) problems compared to other token units, the disadvantage of replacing whitespace with a particular symbol in the sentiment analysis task, and that the optimal vocabulary size is 16K in subword and submorpheme tokenization. We found empirically that the subword unit, tokenized with a vocabulary size of 16K and without replacement of whitespace, was the most effective for sentence embedding on the Korean sentiment analysis task.
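
The OOV robustness of subword units can be sketched with a toy comparison. The greedy longest-match splitter below is a simple stand-in chosen for illustration (the abstract does not state which subword algorithm was used), and the English vocabularies are invented rather than Korean.

```python
def word_tokenize(sentence, vocab):
    """Word-level tokens; any unseen word collapses to [UNK]."""
    return [w if w in vocab else "[UNK]" for w in sentence.split()]


def subword_tokenize(sentence, vocab):
    """Greedy longest-match subword split of each whitespace-separated word."""
    tokens = []
    for word in sentence.split():
        start = 0
        while start < len(word):
            end = len(word)
            # Shrink the candidate piece until it is in the vocabulary.
            while end > start and word[start:end] not in vocab:
                end -= 1
            if end == start:
                # No known piece starts here: emit [UNK] for one character.
                tokens.append("[UNK]")
                start += 1
            else:
                tokens.append(word[start:end])
                start = end
    return tokens


# Hypothetical vocabularies for illustration only.
word_vocab = {"good", "movie"}
subword_vocab = {"un", "watch", "able", "good", "movie"}

sentence = "unwatchable movie"
word_tokens = word_tokenize(sentence, word_vocab)
subword_tokens = subword_tokenize(sentence, subword_vocab)
print(word_tokens)
print(subword_tokens)
```

The word-level tokenizer loses the unseen word entirely, while the subword tokenizer still recovers pieces that carry sentiment-relevant information.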


Author(s):  
Claire Voisin

This book provides an introduction to algebraic cycles on complex algebraic varieties, to the major conjectures relating them to cohomology, and even more precisely to Hodge structures on cohomology. The book is intended for both students and researchers, and not only presents a survey of the geometric methods developed in the last thirty years to understand the famous Bloch-Beilinson conjectures, but also examines recent work by the author. It focuses on two central objects: the diagonal of a variety—and the partial Bloch-Srinivas type decompositions it may have depending on the size of Chow groups—as well as its small diagonal, which is the right object to consider in order to understand the ring structure on Chow groups and cohomology. An exploration of a sampling of recent works by the author looks at the relation, conjectured in general by Bloch and Beilinson, between the coniveau of general complete intersections and their Chow groups and a very particular property satisfied by the Chow ring of K3 surfaces and conjecturally by hyper-Kähler manifolds. In particular, the book delves into arguments originating in Nori's work that have been further developed by others.


Author(s):  
Rohan Pandey ◽  
Vaibhav Gautam ◽  
Ridam Pal ◽  
Harsh Bandhey ◽  
Lovedeep Singh Dhingra ◽  
...  

BACKGROUND The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this ‘Infodemic’ requires strong health messaging systems that are engaging, vernacular, scalable, and effective, and that continuously learn the new patterns of misinformation. OBJECTIVE We created WashKaro, a multi-pronged intervention for mitigating misinformation through conversational AI, machine translation and natural language processing. WashKaro provides the right information matched against WHO guidelines through AI, and delivers it in the right format in local languages. METHODS We theorize (i) an NLP based AI engine that could continuously incorporate user feedback to improve relevance of information, (ii) bite sized audio in the local language to improve penetrance in a country with skewed gender literacy ratios, and (iii) conversational but interactive AI engagement with users towards an increased health awareness in the community. RESULTS A total of 5026 people downloaded the app during the study window; of those, 1545 were active users. Our study shows that 3.4 times more females engaged with the App in Hindi as compared to males, the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and the prudence of integrated AI chatbot “Satya” increased, thus proving the usefulness of an mHealth platform to mitigate health misinformation. CONCLUSIONS We conclude that a multi-pronged machine learning application delivering vernacular bite-sized audios and conversational AI is an effective approach to mitigate health misinformation. CLINICALTRIAL Not Applicable


2016 ◽  
Vol 1 (1) ◽  
pp. 50-53 ◽  
Author(s):  
Varun Sharma ◽  
Narpat Singh

In recent research, handwritten signature verification has become a suitable field for detecting valid signatures in different settings, such as online and offline signatures. Earlier work reports that many unauthorized persons forge signatures and illegally steal data from organizations or industries. We therefore need to identify the right person on the basis of various detectable parameters. In this paper, we propose two methods, LDA and a neural network, for offline signature verification from scanned signature images. We focus on a comparative analysis in terms of FRR, SSIM, MSE, and PSNR, comparing these parameters between earlier and recent work. Our proposed work is more effective than existing work and provides suitable results. Our method will help identify the legal signatures of authorized users for security purposes and prevent illegal activity.
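
Three of the metrics named above have standard textbook definitions, sketched here for flat lists of pixel intensities and a list of verification outcomes. The function names and the 8-bit maximum of 255 are assumptions for illustration; the abstract does not describe the authors' exact computation, and SSIM is omitted because it needs windowed statistics.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two equal-length pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)


def psnr(image_a, image_b, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(image_a, image_b)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


def false_rejection_rate(genuine_accepted):
    """Fraction of genuine signatures that the verifier rejected.

    genuine_accepted holds one boolean per genuine attempt
    (True = accepted, False = wrongly rejected).
    """
    rejected = sum(1 for accepted in genuine_accepted if not accepted)
    return rejected / len(genuine_accepted)


print(mse([0, 0], [0, 2]))
print(psnr([255], [0]))
print(false_rejection_rate([True, True, False, True]))
```

Lower MSE and FRR are better, while higher PSNR is better, so comparisons across methods need to keep the direction of each metric in mind.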


Archaeologia ◽  
1844 ◽  
Vol 30 ◽  
pp. 138-143
Author(s):  
Evan Nepean

The public curiosity has been much roused by Stevens's recent work on Central America, as well as by the late visit of Mr. Walker and Captain Caddy of the Artillery, to Palenque, and other ancient cities in that quarter. I have the honour to acquaint your Lordship, as President of the Society of Antiquaries, that, having been lately on service in the gulph of Mexico, whilst laying off the island of Sacrificios, I caused several excavations to be made there, and succeeded in digging up various articles of pottery, idols, and musical instruments; amongst other specimens, are three or four types, or signets, with hieroglyphics, which may perhaps throw some light on the origin of the Mexicans, or the still more ancient race that preceded them.


Author(s):  
Santiago Zanella-Béguelin ◽  
Lukas Wutschitz ◽  
Shruti Tople ◽  
Victor Rühle ◽  
Andrew Paverd ◽  
...  

Science ◽  
2021 ◽  
Vol 371 (6526) ◽  
pp. 284-288 ◽  
Author(s):  
Brian Hie ◽  
Ellen D. Zhong ◽  
Bonnie Berger ◽  
Bryan Bryson

The ability for viruses to mutate and evade the human immune system and cause infection, called viral escape, remains an obstacle to antiviral and vaccine development. Understanding the complex rules that govern escape could inform therapeutic design. We modeled viral escape with machine learning algorithms originally developed for human natural language. We identified escape mutations as those that preserve viral infectivity but cause a virus to look different to the immune system, akin to word changes that preserve a sentence’s grammaticality but change its meaning. With this approach, language models of influenza hemagglutinin, HIV-1 envelope glycoprotein (HIV Env), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Spike viral proteins can accurately predict structural escape patterns using sequence data alone. Our study represents a promising conceptual bridge between natural language and viral evolution.
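
The idea that escape mutations combine high semantic change with preserved grammaticality can be sketched as a ranking problem. The rank-sum combination, the `beta` weight, the mutation names, and all scores below are assumptions for illustration; in practice the two scores would come from a trained language model, and the abstract does not specify the authors' scoring rule.

```python
def rank(values):
    """Return 1-based ranks, where the smallest value gets rank 1."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def escape_priority(mutations, semantic_change, grammaticality, beta=1.0):
    """Order mutations so that high semantic change and high
    grammaticality together put a mutation near the front."""
    change_ranks = rank(semantic_change)
    grammar_ranks = rank(grammaticality)
    scores = {
        m: c + beta * g
        for m, c, g in zip(mutations, change_ranks, grammar_ranks)
    }
    return sorted(mutations, key=lambda m: scores[m], reverse=True)


# Hypothetical mutations and scores (invented numbers).
mutations = ["A1B", "C2D", "E3F"]
semantic_change = [0.9, 0.1, 0.8]   # how different the variant "looks"
grammaticality = [0.7, 0.9, 0.05]   # how plausible or viable the variant is

ordering = escape_priority(mutations, semantic_change, grammaticality)
print(ordering)
```

In this toy example the mutation with large semantic change but very low grammaticality falls to the bottom, mirroring the intuition that a mutation which breaks the virus cannot be an escape mutation.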


2021 ◽  
Vol 11 (7) ◽  
pp. 3095
Author(s):  
Suhyune Son ◽  
Seonjeong Hwang ◽  
Sohyeun Bae ◽  
Soo Jun Park ◽  
Jang-Hwan Choi

Multi-task learning (MTL) approaches are actively used for various natural language processing (NLP) tasks. The Multi-Task Deep Neural Network (MT-DNN) has contributed significantly to improving the performance of natural language understanding (NLU) tasks. However, one drawback is that confusion about the language representation of various tasks arises during the training of the MT-DNN model. Inspired by the internal-transfer weighting of MTL in medical imaging, we introduce a Sequential and Intensive Weighted Language Modeling (SIWLM) scheme. The SIWLM consists of two stages: (1) Sequential weighted learning (SWL), which trains a model to learn entire tasks sequentially and concentrically, and (2) Intensive weighted learning (IWL), which enables the model to focus on the central task. We apply this scheme to the MT-DNN model and call this model the MTDNN-SIWLM. Our model achieves higher performance than the existing reference algorithms on six out of the eight GLUE benchmark tasks. Moreover, our model outperforms MT-DNN by 0.77 on average on the overall task. Finally, we conducted a thorough empirical investigation to determine the optimal weight for each GLUE task.
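
Task weighting in multi-task learning can be sketched as a weighted sum of per-task losses, with the central task given a larger weight than the others. The helper names, weight values, and loss numbers are assumptions for illustration and do not reproduce the SIWLM schedule, whose details the abstract does not give.

```python
def weighted_mtl_loss(task_losses, weights):
    """Combine per-task losses into one scalar using per-task weights."""
    return sum(weights[task] * loss for task, loss in task_losses.items())


def intensive_weights(tasks, central_task, central_weight=1.0, other_weight=0.5):
    """Hypothetical weighting that emphasises a single central task."""
    return {t: (central_weight if t == central_task else other_weight) for t in tasks}


# Invented per-task losses for one training step.
losses = {"mnli": 0.5, "qqp": 0.25, "rte": 1.0}

uniform = {task: 1.0 for task in losses}
focused = intensive_weights(losses, central_task="rte")

print(weighted_mtl_loss(losses, uniform))
print(weighted_mtl_loss(losses, focused))
```

Down-weighting the auxiliary tasks reduces their contribution to the update, which is one simple way to let a model concentrate on the central task while still sharing representations.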


2017 ◽  
Vol 54 (4) ◽  
pp. 475-488
Author(s):  
MATTHEW McKEEVER

In this article, I argue that recent work in analytic philosophy on the semantics of names and the metaphysics of persistence supports two theses in Buddhist philosophy, namely the impermanence of objects and a corollary about how referential language works. According to this latter package of views, the various parts of what we call one object (say, King Milinda) possess no unity in and of themselves. Unity comes rather from language, in that we have terms (say, ‘King Milinda’) which stand for all the parts taken together. Objects are mind- (or rather language-)generated fictions. I think this package can be cashed out in terms of two central contemporary views. The first is that there are temporal parts: just as an object is spatially extended by having spatial parts at different spatial locations, so it is temporally extended by having temporal parts at different temporal locations. The second is that names are predicates: rather than standing for any one thing, a name stands for a range of things. The natural language term ‘Milinda’ is not akin to a logical constant, but akin to a predicate. Putting this together, I'll argue that names are predicates with temporal parts in their extension, which parts have no unity apart from falling under the same predicate. ‘Milinda’ is a predicate which has in its extension all Milinda's parts. The result is an interesting and original synthesis of plausible positions in semantics and metaphysics, which makes good sense of a central Buddhist doctrine.

