Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks

Author(s):  
Trapit Bansal ◽  
Rishikesh Jha ◽  
Andrew McCallum
2020 ◽  
Author(s):  
Trapit Bansal ◽  
Rishikesh Jha ◽  
Tsendsuren Munkhdalai ◽  
Andrew McCallum

2019 ◽  
Vol 26 (11) ◽  
pp. 1272-1278 ◽  
Author(s):  
Dmitriy Dligach ◽  
Majid Afshar ◽  
Timothy Miller

Abstract Objective Our objective is to develop algorithms for encoding clinical text into representations that can be used for a variety of phenotyping tasks. Materials and Methods Obtaining large datasets to take advantage of highly expressive deep learning methods is difficult in clinical natural language processing (NLP). We address this difficulty by pretraining a clinical text encoder on billing code data, which is typically available in abundance. We explore several neural encoder architectures and deploy the text representations obtained from these encoders in the context of clinical text classification tasks. While our ultimate goal is learning a universal clinical text encoder, we also experiment with training a phenotype-specific encoder. A universal encoder would be more practical, but a phenotype-specific encoder could perform better for a specific task. Results We successfully train several clinical text encoders, establish a new state-of-the-art on comorbidity data, and observe good performance gains on substance misuse data. Discussion We find that pretraining using billing codes is a promising research direction. The representations generated by this type of pretraining have universal properties, as they are highly beneficial for many phenotyping tasks. Phenotype-specific pretraining is a viable route for trading the generality of the pretrained encoder for better performance on a specific phenotyping task. Conclusions We successfully applied our approach to many phenotyping tasks. We conclude by discussing potential limitations of our approach.
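The pretrain-then-transfer recipe this abstract describes (pretrain a text encoder to predict billing codes, then reuse its representations for phenotype classification) can be sketched end to end on synthetic data. The sketch below is an illustrative assumption, not the paper's actual architecture: notes are toy bag-of-words vectors, the "encoder" is a single tanh layer, and the phenotype is derived from the codes themselves.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# Toy corpus: 200 notes as bag-of-words over 50 terms, each note
# tagged with a subset of 10 billing codes (purely synthetic).
X = rng.random((200, 50))
Y_codes = (X @ rng.normal(size=(50, 10)) > 2.5).astype(float)

# Encoder: one tanh layer, a stand-in for the neural encoders the
# paper explores; the pretraining head predicts billing codes.
W_enc = rng.normal(scale=0.1, size=(50, 16))
W_head = rng.normal(scale=0.1, size=(16, 10))

lr = 0.1
for _ in range(300):                                 # multi-label pretraining
    H = np.tanh(X @ W_enc)                           # note representations
    G = (sigmoid(H @ W_head) - Y_codes) / len(X)     # BCE gradient wrt logits
    g_head = H.T @ G
    g_enc = X.T @ ((G @ W_head.T) * (1.0 - H ** 2))  # backprop through tanh
    W_head -= lr * g_head
    W_enc -= lr * g_enc

# Downstream phenotyping: freeze the encoder, fit a small logistic
# classifier (with a bias column) on the pretrained representations.
H = np.tanh(X @ W_enc)
Hb = np.hstack([H, np.ones((len(H), 1))])
y = (Y_codes[:, :2].sum(axis=1) > 0).astype(float)   # toy phenotype label
w = np.zeros(Hb.shape[1])
for _ in range(300):
    w -= lr * Hb.T @ (sigmoid(Hb @ w) - y) / len(Hb)

acc = float(((sigmoid(Hb @ w) > 0.5) == (y > 0.5)).mean())
```

A phenotype-specific encoder, in this picture, would continue updating `W_enc` on the downstream task instead of freezing it, trading generality for task fit.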


2021 ◽  
Author(s):  
Hojae Han ◽  
Seungtaek Choi ◽  
Myeongho Jeong ◽  
Jin-woo Park ◽  
Seung-won Hwang

Author(s):  
Santhi Selvaraj ◽  
Raja Sekar J. ◽  
Amutha S.

The main objective is to recognize the language of social media chat presented as speech by using a deep belief network (DBN). Currently, language classification is one of the main applications of natural language processing, artificial intelligence, and deep learning. Language classification is the process of ascertaining which natural language information is being presented in, and of recognizing a language from an audio signal. Presently, most language recognition systems are based on hidden Markov models and Gaussian mixture models, which support both acoustic and sequential modeling. This chapter presents a DBN-based recognition system for three different languages, namely English, Hindi, and Tamil. The languages are evaluated on a self-built recorded database, from which mel-frequency cepstral coefficient (MFCC) features are extracted from the speech. These features are fed into the DBN with a backpropagation learning algorithm for the recognition process. Recognition accuracy is high for the chosen languages, and system performance is assessed on the three languages.
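The MFCC front end mentioned above can be sketched in plain NumPy. The frame sizes, filter counts, and the synthetic test tone below are common defaults chosen for illustration, not necessarily the settings used in the chapter:

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, n_mels=26, n_ceps=13,
         frame_len=400, hop=160):
    """Minimal MFCC sketch (illustrative, not a tuned front end)."""
    # Pre-emphasis boosts high frequencies.
    sig = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(sig) - frame_len) // hop
    idx = np.arange(frame_len) + hop * np.arange(n_frames)[:, None]
    frames = sig[idx] * np.hamming(frame_len)
    # Power spectrum of each (zero-padded) frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank.
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_energy = np.log(power @ fbank.T + 1e-10)
    # DCT-II decorrelates the log filterbank energies.
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                  (2 * n + 1) / (2 * n_mels)))
    return log_energy @ dct.T

# Example: MFCCs of one second of a synthetic 440 Hz tone.
t = np.arange(16000) / 16000
feats = mfcc(np.sin(2 * np.pi * 440 * t))
```

Each utterance becomes a sequence of 13-dimensional cepstral vectors (here 98 frames), which is the kind of feature matrix fed to the DBN for language classification.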


10.29007/8h3z ◽  
2018 ◽  
Author(s):  
Sai Prabhakar Pandi Selvaraj ◽  
Manuela Veloso ◽  
Stephanie Rosenthal

Advances in state-of-the-art techniques, including convolutional neural networks (CNNs), have led to improved perception in autonomous robots. However, these new techniques make a robot's decision-making process obscure, even for experts. Our goal is to automatically generate natural language explanations of a robot's perception-based inferences in order to help people understand which features contribute to these classification predictions. Generating natural language explanations is particularly challenging for perception and other high-dimensional classification tasks because 1) we lack a mapping from features to language and 2) there are a large number of features which could be explained. We present a novel approach to generating explanations that first finds the important features that most affect the classification prediction, and then utilizes a secondary detector, capable of identifying and labeling multiple parts of the features, to label only those important features. These labels serve as the natural language groundings that we use in our explanations. We demonstrate our explanation algorithm's ability on the floor-identification classifier of our mobile service robot.
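The two-stage recipe (find the features that most affect the prediction, then label only those via a secondary detector) can be illustrated with a toy occlusion-based importance map. Everything here is a hypothetical stand-in, not the paper's CNN pipeline: the "classifier" is a brightness heuristic, and `detect_label` plays the role of the secondary detector.

```python
import numpy as np

rng = np.random.default_rng(1)

def classifier_score(img):
    """Stand-in for a CNN's class score: responds only to brightness
    in the image's top-left 8x8 region (purely illustrative)."""
    return img[:8, :8].mean()

def occlusion_importance(img, patch=4):
    """Score drop when each patch is masked = that patch's importance."""
    base = classifier_score(img)
    heat = np.zeros((img.shape[0] // patch, img.shape[1] // patch))
    for i in range(heat.shape[0]):
        for j in range(heat.shape[1]):
            masked = img.copy()
            masked[i * patch:(i + 1) * patch,
                   j * patch:(j + 1) * patch] = 0.0
            heat[i, j] = base - classifier_score(masked)
    return heat

def detect_label(i, j):
    """Hypothetical secondary detector: names the region a patch lies in."""
    return "railing" if i < 2 and j < 2 else "floor tile"

img = rng.random((16, 16))
img[:8, :8] += 1.0                       # bright region drives the score
heat = occlusion_importance(img)
i, j = np.unravel_index(heat.argmax(), heat.shape)
explanation = f"The prediction relied most on the {detect_label(i, j)} region."
```

Only the most important patch is sent to the detector for labeling, which mirrors the paper's motivation: with too many features to explain, label just the ones that matter.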


AI Magazine ◽  
2011 ◽  
Vol 32 (2) ◽  
pp. 42 ◽  
Author(s):  
Anton Leuski ◽  
David Traum

NPCEditor is a system for building a natural language processing component for virtual humans capable of engaging a user in spoken dialog on a limited domain. It uses statistical language classification technology for mapping from a user’s text input to system responses. NPCEditor provides a user-friendly editor for creating effective virtual humans quickly. It has been deployed as a part of various virtual human systems in several applications.
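NPCEditor's actual relevance model is not reproduced here, but the basic mapping it performs, from a user's text input to the best authored response with an off-domain fallback, can be sketched with cosine similarity over token counts. The question/answer pairs and threshold below are illustrative assumptions:

```python
from collections import Counter
import math

# Toy Q/A pairs a virtual-human author might write (hypothetical domain).
pairs = [
    ("what is your name", "I'm Sergeant Star."),
    ("where are you from", "I was built at the lab."),
    ("what do you do", "I answer questions about the Army."),
]

def tokens(text):
    return text.lower().split()

def score(query, question):
    """Cosine similarity over token counts -- a simple stand-in for
    NPCEditor's statistical language classification."""
    q, d = Counter(tokens(query)), Counter(tokens(question))
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def respond(query, threshold=0.2):
    """Return the answer whose question best matches the query."""
    best = max(pairs, key=lambda p: score(query, p[0]))
    if score(query, best[0]) < threshold:
        return "Sorry, I can't answer that."   # off-domain fallback
    return best[1]

reply = respond("tell me your name")
```

Classifying into a fixed set of authored responses, rather than generating text, is what keeps such limited-domain virtual humans both quick to author and robust to paraphrased input.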


2011 ◽  
Vol 37 (4) ◽  
pp. 689-698 ◽  
Author(s):  
Simon J. Greenhill

The Levenshtein distance is a simple distance metric derived from the number of edit operations needed to transform one string into another. This metric has received recent attention as a means of automatically classifying languages into genealogical subgroups. In this article I test the performance of the Levenshtein distance for classifying languages by subsampling three language subsets from a large database of Austronesian languages. Comparing the classification proposed by the Levenshtein distance to that of the comparative method shows that the Levenshtein classification is correct only 40% of the time. Standardizing the orthography increases the performance, but only to a maximum of 65% accuracy within language subgroups. The accuracy of the Levenshtein classification decreases rapidly with phylogenetic distance, failing to discriminate homology from chance similarity across distantly related languages. This poor performance suggests the need for more linguistically nuanced methods for automated language classification tasks.
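The metric under test is easy to state precisely. A minimal sketch follows, with the length-normalized variant commonly used when comparing word lists across languages; the example word forms are illustrative, not drawn from the article's Austronesian database:

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def normalized(a, b):
    """Length-normalized variant, so long words aren't penalized."""
    return levenshtein(a, b) / max(len(a), len(b), 1)

# A cognate-looking pair vs. an unrelated form (illustrative only).
d1 = normalized("lima", "rima")   # one substitution: distance 0.25
d2 = normalized("lima", "kettu")  # no shared material: distance 1.0
```

The article's point is visible even here: the metric counts surface character edits, so it cannot distinguish a regular sound correspondence from an accidental resemblance, which is why accuracy collapses at greater phylogenetic distances.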


2021 ◽  
Vol 24 (2) ◽  
pp. 1740-1747
Author(s):  
Anton Leuski ◽  
David Traum

NPCEditor is a system for building a natural language processing component for virtual humans capable of engaging a user in spoken dialog on a limited domain. It uses statistical language classification technology for mapping from a user's text input to system responses. NPCEditor provides a user-friendly editor for creating effective virtual humans quickly. It has been deployed as a part of various virtual human systems in several applications.

