An AI Using Construction Grammar to Understand Text

Author(s):  
Denis Kiselev

This paper describes an AI that uses construction grammar (CG), a means of knowledge representation for deep understanding of text. The proposed improvements aim to make the knowledge structure for text form and meaning more versatile, and to enable intelligent selection among possible parses. Along with the improvements, the computational CG techniques that form the implementation basis are explained. Evaluation experiments utilize a dataset of Winograd schemas (WS), a major test for AI, and compare the implementation with state-of-the-art ones. Results demonstrate that, compared with techniques such as deep learning, the proposed CG approach has higher potential for the task of anaphora resolution, which requires deep understanding of natural language.
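As a rough illustration of the form-meaning pairings that CG manipulates, the toy Python sketch below shows how knowledge attached to a construction's meaning can rank candidate antecedents in a Winograd schema. The data structures, scoring rule, and example are hypothetical stand-ins, not the paper's implementation.

```python
# Illustrative sketch only: a toy form-meaning pairing in the spirit of
# construction grammar, NOT the paper's actual system.
from dataclasses import dataclass

@dataclass
class Construction:
    form: tuple    # lexical/POS pattern the construction matches
    meaning: dict  # schematic semantics the construction contributes

# A Winograd schema: "The trophy doesn't fit in the suitcase
# because it is too big." -> "it" should resolve to "trophy".
candidates = ["trophy", "suitcase"]

# Hypothetical scoring: "X fits in Y" implies X is smaller than Y,
# so "too big" is more compatible with the contained entity.
def score_parse(antecedent: str, meaning: dict) -> float:
    return 1.0 if antecedent == meaning.get("contained") else 0.0

fit_cxn = Construction(
    form=("NP", "fit", "in", "NP"),
    meaning={"contained": "trophy", "container": "suitcase"},
)
best = max(candidates, key=lambda c: score_parse(c, fit_cxn.meaning))
print(best)  # trophy
```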

2019 ◽  
Author(s):  
Negacy D. Hailu ◽  
Michael Bada ◽  
Asmelash Teka Hadgu ◽  
Lawrence E. Hunter

Abstract Background The automated identification of mentions of ontological concepts in natural language texts is a central task in biomedical information extraction. Despite more than a decade of effort, performance in this task remains below the level necessary for many applications. Results Recently, applications of deep learning in natural language processing have demonstrated striking improvements over previously state-of-the-art performance in many related natural language processing tasks. Here we demonstrate similarly striking performance improvements in recognizing biomedical ontology concepts in full-text journal articles using deep learning techniques originally developed for machine translation. For example, our best-performing system improves on the previous state of the art in recognizing terms in the Gene Ontology Biological Process hierarchy, from a previous best F1 score of 0.40 to an F1 of 0.70, nearly halving the error rate. Nearly all other ontologies show similar performance improvements. Conclusions A two-stage concept recognition system, consisting of a conditional random field model for span detection followed by a deep neural sequence model for normalization, improves the state-of-the-art performance for biomedical concept recognition. Treating the biomedical concept normalization task as a sequence-to-sequence mapping task similar to neural machine translation improves performance.
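A minimal sketch of this two-stage design is given below; to stay self-contained it replaces the CRF with naive dictionary matching and the neural seq2seq normalizer with a lookup. The terms, concept IDs, and function names are illustrative stand-ins, not the authors' system.

```python
# Two-stage concept recognition sketch: stage 1 finds mention spans
# (a CRF in the paper; naive matching here), stage 2 normalizes each
# mention to an ontology ID (neural seq2seq in the paper; lookup here).

GO_BP = {"cell division": "GO:0051301", "apoptotic process": "GO:0006915"}

def detect_spans(tokens):
    """Stage 1 (span detection): return (start, end) mention spans."""
    spans = []
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            if " ".join(tokens[i:j]).lower() in GO_BP:
                spans.append((i, j))
    return spans

def normalize(mention):
    """Stage 2 (normalization): map mention text to a concept ID."""
    return GO_BP.get(mention.lower())

tokens = "Cell division is tightly regulated".split()
for i, j in detect_spans(tokens):
    mention = " ".join(tokens[i:j])
    print(mention, "->", normalize(mention))  # Cell division -> GO:0051301
```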


2021 ◽  
Author(s):  
Oscar Nils Erik Kjell ◽  
H. Andrew Schwartz ◽  
Salvatore Giorgi

The language that individuals use to express themselves contains rich psychological information. Recent significant advances in Natural Language Processing (NLP) and Deep Learning (DL), namely transformers, have resulted in large performance gains in tasks related to understanding natural language, such as machine translation. However, these state-of-the-art methods have not yet been made easily accessible to psychology researchers, nor designed to be optimal for human-level analyses. This tutorial introduces text (www.r-text.org), a new R-package for analyzing and visualizing human language using transformers, the latest techniques from NLP and DL. Text is both a modular solution for accessing state-of-the-art language models and an end-to-end solution catered to human-level analyses. Hence, text provides user-friendly functions tailored to testing hypotheses in the social sciences, for both relatively small and large datasets. This tutorial describes useful methods for analyzing text, providing functions with reliable defaults that can be used off the shelf, as well as a framework that advanced users can build on for novel techniques and analysis pipelines. The reader learns about six methods: 1) textEmbed: to transform text into traditional or modern transformer-based word embeddings (i.e., numeric representations of words); 2) textTrain: to examine the relationships between text and numeric/categorical variables; 3) textSimilarity and 4) textSimilarityTest: to compute semantic similarity scores between texts and to test the significance of differences in meaning between two sets of texts; and 5) textProjection and 6) textProjectionPlot: to examine and visualize text within the embedding space according to latent or specified construct dimensions (e.g., low to high rating-scale scores).
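The text package itself is in R; as a rough Python analogue of its embed-then-train-then-compare workflow, the sketch below uses the sentence-transformers and scikit-learn libraries. The texts and rating values are purely illustrative.

```python
# NOT the text R-package: a rough Python analogue of the
# textEmbed -> textTrain -> textSimilarity workflow it describes.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import RidgeCV
from sklearn.metrics.pairwise import cosine_similarity

texts = [
    "I feel calm and content",
    "Everything is overwhelming right now",
    "Work is fine and I sleep well",
    "I cannot stop worrying",
]
scores = [2.0, 8.0, 3.0, 7.5]  # e.g., self-reported stress ratings (made up)

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(texts)                  # ~ textEmbed: text -> vectors

reg = RidgeCV().fit(emb, scores)           # ~ textTrain: embeddings -> rating
print(reg.predict(model.encode(["I am anxious about tomorrow"])))

sim = cosine_similarity(emb[:1], emb[1:])  # ~ textSimilarity between texts
print(sim)
```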


2021 ◽  
Vol 14 (5) ◽  
pp. 813-821
Author(s):  
Arif Usta ◽  
Akifhan Karakayali ◽  
Özgür Ulusoy

Translating Natural Language Queries (NLQs) to Structured Query Language (SQL) in interfaces deployed over relational databases is a challenging task that has been widely studied in the database community recently. Conventional rule-based systems utilize a series of solutions as a pipeline to deal with each step of this task, namely stop-word filtering, tokenization, stemming/lemmatization, parsing, tagging, and translation. Recent works have mostly focused on the translation step, overlooking the earlier steps by using ad hoc solutions. In the pipeline, one of the most critical and challenging problems is keyword mapping: constructing a mapping between tokens in the query and relational database elements (tables, attributes, values, etc.). We define the keyword mapping problem as a sequence tagging problem, and propose a novel deep-learning-based supervised approach that utilizes the POS tags of NLQs. Our proposed approach, called DBTagger (DataBase Tagger), is an end-to-end and schema-independent solution, which makes it practical for various relational databases. We evaluate our approach on eight different datasets, and report new state-of-the-art accuracy results, 92.4% on average. Our results also indicate that DBTagger is up to 10,000 times faster than its counterparts and scalable to bigger databases.
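To make the sequence-tagging framing concrete, here is a minimal PyTorch sketch (not the DBTagger code, which has its own architecture) in which each NLQ token, together with its POS tag, is labeled with a database-element tag. The tag set, example query, and dimensions are hypothetical.

```python
# Keyword mapping as sequence tagging: token + POS in, DB-element tag out.
import torch
import torch.nn as nn

tags = ["O", "TABLE", "ATTR", "VALUE"]
# Hypothetical NLQ: "show papers by authors named smith"
tokens = ["show", "papers", "by", "authors", "named", "smith"]
pos    = ["VB",   "NNS",    "IN", "NNS",     "VBN",   "NNP"]

word_vocab = {w: i for i, w in enumerate(tokens)}
pos_vocab = {p: i for i, p in enumerate(set(pos))}

class Tagger(nn.Module):
    def __init__(self):
        super().__init__()
        self.word_emb = nn.Embedding(len(word_vocab), 16)
        self.pos_emb = nn.Embedding(len(pos_vocab), 4)  # POS signal, as in the paper
        self.lstm = nn.LSTM(20, 16, bidirectional=True, batch_first=True)
        self.out = nn.Linear(32, len(tags))

    def forward(self, w, p):
        x = torch.cat([self.word_emb(w), self.pos_emb(p)], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)  # per-token tag scores

w = torch.tensor([[word_vocab[t] for t in tokens]])
p = torch.tensor([[pos_vocab[t] for t in pos]])
print(Tagger()(w, p).argmax(-1))  # untrained, so tags are random here
```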


2017 ◽  
Vol 24 (4) ◽  
pp. 813-821 ◽  
Author(s):  
Anne Cocos ◽  
Alexander G Fiks ◽  
Aaron J Masino

Abstract Objective Social media is an important pharmacovigilance data source for adverse drug reaction (ADR) identification. Human review of social media data is infeasible due to data quantity; thus, natural language processing techniques are necessary. Social media includes informal vocabulary and irregular grammar, which challenge natural language processing methods. Our objective is to develop a scalable, deep-learning approach that exceeds state-of-the-art ADR detection performance in social media. Materials and Methods We developed a recurrent neural network (RNN) model that labels words in an input sequence with ADR membership tags. The only input features are word-embedding vectors, which can be formed through task-independent pretraining or during ADR detection training. Results Our best-performing RNN model used pretrained word embeddings created from a large, non–domain-specific Twitter dataset. It achieved an approximate match F-measure of 0.755 for ADR identification on the dataset, compared to 0.631 for a baseline lexicon system and 0.65 for the state-of-the-art conditional random field model. Feature analysis indicated that semantic information in pretrained word embeddings boosted sensitivity and, combined with contextual awareness captured in the RNN, precision. Discussion Our model required no task-specific feature engineering, suggesting generalizability to additional sequence-labeling tasks. Learning curve analysis showed that our model reached optimal performance with fewer training examples than the other models. Conclusions ADR detection performance in social media is significantly improved by using a contextually aware model and word embeddings formed from large, unlabeled datasets. The approach reduces manual data-labeling requirements and is scalable to large social media datasets.
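A minimal PyTorch sketch of this idea (not the authors' model) initializes an RNN tagger's embedding layer from pretrained vectors and labels each token with an ADR-membership tag; the vocabulary, vectors, and two-tag scheme are illustrative.

```python
# RNN sequence labeler whose embedding layer starts from pretrained vectors.
import torch
import torch.nn as nn

vocab = {"this": 0, "med": 1, "gave": 2, "me": 3, "headaches": 4}
pretrained = torch.randn(len(vocab), 50)  # stand-in for Twitter-pretrained embeddings

class ADRTagger(nn.Module):
    def __init__(self):
        super().__init__()
        # freeze=False lets the embeddings keep training, per the paper's setup
        self.emb = nn.Embedding.from_pretrained(pretrained, freeze=False)
        self.rnn = nn.LSTM(50, 32, bidirectional=True, batch_first=True)
        self.out = nn.Linear(64, 2)  # tags: O vs. ADR

    def forward(self, ids):
        h, _ = self.rnn(self.emb(ids))
        return self.out(h)

ids = torch.tensor([[vocab[w] for w in "this med gave me headaches".split()]])
print(ADRTagger()(ids).argmax(-1))  # untrained; "headaches" -> ADR after training
```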


Author(s):  
Shailaja Sampat

The ability of an agent to rationally answer questions about a given task is a key measure of its intelligence. While we have obtained phenomenal performance on various language and vision tasks separately, 'Technical, Hard and Explainable Question Answering' (THE-QA) is a new, challenging corpus that addresses them jointly. THE-QA is a question answering task involving diagram understanding and reading comprehension. We plan to establish benchmarks over this new corpus using deep learning models guided by knowledge representation methods. The proposed approach will involve detailed semantic parsing of technical figures and text that is robust against diverse formats. It will be aided by a knowledge acquisition and reasoning module that categorizes different knowledge types, identifies sources from which to acquire that knowledge, and performs reasoning to answer the questions correctly. THE-QA data will present a strong challenge to the community for future research and will bridge the gap between state-of-the-art Artificial Intelligence (AI) and 'Human-level' AI.


Author(s):  
Shaoxiang Chen ◽  
Ting Yao ◽  
Yu-Gang Jiang

Deep learning has recently achieved great success in solving specific artificial intelligence problems. Substantial progress has been made in Computer Vision (CV) and Natural Language Processing (NLP). As a connection between the two worlds of vision and language, video captioning is the task of producing a natural-language utterance (usually a sentence) that describes the visual content of a video. The task naturally decomposes into two sub-tasks. One is to encode the video, building a thorough understanding of it and learning a visual representation. The other is caption generation, which decodes the learned representation into a sequential sentence, word by word. In this survey, we first formulate the problem of video captioning, then review state-of-the-art methods, categorized by their emphasis on vision or language, followed by a summary of standard datasets and representative approaches. Finally, we highlight the challenges that are not yet fully understood in this task and present future research directions.
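The encode-then-decode decomposition can be sketched in a few lines of PyTorch. The toy below (not any particular published model) mean-pools per-frame features into a video vector that initializes an LSTM decoder; all dimensions and the vocabulary size are illustrative.

```python
# Encoder-decoder video captioning sketch: encode frames, decode words.
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    def __init__(self, vocab_size=1000, feat_dim=512, hid=256):
        super().__init__()
        # Encoder: in practice per-frame CNN features, here mean-pooled over time.
        self.enc = nn.Linear(feat_dim, hid)
        # Decoder: generates the caption one word at a time.
        self.emb = nn.Embedding(vocab_size, hid)
        self.lstm = nn.LSTM(hid, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, frame_feats, caption_ids):
        video_vec = self.enc(frame_feats.mean(dim=1))  # (B, hid) video summary
        state = (video_vec.unsqueeze(0), torch.zeros_like(video_vec).unsqueeze(0))
        h, _ = self.lstm(self.emb(caption_ids), state)  # teacher forcing
        return self.out(h)                              # next-word scores

feats = torch.randn(2, 30, 512)        # 2 videos, 30 frames of CNN features
caps = torch.randint(0, 1000, (2, 8))  # gold captions for teacher forcing
print(CaptionModel()(feats, caps).shape)  # torch.Size([2, 8, 1000])
```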


2020 ◽  
Vol 31 (7-8) ◽  
Author(s):  
Marina Paolanti ◽  
Rocco Pietrini ◽  
Adriano Mancini ◽  
Emanuele Frontoni ◽  
Primo Zingaretti

Abstract In retail environments, understanding how shoppers move about in a store's spaces and interact with products is very valuable. While the retail environment has several characteristics favourable to computer vision, such as reasonable lighting, the large number and diversity of products sold, as well as the potential ambiguity of shoppers' movements, mean that accurately measuring shopper behaviour is still challenging. Over the past years, machine-learning and feature-based tools for people counting, interaction analytics, and re-identification were developed with the aim of learning shopper behaviour from occlusion-free RGB-D cameras in a top-view configuration. However, with the move into the era of multimedia big data, machine-learning approaches have evolved into deep learning approaches, which are a more powerful and efficient way of dealing with the complexities of human behaviour. In this paper, a novel VRAI deep learning application is introduced that uses three convolutional neural networks to count the number of people passing or stopping in the camera area, perform top-view re-identification, and measure shopper-shelf interactions from a single RGB-D video flow with near real-time performance. The framework is evaluated on three new, publicly available datasets: TVHeads for people counting, HaDa for shopper-shelf interactions, and TVPR2 for people re-identification. The experimental results show that the proposed methods significantly outperform all competitive state-of-the-art methods (accuracy of 99.5% on people counting, 92.6% on interaction classification, and 74.5% on re-identification), yielding distinct and significant insights for implicit and extensive shopper behaviour analysis for marketing applications.
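Structurally, the three-network design can be sketched as three task-specific CNN branches consuming the same 4-channel RGB-D frame. The branch architecture, output sizes, and frame shape below are assumptions for illustration, not the VRAI implementation.

```python
# Three task-specific CNN branches over one top-view RGB-D frame.
import torch
import torch.nn as nn

def branch(out_dim):
    return nn.Sequential(
        nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, out_dim),
    )

counting = branch(1)     # e.g., head-count regression
interaction = branch(3)  # e.g., pick-up / put-back / no-touch
reid = branch(128)       # embedding for re-identification

frame = torch.randn(1, 4, 240, 320)  # RGB-D: 3 colour channels + depth
print(counting(frame).shape, interaction(frame).shape, reid(frame).shape)
```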


Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5577
Author(s):  
Amos Azaria ◽  
Keren Nivasch

Intelligent agents that can interact with users using natural language are becoming increasingly common. Sometimes an intelligent agent may not correctly understand a user command or may not perform it properly. In such cases, the user might try a second time by giving the agent another, slightly different command. Giving an agent the ability to detect such user corrections might help it fix its own mistakes and avoid making them in the future. In this work, we consider the problem of automatically detecting user corrections using deep learning. We develop a multimodal architecture called SAIF, which detects such user corrections, taking as inputs the user's voice commands as well as their transcripts. Voice inputs allow SAIF to take advantage of sound cues, such as tone, speed, and word emphasis. In addition to sound cues, our model uses transcripts to determine whether a command is a correction to the previous command. Our model also obtains internal input from the agent, indicating whether the previous command was executed successfully. Finally, we release a unique dataset in which users interacted with an intelligent agent assistant by giving it commands. This dataset includes labels on pairs of consecutive commands, which indicate whether the latter command is in fact a correction of the former command. We show that SAIF outperforms current state-of-the-art methods on this dataset.
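In the spirit of this multimodal design (not the published SAIF model), a compact PyTorch sketch can fuse an audio-feature stream, a transcript-embedding stream, and the agent's execution flag into a binary correction classifier; every dimension and feature choice below is an assumption.

```python
# Multimodal correction detector: audio + transcript + execution flag.
import torch
import torch.nn as nn

class CorrectionDetector(nn.Module):
    def __init__(self, audio_dim=40, text_dim=300, hid=64):
        super().__init__()
        self.audio = nn.GRU(audio_dim, hid, batch_first=True)  # tone/speed cues
        self.text = nn.GRU(text_dim, hid, batch_first=True)    # transcript cues
        self.clf = nn.Linear(hid * 2 + 1, 2)  # +1 for execution-success flag

    def forward(self, audio_frames, word_vecs, success_flag):
        _, a = self.audio(audio_frames)
        _, t = self.text(word_vecs)
        fused = torch.cat([a[-1], t[-1], success_flag], dim=-1)
        return self.clf(fused)  # scores: correction vs. not

audio = torch.randn(1, 100, 40)  # e.g., 100 frames of MFCC features
words = torch.randn(1, 6, 300)   # e.g., 6 word vectors for the transcript
flag = torch.zeros(1, 1)         # previous command failed -> 0
print(CorrectionDetector()(audio, words, flag))
```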

