Computational Analysis of Storylines

2021

Event structures are central in Linguistics and Artificial Intelligence research: people can easily refer to changes in the world, identify their participants, distinguish relevant information, and form expectations of what can happen next. Part of this process is based on mechanisms similar to narratives, which are at the heart of information sharing. But it remains difficult to automatically detect events or automatically construct stories from such event representations. This book explores how to handle today's massive news streams and provides multidimensional, multimodal, and distributed approaches, such as automated deep learning, to capture the events and narrative structures involved in a 'story'. This overview of the current state of the art in event extraction, temporal and causal relations, and storyline extraction aims to establish a new multidisciplinary research community with a common terminology and research agenda. Graduate students and researchers in natural language processing, computational linguistics, and media studies will benefit from this book.

2015
Vol 2015
pp. 1-19
Author(s): Jorge A. Vanegas, Sérgio Matos, Fabio González, José L. Oliveira

This paper presents a review of state-of-the-art approaches to automatic extraction of biomolecular events from scientific texts. Events involving biomolecules such as genes, transcription factors, or enzymes, for example, have a central role in biological processes and functions and provide valuable information for describing physiological and pathogenesis mechanisms. Event extraction from biomedical literature has a broad range of applications, including support for information retrieval, knowledge summarization, and information extraction and discovery. However, automatic event extraction is a challenging task due to the ambiguity and diversity of natural language and higher-level linguistic phenomena, such as speculations and negations, which occur in biological texts and can lead to misunderstanding or incorrect interpretation. Many strategies have been proposed in the last decade, originating from different research areas such as natural language processing, machine learning, and statistics. This review summarizes the most representative approaches in biomolecular event extraction and presents an analysis of the current state of the art and of commonly used methods, features, and tools. Finally, current research trends and future perspectives are also discussed.
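As a hedged illustration of the rule-based end of the spectrum this review covers, the sketch below pairs a tiny trigger dictionary with a crude entity heuristic. The trigger list, gene-name pattern, and example sentence are invented for illustration; real systems rely on curated lexicons, parse features, and learned classifiers.

```python
import re

# Toy trigger dictionary mapping trigger words to event types, in the
# spirit of rule-based biomolecular event extraction (triggers invented
# for illustration only).
TRIGGERS = {
    "phosphorylation": "Phosphorylation",
    "expression": "Gene_expression",
    "binds": "Binding",
}
GENE_PATTERN = re.compile(r"\b[A-Z][A-Za-z0-9-]+\b")  # crude gene-name heuristic

def extract_events(sentence: str):
    """Return (event_type, trigger, candidate_arguments) tuples."""
    events = []
    candidates = GENE_PATTERN.findall(sentence)
    for word in re.findall(r"\w+", sentence.lower()):
        if word in TRIGGERS:
            events.append((TRIGGERS[word], word, candidates))
    return events

print(extract_events("STAT3 phosphorylation is induced by IL-6."))
# [('Phosphorylation', 'phosphorylation', ['STAT3', 'IL-6'])]
```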


Author(s): Wang Chen, Yifan Gao, Jiani Zhang, Irwin King, Michael R. Lyu

Keyphrase generation (KG) aims to generate a set of keyphrases for a given document, and is a fundamental task in natural language processing (NLP). Most previous methods solve this problem in an extractive manner, while recently several attempts have been made under the generative setting using deep neural networks. However, state-of-the-art generative methods simply treat the document title and the document main body equally, ignoring the leading role the title plays in the overall document. To solve this problem, we introduce a new model called the Title-Guided Network (TG-Net) for the automatic keyphrase generation task, based on the encoder-decoder architecture with two new features: (i) the title is additionally employed as a query-like input, and (ii) a title-guided encoder gathers the relevant information from the title to each word in the document. Experiments on a range of KG datasets demonstrate that our model outperforms the state-of-the-art models by a large margin, especially for documents with either very low or very high title length ratios.
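A minimal sketch of the title-guided idea, assuming scaled dot-product attention from each document word to the title words; this illustrates the mechanism described above, not the TG-Net implementation itself (the function and variable names are ours):

```python
import numpy as np

def title_guided_context(doc_states, title_states):
    """For each document word, attend over the title and gather a
    title-aware context vector (scaled dot-product attention).

    doc_states:   (doc_len, d) encoder states for the document body
    title_states: (title_len, d) encoder states for the title
    """
    d = doc_states.shape[-1]
    scores = doc_states @ title_states.T / np.sqrt(d)      # (doc_len, title_len)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)          # softmax over title words
    return weights @ title_states                          # (doc_len, d)

# Toy run: 5 document words, 3 title words, hidden size 8.
rng = np.random.default_rng(0)
ctx = title_guided_context(rng.normal(size=(5, 8)), rng.normal(size=(3, 8)))
print(ctx.shape)  # (5, 8)
```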


Author(s): Alex Dexter, Spencer A. Thomas, Rory T. Steven, Kenneth N. Robinson, Adam J. Taylor, ...

High-dimensionality omics and hyperspectral imaging datasets present difficult challenges for feature extraction and data mining, due to huge numbers of features that cannot be simultaneously examined. The sample numbers and variables of these methods are constantly growing as new technologies are developed, and computational analysis needs to evolve to keep up with growing demand. Current state-of-the-art algorithms can handle some routine datasets but struggle when datasets grow above a certain size. We present an approach that trains deep neural networks to perform non-linear dimensionality reduction, in particular t-distributed stochastic neighbour embedding (t-SNE), to overcome the prior limitations of these methods.

One-sentence summary: Analysis of prohibitively large datasets by combining deep learning via neural networks with non-linear dimensionality reduction.
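One common way to realize this idea is a parametric embedding: fit exact t-SNE on a tractable subset, then train a neural network to reproduce the mapping so it can embed arbitrarily many points. The sketch below, using scikit-learn, is an illustration of that pattern under our own assumptions, not the authors' pipeline; the data, layer sizes, and subset size are invented.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 50))          # stand-in for a large omics matrix

# 1. Run exact t-SNE on a subset small enough to be tractable.
subset = X[:1000]
emb = TSNE(n_components=2, init="pca", random_state=0).fit_transform(subset)

# 2. Train a neural network to map high-dimensional points to the
#    t-SNE coordinates, so the embedding generalizes to unseen points.
net = MLPRegressor(hidden_layer_sizes=(256, 128), max_iter=500, random_state=0)
net.fit(subset, emb)

# 3. Embed the full (arbitrarily large) dataset with a cheap forward pass.
full_embedding = net.predict(X)
print(full_embedding.shape)              # (5000, 2)
```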


2010
Vol 3
Author(s): Emily M. Bender, D. Terence Langendoen

In this paper, we overview the ways in which computational methods can serve the goals of analysis and theory development in linguistics, and encourage the reader to become involved in the emerging cyberinfrastructure for linguistics. We survey examples from diverse subfields of how computational methods are already being used, describe the current state of the art in cyberinfrastructure for linguistics, sketch a pie-in-the-sky view of where the field could go, and outline steps that linguists can take now to bring about better access to and use of linguistic data through cyberinfrastructure.


Author(s): Karine Megerdoomian

This chapter introduces the fields of Computational Linguistics (CL)—the computational modelling of linguistic representations and theories—and Natural Language Processing (NLP)—the design and implementation of tools for automated language understanding and production—and discusses some of the existing tensions between the formal approach to linguistics and the current state of research and development in CL and NLP. The chapter goes on to explain the specific challenges faced by CL and NLP for Persian, many of them deriving from the intricacies the Perso-Arabic script presents for automatically identifying word and phrase boundaries in text, as well as difficulties in the automatic processing of compound words and light verb constructions. The chapter then provides an overview of the state of the art in current and recent CL and NLP for Persian. It concludes with areas for improvement and suggestions for future directions.
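A concrete instance of the word-boundary problem described above is the zero-width non-joiner (ZWNJ, U+200C), which links morphemes inside one Persian orthographic word without a visible space. The minimal sketch below (our example word, not one from the chapter) shows why naive normalization breaks tokenization:

```python
# The zero-width non-joiner (ZWNJ, U+200C) joins morphemes inside a single
# Persian orthographic word without a visible space, e.g. the imperfective
# prefix "mi-" in "miravam" ("I go"):
word = "می\u200cروم"   # می + ZWNJ + روم

# Whitespace tokenization sees one token, because ZWNJ is not whitespace:
print(word.split())                          # ['می\u200cروم']

# But a naive normalizer that maps ZWNJ to a space breaks the word apart,
# one reason Persian boundary detection needs script-aware preprocessing:
print(word.replace("\u200c", " ").split())   # ['می', 'روم']
```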


2013
Vol 10 (2)
pp. 82-93
Author(s): Cassidy Kelly, Hui Yang

The extraction of study design parameters from biomedical journal articles is an important problem in natural language processing (NLP). Such parameters define the characteristics of a study, such as the duration, the number of subjects, and their profile. Here we present a system for extracting study design parameters from sentences in article abstracts. This system will be used as a component of a larger system for creating nutrigenomics networks from articles in the nutritional genomics domain. The algorithms presented consist of manually designed rules expressed either as regular expressions or in terms of sentence parse structure. A number of filters and NLP tools are also utilized within a pipelined algorithmic framework. Using this novel approach, our system performs extraction at a finer level of granularity than comparable systems, while generating results that surpass the current state of the art.
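To give the flavor of such manually designed rules, here is a minimal sketch with two invented regular-expression patterns, one for subject counts and one for study duration; the actual system combines many rules with parse-structure constraints and NLP filters in a pipeline:

```python
import re

# Two illustrative rules in the spirit of the approach described above.
# (Patterns invented for illustration; they are far simpler than the
# paper's rule set.)
SUBJECTS = re.compile(r"\b(\d+)\s+(?:subjects|participants|patients)\b", re.I)
DURATION = re.compile(r"\bfor\s+(\d+)\s+(days?|weeks?|months?|years?)\b", re.I)

def extract_design_parameters(sentence: str) -> dict:
    params = {}
    if m := SUBJECTS.search(sentence):
        params["n_subjects"] = int(m.group(1))
    if m := DURATION.search(sentence):
        params["duration"] = f"{m.group(1)} {m.group(2)}"
    return params

print(extract_design_parameters(
    "The 120 participants received the supplement for 12 weeks."))
# {'n_subjects': 120, 'duration': '12 weeks'}
```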


1995
Vol 1 (1)
pp. 29-81
Author(s): I. Androutsopoulos, G.D. Ritchie, P. Thanisch

This paper is an introduction to natural language interfaces to databases (NLIDBs). A brief overview of the history of NLIDBs is first given. Some advantages and disadvantages of NLIDBs are then discussed, comparing NLIDBs to formal query languages, form-based interfaces, and graphical interfaces. An introduction to some of the linguistic problems NLIDBs have to confront follows, for the benefit of readers less familiar with computational linguistics. The discussion then moves on to NLIDB architectures, portability issues, restricted natural language input systems (including menu-based NLIDBs), and NLIDBs with reasoning capabilities. Some less explored areas of NLIDB research are then presented, namely database updates, meta-knowledge questions, temporal questions, and multi-modal NLIDBs. The paper ends with reflections on the current state of the art.
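To make the idea of a restricted natural language input system concrete, here is a toy pattern-to-template sketch in the spirit of the systems the paper surveys; the schema, patterns, and SQL templates are invented for illustration, and real NLIDBs use far richer grammars and semantic interpretation:

```python
import re

# Each accepted question shape maps to a SQL template: a deliberately
# restricted input language. (Schema and patterns are hypothetical.)
PATTERNS = [
    (re.compile(r"who works in the (\w+) department", re.I),
     "SELECT name FROM employees WHERE department = '{0}'"),
    (re.compile(r"what is the salary of (\w+)", re.I),
     "SELECT salary FROM employees WHERE name = '{0}'"),
]

def to_sql(question: str) -> str:
    for pattern, template in PATTERNS:
        if m := pattern.search(question):
            return template.format(*m.groups())
    raise ValueError("question outside the restricted language")

print(to_sql("Who works in the sales department?"))
# SELECT name FROM employees WHERE department = 'sales'
```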


Author(s): Fazel Keshtkar, Ledong Shi, Syed Ahmad Chan Bukhari

Finding our favorite dishes has become a hard task, since restaurants are providing more choices and varieties. On the other hand, comments and reviews of restaurants are a good place to look for the answer. The purpose of this study is to use computational linguistics and natural language processing to categorise various dishes and find semantic relations among them based on reviewers' comments and menu descriptions. Our goal is to apply state-of-the-art computational linguistics methods such as word embedding models (word2vec), topic modeling, PCA, and classification algorithms. For visualization, t-Distributed Stochastic Neighbor Embedding (t-SNE) was used to explore the relations within dishes and their reviews. We also aim to extract the common patterns between different dishes across restaurants and review comments and, in reverse, to explore dishes through their semantic relations. A dataset of articles related to restaurants, with dishes located within the articles, was used to find comment patterns. We then applied t-SNE visualizations to identify the root of each feature of the dishes. As a result, our model is able to assist users in finding a dish from several words of description and their interests. Our dataset contains 1,000 articles from a food review agency covering a variety of dishes from different cultures: American (e.g., 'steak', 'hamburger'), Chinese (e.g., 'stir fry', 'dumplings'), and Japanese (e.g., 'sushi').
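As a hedged sketch of the word2vec-plus-t-SNE pipeline the study describes, the snippet below trains word vectors on a tiny invented review corpus with gensim and projects a few dish vectors to 2-D with scikit-learn's t-SNE; the corpus, hyperparameters, and dish list are ours, not the paper's:

```python
from gensim.models import Word2Vec
from sklearn.manifold import TSNE

# Tiny stand-in corpus of tokenized review sentences (invented; the study
# uses 1,000 food-review articles).
sentences = [
    ["the", "sushi", "was", "fresh", "and", "delicate"],
    ["crispy", "dumplings", "with", "a", "savory", "stir", "fry"],
    ["a", "juicy", "steak", "next", "to", "a", "greasy", "hamburger"],
] * 50  # repeat so word2vec has enough co-occurrence counts

# Learn dense word vectors from the review text.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, seed=0)

# Project dish vectors to 2-D with t-SNE to inspect semantic neighborhoods.
dishes = ["sushi", "dumplings", "steak", "hamburger"]
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(
    model.wv[dishes])
print(dict(zip(dishes, coords.round(2).tolist())))
```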


1990
Vol 5 (4)
pp. 225-249
Author(s): Ann Copestake, Karen Sparck Jones

This paper reviews the current state of the art in natural language access to databases. This has been a long-standing area of work in natural language processing. But though some commercial systems are now available, providing front ends has proved much harder than was expected, and the necessary limitations on front ends have to be recognized. The paper discusses the issues, both general to language and task-specific, involved in front end design, and the way these have been addressed, concentrating on the work of the last decade. The focus is on the central process of translating a natural language question into a database query, but other supporting functions are also covered. The points are illustrated by the use of a single example application. The paper concludes with an evaluation of the current state, indicating that future progress will depend on the one hand on general advances in natural language processing, and on the other on expanding the capabilities of traditional databases.
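To illustrate that central translation step, here is a minimal sketch of a question mapped to a small intermediate logical form that is then rendered as SQL; the schema and the question interpretation are invented, and real front ends derive the logical form from a linguistic analysis of the question:

```python
from dataclasses import dataclass

# A minimal intermediate representation for the translation pipeline:
# question -> logical form -> SQL. (Schema is hypothetical.)
@dataclass
class Query:
    select: str
    table: str
    where_col: str
    where_val: str

    def to_sql(self) -> str:
        return (f"SELECT {self.select} FROM {self.table} "
                f"WHERE {self.where_col} = '{self.where_val}'")

# "Which suppliers are located in Paris?" might be interpreted as:
logical_form = Query(select="name", table="suppliers",
                     where_col="city", where_val="Paris")
print(logical_form.to_sql())
# SELECT name FROM suppliers WHERE city = 'Paris'
```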


1999
Vol 5 (1)
pp. 17-44
Author(s): Branimir Boguraev, Christopher Kennedy

The identification and extraction of technical terms is one of the better understood and most robust Natural Language Processing (NLP) technologies within the current state of the art of language engineering. In generic information management contexts, terms have been used primarily for procedures seeking to identify a set of phrases that is useful for tasks such as text indexing, computational lexicology, and machine-assisted translation: such tasks make important use of the assumption that terminology is representative of a given domain. This paper discusses an extension of basic terminology identification technology for the application to two higher level semantic tasks: domain description, the specification of the technical domain of a document, and content characterisation, the construction of a compact, coherent and useful representation of the topical content of a text. With these extensions, terminology identification becomes the foundation of an operational environment for document processing and content abstraction.
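One widely used measure for ranking the candidate terms such technology produces is the C-value of Frantzi and Ananiadou, sketched below; this is offered as a representative instance of terminology identification, not necessarily the authors' own algorithm, and the candidate frequencies are invented:

```python
import math
from collections import defaultdict

def c_value(frequencies: dict) -> dict:
    """Rank multiword term candidates with the C-value measure.
    `frequencies` maps candidate terms (tuples of words) to corpus counts."""
    # For each candidate, collect the longer candidates that contain it.
    nested_in = defaultdict(list)
    for term in frequencies:
        for longer in frequencies:
            if len(longer) > len(term) and any(
                    longer[i:i + len(term)] == term
                    for i in range(len(longer) - len(term) + 1)):
                nested_in[term].append(longer)

    scores = {}
    for term, freq in frequencies.items():
        weight = math.log2(len(term))
        parents = nested_in[term]
        if parents:
            # Discount occurrences accounted for by longer terms.
            freq = freq - sum(frequencies[p] for p in parents) / len(parents)
        scores[term] = weight * freq
    return scores

freqs = {("real", "time"): 5,
         ("real", "time", "clock"): 3,
         ("expert", "system"): 8}
print(c_value(freqs))
```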

