IDL-Expressions: A Formalism for Representing and Parsing Finite Languages in Natural Language Processing

2004 · Vol 21 · pp. 287-317
Author(s): M. J. Nederhof, G. Satta

We propose a formalism for the representation of finite languages, referred to as the class of IDL-expressions, which combines concepts that have only been considered in isolation in existing formalisms. The suggested applications are in natural language processing, more specifically in surface natural language generation and machine translation, where a sentence is obtained by first generating a large set of candidate sentences, represented in a compact way, and then filtering that set through a parser. We study several formal properties of IDL-expressions and compare the new formalism with more standard ones. We also present a novel parsing algorithm for IDL-expressions and prove a non-trivial upper bound on its time complexity.
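
A minimal sketch of the semantics, not the authors' algorithm: IDL stands for Interleave, Disjunction, and Lock, and the sketch below encodes IDL-style expressions as nested tuples and naively enumerates the finite language they denote. The lock operator is omitted, the encoding and names are our own, and the paper's parser exists precisely to avoid this kind of explicit enumeration.

```python
from itertools import product

# Illustrative encoding of IDL-style expressions as nested tuples:
#   ("word", w)           -- a single word
#   ("cat", e1, e2, ...)  -- concatenation
#   ("dis", e1, e2, ...)  -- disjunction: choose one alternative
#   ("il",  e1, e2, ...)  -- interleave: any shuffle of the sub-languages

def interleavings(u, v):
    """Yield every interleaving (shuffle) of two word tuples."""
    if not u or not v:
        yield u + v
        return
    for rest in interleavings(u[1:], v):
        yield (u[0],) + rest
    for rest in interleavings(u, v[1:]):
        yield (v[0],) + rest

def language(e):
    """Enumerate the finite language denoted by an IDL-style expression."""
    op = e[0]
    if op == "word":
        return {(e[1],)}
    if op == "cat":
        return {sum(ws, ()) for ws in product(*(language(s) for s in e[1:]))}
    if op == "dis":
        return set().union(*(language(s) for s in e[1:]))
    if op == "il":
        result = {()}
        for sub in e[1:]:
            result = {w for a in result for b in language(sub)
                      for w in interleavings(a, b)}
        return result
    raise ValueError(f"unknown operator: {op}")

# "the (big | small) dog" with "barks" interleaved at any position:
expr = ("il",
        ("cat", ("word", "the"),
                ("dis", ("word", "big"), ("word", "small")),
                ("word", "dog")),
        ("word", "barks"))
for sentence in sorted(language(expr)):
    print(" ".join(sentence))
```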

Author(s): Andrew M. Olney, Natalie K. Person, Arthur C. Graesser

The authors discuss Guru, a conversational expert intelligent tutoring system (ITS). Guru is designed to mimic expert human tutors using advanced applied natural language processing techniques, including natural language understanding, knowledge representation, and natural language generation.


Author(s): Pankaj Kailas Bhole, A. J. Agrawal

Text summarization is an old challenge in text mining but is in dire need of researchers' attention in the areas of computational intelligence, machine learning, and natural language processing. We extract a set of features from each sentence that helps identify its importance in the document. Reading a full text every time is time consuming, and a clustering approach is useful for deciding what kinds of content a document contains. In this paper we introduce k-means clustering for natural-language text processing and word matching, and adopt a data-mining document clustering algorithm to extract meaningful information from a large set of offline documents.
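
A minimal sketch of the clustering step, assuming TF-IDF sentence features and scikit-learn (the paper's feature set is richer): sentences are grouped with k-means and the sentence nearest each centroid is kept as an extractive summary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def summarize(sentences, k=3):
    """Cluster sentences with k-means; keep the one nearest each centroid."""
    X = TfidfVectorizer().fit_transform(sentences)
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    picked = []
    for c in range(k):
        members = np.where(km.labels_ == c)[0]
        # distance of each member sentence to its cluster centroid
        d = np.linalg.norm(X[members].toarray() - km.cluster_centers_[c], axis=1)
        picked.append(members[np.argmin(d)])
    return [sentences[i] for i in sorted(picked)]

doc = ["Text summarization condenses a document.",
       "Clustering groups similar sentences together.",
       "K-means assigns each sentence to the nearest centroid.",
       "A summary keeps one representative sentence per cluster.",
       "Reading full documents is time consuming."]
print("\n".join(summarize(doc, k=3)))
```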


2016 · Vol 22 (1) · pp. 23-42
Author(s): Michael J. Jensen

This paper develops a method for analyzing the structure of campaign communications on Twitter. The structure of communication affordances creates opportunities for a horizontal organization of power within Twitter interactions. However, one cannot infer the structure of interactions as they materialize from the formal properties of the technical environment in which the communications occur. Consequently, the paper identifies three categories of empowering communication operations that can occur on Twitter: campaigns can respond to others, campaigns can retweet others, and campaigns can call for others to become involved in the campaign on their own terms. The paper operationalizes these categories in the context of the 2015 U.K. general election. To determine whether Twitter is used to empower laypersons, the profile of each account retweeted or replied to was retrieved and analyzed using natural language processing to identify whether the account belongs to a political figure, a member of the media, or some other public figure. In addition, tweets and retweets are compared with respect to the manner in which key election issues are discussed. The findings indicate that empowering uses of Twitter are fairly marginal, and that retweets use almost identical policy language to the original campaign tweets.
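
A hypothetical sketch of the profile-classification step, assuming a hand-labelled training set of account bios with the paper's three categories; every bio and label below is invented for illustration, and the paper does not specify this particular classifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented Twitter bios labelled with the paper's three categories.
bios = ["MP for Exampleshire, Labour",
        "Political correspondent at the Daily Paper",
        "Dad, runner, tea enthusiast",
        "Conservative candidate for Sampleton",
        "News editor covering Westminster",
        "Photos of my cat"]
labels = ["political", "media", "other",
          "political", "media", "other"]

# TF-IDF features feeding a Naive Bayes classifier.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(bios, labels)
print(clf.predict(["Shadow minister and MP", "freelance journalist"]))
```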


2020
Author(s): Michael Prendergast

Abstract – A Verification Cross-Reference Matrix (VCRM) is a table that depicts the verification methods for the requirements in a specification. Usually requirement labels are rows, available verification methods are columns, and an “X” in a cell indicates the use of a verification method for that requirement. Verification methods include Demonstration, Inspection, Analysis, and Test, and sometimes Certification, Similarity, and/or Analogy. VCRMs enable acquirers and stakeholders to quickly understand how a product’s requirements will be tested. Maintaining consistency of very large VCRMs can be challenging, and inconsistent verification methods can result in a large set of uncoordinated “spaghetti tests”. Natural language processing algorithms that can identify similarities between requirements offer promise in addressing this challenge. This paper applies and compares four natural language processing algorithms on the problem of automatically populating VCRMs from natural language requirements: (a) Naïve Bayesian inference, (b) Nearest Neighbor by weighted Dice similarity, (c) Nearest Neighbor with Latent Semantic Analysis similarity, and (d) an ensemble method combining the first three approaches. The VCRMs used for this study are for slot machine technical requirements derived from gaming regulations from the countries of Australia and New Zealand, the province of Nova Scotia (Canada), the state of Michigan (United States), and recommendations from the International Association of Gaming Regulators (IAGR).
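
A minimal sketch of approach (b), Nearest Neighbor by weighted Dice similarity: a new requirement inherits the verification methods of its most similar already-classified neighbour. The IDF-style token weighting and the example requirements are assumptions for illustration.

```python
import math
from collections import Counter

def weighted_dice(a, b, idf):
    """Weighted Dice similarity between two token sets."""
    wa = sum(idf.get(t, 1.0) for t in a)
    wb = sum(idf.get(t, 1.0) for t in b)
    shared = sum(idf.get(t, 1.0) for t in a & b)
    return 2.0 * shared / (wa + wb) if wa + wb else 0.0

def tokens(text):
    return set(text.lower().split())

# (requirement text, verification methods) pairs, invented for illustration.
labelled = [
    ("The machine shall log every credit transaction", {"Test"}),
    ("Cabinet labels shall be legible at one metre", {"Inspection"}),
    ("Return to player shall exceed 85 percent", {"Analysis", "Test"}),
]

# IDF-style weights so that rare tokens count more than boilerplate.
docs = [tokens(t) for t, _ in labelled]
df = Counter(t for d in docs for t in d)
idf = {t: math.log(1 + len(docs) / df[t]) for t in df}

new_req = "The machine shall record every cashless transaction"
best = max(labelled,
           key=lambda kv: weighted_dice(tokens(new_req), tokens(kv[0]), idf))
print("suggested methods:", best[1])
```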


2011 · Vol 4 (3)
Author(s): Treveur Bretaudière, Samuel Cruz-Lara, Lina María Rojas Barahona

We present our current research activities associating automatic natural language processing with serious games and virtual worlds. Several interesting scenarios have been developed: language learning, natural language generation, multilingual information, emotion detection, real-time translation, and non-intrusive access to linguistic information such as definitions or synonyms. Part of our work has contributed to the specification of the Multi Lingual Information Framework (MLIF, ISO FDIS 24616, 2011). Standardization will grant stability, interoperability, and sustainability to an important part of our research activities, in particular in the framework of representing and managing multilingual textual information.


2013 · Vol 846-847 · pp. 1239-1242
Author(s): Yang Yang, Hui Zhang, Yong Qi Wang

This paper presents our recent work towards the development of a voice calculator based on speech error correction and natural language processing. The calculator enhances the accuracy of speech recognition by classifying and summarizing recognition errors in the domain of spoken numerical calculation, constructing a Pinyin-text mapping library and replacement rules, and combining a priority correction mechanism with a memory correction mechanism for the Pinyin-text mapping. Once an expression has been correctly recognized, the calculator uses a recursive-descent parsing algorithm with synthesized-attribute computation to evaluate the final result, which it outputs through a TTS engine. The implementation of this voice calculator makes the calculator more humane and intelligent.
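
A minimal sketch of the evaluation stage, assuming a conventional arithmetic grammar; the parser's procedures return synthesized attributes (the numeric value of each subexpression), while the speech front end and TTS output are omitted.

```python
import re

# Grammar: expr   -> term (('+'|'-') term)*
#          term   -> factor (('*'|'/') factor)*
#          factor -> NUMBER | '(' expr ')'
def evaluate(text):
    toks = re.findall(r"\d+(?:\.\d+)?|[()+\-*/]", text)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def eat():
        nonlocal pos
        pos += 1
        return toks[pos - 1]

    def expr():
        value = term()                 # synthesized attribute of the subtree
        while peek() in ("+", "-"):
            if eat() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        value = factor()
        while peek() in ("*", "/"):
            if eat() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        if peek() == "(":
            eat()                      # consume '('
            value = expr()
            eat()                      # consume ')'
            return value
        return float(eat())            # NUMBER

    return expr()

print(evaluate("3 + 4 * (2 - 0.5)"))   # 9.0
```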


2021 · pp. 219256822110269
Author(s): Fabio Galbusera, Andrea Cina, Tito Bassani, Matteo Panico, Luca Maria Sconfienza

Study Design: Retrospective study. Objectives: Huge amounts of images and medical reports are being generated in radiology departments. While these datasets could potentially be employed to train artificial intelligence tools to detect findings on radiological images, the unstructured nature of the reports limits the accessibility of the information. In this study, we tested whether natural language processing (NLP) can be used to generate training data for deep learning models analyzing planar radiographs of the lumbar spine. Methods: NLP classifiers based on the Bidirectional Encoder Representations from Transformers (BERT) model, able to extract structured information from radiological reports, were developed and used to generate annotations for a large set of radiographic images of the lumbar spine (N = 10 287). Deep learning (ResNet-18) models aimed at detecting radiological findings directly from the images were then trained and tested on a set of 204 human-annotated images. Results: The NLP models had accuracies between 0.88 and 0.98 and specificities between 0.84 and 0.99; 7 out of 12 radiological findings had sensitivity >0.90. The ResNet-18 models showed performances dependent on the specific radiological findings, with sensitivities and specificities between 0.53 and 0.93. Conclusions: NLP generates valuable data with which to train deep learning models able to detect radiological findings in spine images. Despite the noisy nature of reports and NLP predictions, this approach effectively mitigates the difficulties associated with the manual annotation of large quantities of data and opens the way to the era of big data for artificial intelligence in musculoskeletal radiology.
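
A schematic sketch of the weak-labelling loop, assuming a predict_findings(report) function that stands in for the paper's BERT classifiers and returns a dict mapping each finding to 0 or 1; the ResNet-18 configuration is illustrative, not the authors' exact setup.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

N_FINDINGS = 12  # the study extracted 12 radiological findings per report

# Multi-label image model: one sigmoid output per finding.
model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, N_FINDINGS)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def training_step(images, reports, predict_findings):
    """One step: NLP-predicted findings serve as (noisy) image labels.

    images:  float tensor of shape (batch, 3, H, W)
    reports: list of report strings, aligned with the images
    """
    rows = []
    for report in reports:
        findings = predict_findings(report)   # e.g. {"scoliosis": 1, ...}
        rows.append([findings[k] for k in sorted(findings)])
    targets = torch.tensor(rows, dtype=torch.float32)
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```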


Author(s): Hima Yeldo

Abstract: Natural Language Processing (NLP) is the study of the interplay between computers and human languages. NLP has spread its applications across various fields such as email spam detection, machine translation, summarization, information extraction, and question answering. NLP comprises two parts, Natural Language Generation and Natural Language Understanding, which cover the tasks of generating and understanding text respectively.


2016 · Vol 8 · pp. BII.S38916
Author(s): Yuan Luo, Peter Szolovits

In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing to narrative clinical notes in electronic medical records (EMRs), and of efficiently retrieving such annotations subject to position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which the optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis of the basic interval tree query algorithm and show its nonoptimality when applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms and propose query reformulations and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic stabbing-max query time and solves the stabbing-interval query tasks for all of Allen's relations in logarithmic time, attaining the theoretical lower bound. Updating time is kept logarithmic and the space requirement linear. We also discuss interval management in external memory models and in higher dimensions.
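
A minimal sketch of the stand-off data model with a linear-scan stabbing query; the paper's contribution is answering such queries in logarithmic time via augmented interval trees, which this O(n) baseline deliberately does not attempt. The example text and labels are invented.

```python
from dataclasses import dataclass

@dataclass
class Annotation:
    start: int      # character offset where the annotation begins
    end: int        # character offset where it ends (exclusive)
    label: str      # annotation content, stored apart from the text

text = "Patient denies chest pain but reports shortness of breath."
annotations = [
    Annotation(15, 25, "symptom:chest pain"),
    Annotation(8, 25, "negation scope"),
    Annotation(38, 57, "symptom:dyspnea"),
]

def stabbing_query(anns, pos):
    """Return all annotations whose interval contains position `pos`."""
    return [a for a in anns if a.start <= pos < a.end]

for a in stabbing_query(annotations, 20):
    print(a.label, "->", repr(text[a.start:a.end]))
```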

