Time, tense and aspect in natural language database interfaces

Most existing Natural Language Database Interfaces (NLDB) were designed to be used with database systems that provide very limited facilities for manipulating time-dependent data, and they do not support adequately temporal linguistic mechanisms (verb tenses, temporal adverbials, temporal subordinate clauses, etc.). The database community is becoming increasingly interested in temporal database systems, which are intended to store and manipulate in a principled manner information not only about the present, but also about the past and future. When interfacing to temporal databases, supporting temporal linguistic mechanisms becomes crucial.We present a framework for constructing Natural Language Interfaces for Temporal Databases (NLTDB), which draws on research in tense and aspect theories, temporal logics and temporal databases. The framework consists of a temporal intermediate representation language, called TOP, an HPSG grammar that maps a wide range of questions involving temporal mechanisms to appropriate TOP expressions, and a provably correct method for translating from TOP to TSQL2, TSQL2 being a recently proposed temporal extension of the SQL database language. This framework was employed to implement a prototype NLTDB.

Download Full-text

Mood and modality: out of theory and into the fray

Natural Language Engineering ◽

10.1017/s1351324903003279 ◽

2004 ◽

Vol 10 (1) ◽

pp. 57-89 ◽

Cited By ~ 2

Author(s):

MARJORIE MCSHANE ◽

SERGEI NIRENBURG ◽

RON ZACHARSKI

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Translation System ◽

Free Standing ◽

Indicative Conditional ◽

Tense And Aspect ◽

Language L ◽

Wide Range ◽

Value Sets

The topic of mood and modality (MOD) is a difficult aspect of language description because, among other reasons, the inventory of modal meanings is not stable across languages, moods do not map neatly from one language to another, modality may be realised morphologically or by free-standing words, and modality interacts in complex ways with other modules of the grammar, like tense and aspect. Describing MOD is especially difficult if one attempts to develop a unified approach that not only provides cross-linguistic coverage, but is also useful in practical natural language processing systems. This article discusses an approach to MOD that was developed for and implemented in the Boas Knowledge-Elicitation (KE) system. Boas elicits knowledge about any language, L, from an informant who need not be a trained linguist. That knowledge then serves as the static resources for an L-to-English translation system. The KE methodology used throughout Boas is driven by a resident inventory of parameters, value sets, and means of their realisation for a wide range of language phenomena. MOD is one of those parameters, whose values are the inventory of attested and not yet attested moods (e.g. indicative, conditional, imperative), and whose realisations include flective morphology, agglutinating morphology, isolating morphology, words, phrases and constructions. Developing the MOD elicitation procedures for Boas amounted to wedding the extensive theoretical and descriptive research on MOD with practical approaches to guiding an untrained informant through this non-trivial task. We believe that our experience in building the MOD module of Boas offers insights not only into cross-linguistic aspects of MOD that have not previously been detailed in the natural language processing literature, but also into KE methodologies that could be applied more broadly.

Download Full-text

A FORMALISM FOR REPRESENTING AND REASONING WITH LINGUISTIC INFORMATION

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s021848850200148x ◽

2002 ◽

Vol 10 (03) ◽

pp. 281-307 ◽

Cited By ~ 4

Author(s):

TRU H. CAO

Keyword(s):

Artificial Intelligence ◽

Fuzzy Logic ◽

Natural Language ◽

Functional Relation ◽

Computing With Words ◽

Conceptual Graphs ◽

Graph Language ◽

Conceptual Graph ◽

Knowledge Representation Language ◽

Representation Language

Conceptual graphs and fuzzy logic are two logical formalisms that emphasize the target of natural language, where conceptual graphs provide a structure of formulas close to that of natural language sentences while fuzzy logic provides a methodology for computing with words. This paper proposes fuzzy conceptual graphs as a knowledge representation language that combines the advantages of both the two formalisms for artificial intelligence approaching human expression and reasoning. Firstly, the conceptual graph language is extended with functional relation types for representing functional dependency, and conjunctive types for joining concepts and relations. Then fuzzy conceptual graphs are formulated as a generalization of conceptual graphs where fuzzy types and fuzzy attribute-values are used in place of crisp types and crisp attribute-values. Projection and join as basic operations for reasoning on fuzzy conceptual graphs are defined, taking into account the semantics of fuzzy set-based values.

Download Full-text

Adaptive intelligent learning approach based on visual anti-spam email model for multi-natural language

Journal of Intelligent Systems ◽

10.1515/jisys-2021-0045 ◽

2021 ◽

Vol 30 (1) ◽

pp. 774-792

Author(s):

Mazin Abed Mohammed ◽

Dheyaa Ahmed Ibrahim ◽

Akbal Omran Salman

Keyword(s):

Natural Language ◽

Naive Bayes ◽

False Negative ◽

Naïve Bayes ◽

Final Decision ◽

Learning Approach ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Wide Range

Abstract Spam electronic mails (emails) refer to harmful and unwanted commercial emails sent to corporate bodies or individuals to cause harm. Even though such mails are often used for advertising services and products, they sometimes contain links to malware or phishing hosting websites through which private information can be stolen. This study shows how the adaptive intelligent learning approach, based on the visual anti-spam model for multi-natural language, can be used to detect abnormal situations effectively. The application of this approach is for spam filtering. With adaptive intelligent learning, high performance is achieved alongside a low false detection rate. There are three main phases through which the approach functions intelligently to ascertain if an email is legitimate based on the knowledge that has been gathered previously during the course of training. The proposed approach includes two models to identify the phishing emails. The first model has proposed to identify the type of the language. New trainable model based on Naive Bayes classifier has also been proposed. The proposed model is trained on three types of languages (Arabic, English and Chinese) and the trained model has used to identify the language type and use the label for the next model. The second model has been built by using two classes (phishing and normal email for each language) as a training data. The second trained model (Naive Bayes classifier) has been applied to identify the phishing emails as a final decision for the proposed approach. The proposed strategy is implemented using the Java environments and JADE agent platform. The testing of the performance of the AIA learning model involved the use of a dataset that is made up of 2,000 emails, and the results proved the efficiency of the model in accurately detecting and filtering a wide range of spam emails. The results of our study suggest that the Naive Bayes classifier performed ideally when tested on a database that has the biggest estimate (having a general accuracy of 98.4%, false positive rate of 0.08%, and false negative rate of 2.90%). This indicates that our Naive Bayes classifier algorithm will work viably on the off chance, connected to a real-world database, which is more common but not the largest.

Download Full-text

On some arguable claims in B. Shneiderman's evaluation of natural language interaction with database systems

ACM SIGMOD Record ◽

10.1145/984514.984520 ◽

1982 ◽

Vol 13 (1) ◽

pp. 92-97 ◽

Cited By ~ 3

Author(s):

Neil C. Rowe

Keyword(s):

Natural Language ◽

Database Systems ◽

Natural Language Interaction

Download Full-text

Cache-efficient sweeping-based interval joins for extended Allen relation predicates

The VLDB Journal ◽

10.1007/s00778-020-00650-5 ◽

2021 ◽

Author(s):

Danila Piatov ◽

Sven Helmer ◽

Anton Dignös ◽

Fabio Persia

Keyword(s):

Data Structure ◽

Experimental Evaluation ◽

State Of The Art ◽

Temporal Databases ◽

Access Method ◽

Wide Range ◽

Interval Relation ◽

Cache Efficient ◽

Join Algorithms ◽

Better Than

AbstractWe develop a family of efficient plane-sweeping interval join algorithms for evaluating a wide range of interval predicates such as Allen’s relationships and parameterized relationships. Our technique is based on a framework, components of which can be flexibly combined in different manners to support the required interval relation. In temporal databases, our algorithms can exploit a well-known and flexible access method, the Timeline Index, thus expanding the set of operations it supports even further. Additionally, employing a compact data structure, the gapless hash map, we utilize the CPU cache efficiently. In an experimental evaluation, we show that our approach is several times faster and scales better than state-of-the-art techniques, while being much better suited for real-time event processing.

Download Full-text

On implementing temporal coalescing in temporal databases implemented on top of relational database systems

Proceedings of the International Conference on Advances in Computing, Communication and Control - ICAC3 '09 ◽

10.1145/1523103.1523135 ◽

2009 ◽

Author(s):

K. Unnikrishnan ◽

K. V Pramod

Keyword(s):

Relational Database ◽

Database Systems ◽

Temporal Databases ◽

Relational Database Systems

Download Full-text

A mechanism for natural language access to database systems

ACM SIGART Bulletin ◽

10.1145/1056663.1056683 ◽

1982 ◽

pp. 44-45

Author(s):

Lawrence J. Mazlack ◽

Richard A. Feinauer ◽

William E. Leigh ◽

Neomi Paz

Keyword(s):

Natural Language ◽

Database Systems ◽

Language Access

Download Full-text

Efficient Discovery of Periodic-Frequent Patterns in Columnar Temporal Databases

Electronics ◽

10.3390/electronics10121478 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1478

Author(s):

Penugonda Ravikumar ◽

Palla Likhitha ◽

Bathala Venus Vikranth Raj ◽

Rage Uday Kiran ◽

Yutaka Watanobe ◽

...

Keyword(s):

Real World ◽

Pattern Mining ◽

Transportation Network ◽

Frequent Pattern Mining ◽

Temporal Databases ◽

Frequent Pattern ◽

Frequent Patterns ◽

Temporal Database ◽

First Case

Discovering periodic-frequent patterns in temporal databases is a challenging problem of great importance in many real-world applications. Though several algorithms were described in the literature to tackle the problem of periodic-frequent pattern mining, most of these algorithms use the traditional horizontal (or row) database layout, that is, either they need to scan the database several times or do not allow asynchronous computation of periodic-frequent patterns. As a result, this kind of database layout makes the algorithms for discovering periodic-frequent patterns both time and memory inefficient. One cannot ignore the importance of mining the data stored in a vertical (or columnar) database layout. It is because real-world big data is widely stored in columnar database layout. With this motivation, this paper proposes an efficient algorithm, Periodic Frequent-Equivalence CLass Transformation (PF-ECLAT), to find periodic-frequent patterns in a columnar temporal database. Experimental results on sparse and dense real-world and synthetic databases demonstrate that PF-ECLAT is memory and runtime efficient and highly scalable. Finally, we demonstrate the usefulness of PF-ECLAT with two case studies. In the first case study, we have employed our algorithm to identify the geographical areas in which people were periodically exposed to harmful levels of air pollution in Japan. In the second case study, we have utilized our algorithm to discover the set of road segments in which congestion was regularly observed in a transportation network.

Download Full-text

Natural language processing and knowledge representation: Language for knowledge and knowledge for language

Computers & Mathematics with Applications ◽

10.1016/s0898-1221(01)90011-2 ◽

2001 ◽

Vol 41 (3-4) ◽

pp. 536-537

Keyword(s):

Natural Language Processing ◽

Knowledge Representation ◽

Natural Language ◽

Language Processing ◽

Knowledge Representation Language ◽

Representation Language

Download Full-text

Application of natural language processing methods to extract coded data from administrative data held in the Scottish Prescribing Information System

International Journal for Population Data Science ◽

10.23889/ijpds.v1i1.263 ◽

2017 ◽

Vol 1 (1) ◽

Author(s):

Clifford Nangle ◽

Stuart McTaggart ◽

Margaret MacLeod ◽

Jackie Caldwell ◽

Marion Bennie

Keyword(s):

Information System ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Drug Exposure ◽

Drug Dose ◽

Free Text ◽

Wide Range ◽

The Impact ◽

Prescribing Information

ABSTRACT ObjectivesThe Prescribing Information System (PIS) datamart, hosted by NHS National Services Scotland receives around 90 million electronic prescription messages per year from GP practices across Scotland. Prescription messages contain information including drug name, quantity and strength stored as coded, machine readable, data while prescription dose instructions are unstructured free text and difficult to interpret and analyse in volume. The aim, using Natural Language Processing (NLP), was to extract drug dose amount, unit and frequency metadata from freely typed text in dose instructions to support calculating the intended number of days’ treatment. This then allows comparison with actual prescription frequency, treatment adherence and the impact upon prescribing safety and effectiveness. ApproachAn NLP algorithm was developed using the Ciao implementation of Prolog to extract dose amount, unit and frequency metadata from dose instructions held in the PIS datamart for drugs used in the treatment of gastrointestinal, cardiovascular and respiratory disease. Accuracy estimates were obtained by randomly sampling 0.1% of the distinct dose instructions from source records, comparing these with metadata extracted by the algorithm and an iterative approach was used to modify the algorithm to increase accuracy and coverage. ResultsThe NLP algorithm was applied to 39,943,465 prescription instructions issued in 2014, consisting of 575,340 distinct dose instructions. For drugs used in the gastrointestinal, cardiovascular and respiratory systems (i.e. chapters 1, 2 and 3 of the British National Formulary (BNF)) the NLP algorithm successfully extracted drug dose amount, unit and frequency metadata from 95.1%, 98.5% and 97.4% of prescriptions respectively. However, instructions containing terms such as ‘as directed’ or ‘as required’ reduce the usability of the metadata by making it difficult to calculate the total dose intended for a specific time period as 7.9%, 0.9% and 27.9% of dose instructions contained terms meaning ‘as required’ while 3.2%, 3.7% and 4.0% contained terms meaning ‘as directed’, for drugs used in BNF chapters 1, 2 and 3 respectively. ConclusionThe NLP algorithm developed can extract dose, unit and frequency metadata from text found in prescriptions issued to treat a wide range of conditions and this information may be used to support calculating treatment durations, medicines adherence and cumulative drug exposure. The presence of terms such as ‘as required’ and ‘as directed’ has a negative impact on the usability of the metadata and further work is required to determine the level of impact this has on calculating treatment durations and cumulative drug exposure.

Download Full-text