Natural language processing‐based lexical meaning analysis: an application of in‐network caching‐oriented translation system

2021 ◽  
Author(s):  
Bozhou Wang ◽  
Chunmei Jia
2004 ◽  
Vol 10 (1) ◽  
pp. 57-89 ◽  
Author(s):  
MARJORIE MCSHANE ◽  
SERGEI NIRENBURG ◽  
RON ZACHARSKI

The topic of mood and modality (MOD) is a difficult aspect of language description because, among other reasons, the inventory of modal meanings is not stable across languages, moods do not map neatly from one language to another, modality may be realised morphologically or by free-standing words, and modality interacts in complex ways with other modules of the grammar, like tense and aspect. Describing MOD is especially difficult if one attempts to develop a unified approach that not only provides cross-linguistic coverage, but is also useful in practical natural language processing systems. This article discusses an approach to MOD that was developed for and implemented in the Boas Knowledge-Elicitation (KE) system. Boas elicits knowledge about any language, L, from an informant who need not be a trained linguist. That knowledge then serves as the static resources for an L-to-English translation system. The KE methodology used throughout Boas is driven by a resident inventory of parameters, value sets, and means of their realisation for a wide range of language phenomena. MOD is one of those parameters, whose values are the inventory of attested and not yet attested moods (e.g. indicative, conditional, imperative), and whose realisations include flective morphology, agglutinating morphology, isolating morphology, words, phrases and constructions. Developing the MOD elicitation procedures for Boas amounted to wedding the extensive theoretical and descriptive research on MOD with practical approaches to guiding an untrained informant through this non-trivial task. We believe that our experience in building the MOD module of Boas offers insights not only into cross-linguistic aspects of MOD that have not previously been detailed in the natural language processing literature, but also into KE methodologies that could be applied more broadly.


1992 ◽  
Vol 01 (02) ◽  
pp. 229-277 ◽  
Author(s):  
MICHAEL MCCORD ◽  
ARENDSE BERNTH ◽  
SHALOM LAPPIN ◽  
WLODEK ZADROZNY

This paper contains brief descriptions of the latest form of Slot Grammar and four natural language processing systems developed in this framework. Slot Grammar is a lexicalist, dependency-oriented grammatical system, based on the systematic expression of linguistic rules and data in terms of slots (essentially grammatical relations) and slot frames. The exposition focuses on the kinds of analysis structures produced by the Slot Grammar parser. These structures offer convenient input to post-syntactic processing (in particular to the applications dealt with in the paper); they contain in a single structure a useful combination of surface structure and logical form. The four applications discussed are: (1) An anaphora resolution system dealing with both NP anaphora and VP anaphora (and combinations of the two). (2) A meaning postulate based inference system for natural language, in which inference is done directly with Slot Grammar analysis structures. (3) A new transfer system for the machine translation system LMT, based on a new representation for Slot Grammar analyses which allows more convenient tree exploration. (4) A parser of "constructions", viewed as an extension of the core grammar allowing one to handle some linguistic phenomena that are often labeled "extragrammatical", and to assign a semantics to them.


Terminology ◽  
1994 ◽  
Vol 1 (1) ◽  
pp. 61-95 ◽  
Author(s):  
Blaise Nkwenti-Azeh

Special-language term formation is characterised, inter alia, by the frequent reuse of certain lexical items in the formation of new syntagmatic units and by conceptually motivated restrictions on the position which certain elements can occupy within a compound term. This paper describes how the positional and combinational features of the terminology of a given domain can be identified from relevant existing term lists and used as part of a corpus-based, automatic term-identification strategy within a natural-language processing (e.g., machine-translation) system. The methodology described is exemplified and supported with data from the field of satellite communications.
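As a rough illustration of the idea in this abstract, the sketch below derives a positional profile from an existing term list and uses it to propose candidate terms in running text. It is a minimal sketch under stated assumptions, not the author's method: the function names (`slot_of`, `positional_profile`, `candidate_terms`), the three-slot scheme (initial, medial, final), and the toy term list are all hypothetical and are not taken from the paper's satellite-communications data.

```python
from collections import Counter, defaultdict


def slot_of(index, length):
    """Name the position of a word inside a multiword term (assumed 3-slot scheme)."""
    if index == length - 1:
        return "final"
    if index == 0:
        return "initial"
    return "medial"


def positional_profile(terms):
    """Count how often each word occurs in each slot across a list of known terms."""
    profile = defaultdict(Counter)
    for term in terms:
        words = term.lower().split()
        if len(words) < 2:
            continue  # single-word terms carry no positional information
        for i, word in enumerate(words):
            profile[word][slot_of(i, len(words))] += 1
    return profile


def candidate_terms(tokens, profile, max_len=3):
    """Propose word sequences in which every word was attested in the same slot."""
    found = []
    lowered = [t.lower() for t in tokens]
    for start in range(len(lowered)):
        for length in range(2, max_len + 1):
            span = lowered[start:start + length]
            if len(span) < length:
                break  # ran past the end of the text
            if all(
                word in profile and profile[word][slot_of(i, length)] > 0
                for i, word in enumerate(span)
            ):
                found.append(" ".join(span))
    return found


# Hypothetical term list, for illustration only.
known_terms = ["earth station", "ground station", "satellite antenna",
               "earth station antenna"]
profile = positional_profile(known_terms)
text = "the ground station antenna tracks the satellite".split()
print(candidate_terms(text, profile))
```

Here "station antenna" is not proposed because "station" was never attested in initial position, which mirrors the kind of positional restriction the abstract describes. A real system would need frequency thresholds and a much larger term list.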


2013 ◽  
Vol 8 (3) ◽  
pp. 908-912 ◽  
Author(s):  
Sumita Rani ◽  
Dr. Vijay Luxmi

Machine translation is an important area of natural language processing. A direct MT system exploits the syntactic and vocabulary similarities between more or less closely related natural languages. The relation between two or more languages is based upon their common parent language. The similarity between Punjabi and Hindi is due to their parent language, Sanskrit. Punjabi and Hindi are closely related languages with many similarities in syntax and vocabulary. In the present paper, a direct machine translation system from Punjabi to Hindi is developed and its output is evaluated to assess the suitability of the system.


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing this gap in knowledge, this paper evaluates the application of well-established machine translation methods to one heavily under-resourced indigenous East African language, Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.
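For readers unfamiliar with the BLEU metric mentioned above, the following is a simplified sentence-level sketch: clipped (modified) n-gram precision combined by a geometric mean and scaled by a brevity penalty. It rests on several assumptions: a single reference, n-grams up to length 2, and no smoothing. The abstract does not say which BLEU implementation or settings the authors used, and published scores are normally computed at corpus level with n-grams up to length 4, so this is not a reproduction of their evaluation. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as many times as it appears in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=2):
    """Simplified single-reference BLEU (assumption: max_n=2, no smoothing)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # geometric mean is zero if any precision is zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


reference = "the cat sat down".split()
print(bleu("the cat sat".split(), reference))   # short candidate is penalised
print(bleu("the the the".split(), reference))   # repeated words are clipped
```

The clipping step is what stops a candidate from scoring well by repeating a common word, and the brevity penalty is what stops it from scoring well by being very short.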


Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1243-P
Author(s):  
JIANMIN WU ◽  
FRITHA J. MORRISON ◽  
ZHENXIANG ZHAO ◽  
XUANYAO HE ◽  
MARIA SHUBINA ◽  
...  

Author(s):  
Pamela Rogalski ◽  
Eric Mikulin ◽  
Deborah Tihanyi

In 2018, we overheard many CEEA-ACEG members stating that they have "found their people"; this led us to wonder what makes this evolving community unique. Using cultural historical activity theory to view the proceedings of CEEA-ACEG 2004-2018 in comparison with the geographically and intellectually adjacent ASEE, we used both machine-driven (Natural Language Processing, NLP) and human-driven (literature review of the proceedings) methods. Here, we hoped to build on surveys, most recently by Nelson and Brennan (2018), to understand, beyond what members say about themselves, what makes the CEEA-ACEG community distinct, where it has come from, and where it is going. Engaging in the two methods of data collection quickly diverted our focus from an analysis of the data themselves to the characteristics of the data in terms of cultural historical activity theory. Our preliminary findings point to some unique characteristics of machine- and human-driven results, with the former, as might be expected, focusing on the micro-level (words and language patterns) and the latter on the macro-level (ideas and concepts). NLP generated data within the realms of "community" and "division of labour" while the review of proceedings centred on "subject" and "object"; both found "instruments," although NLP did so with greater granularity. With this new understanding of the relative strengths of each method, we have a revised framework for addressing our original question.

