Natural language processing to identify the creation and impact of new technologies in patent text: Code, data, and new measures

The existence of a touch of technology on laboratory learning becomes another alternative as a supporter of laboratory learning. Different practitioner's wishes and intensity of relatively short laboratory practice which resulted in dissatisfaction in the implementation of a practicum. Thus, an intelligent learning alternative is needed. This intelligent learning aims to provide high-quality and high-performance training skills that can assist the practitioner in solving problems related to practicum materials. The intelligent learning system is a learning system that handles some student instruction without any intervention from a teacher.Alternative learning system that can support the creation of Intelligent Learning System is by Natural Language Processing (NLP) method. This final project provides an explanation of the creation and implementation of intelligent learning systems in the Object Oriented Programming Computer Laboratory. This system consists of several stages: parsing, similarity, stemming, Knowledge Base which is designed in an interactive form between praktikan and agent based dialoge based application. The success rate of this system in answering questions from praktikan session II is 88.75%.

Download Full-text

Ontology Formation and Comparison for Syllabus Structure Using NLP

10.52458/978-93-91842-08-6-6 ◽

2021 ◽

Author(s):

Masoom Raza ◽

Aditee Patil ◽

Mangesh Bedekar ◽

Rashmi Phalnikar ◽

Bhavana Tiple

Keyword(s):

Natural Language Processing ◽

Software Engineering ◽

Natural Language ◽

Comparative Study ◽

Language Processing ◽

Shared Knowledge ◽

The Subject ◽

The Creation ◽

The Right ◽

Collection And Management

Ontologies are largely responsible for the creation of a framework or taxonomy for a particular domain which represents the shared knowledge, concepts and how these concepts are related with each other. This paper shows the usage of ontology for the comparison of a syllabus structure of universities. This is done with the extraction of the syllabus, creation of ontology for the representing syllabus, then parsing the ontology and applying Natural language processing to remove unwanted information. After getting the appropriate ontologies, a comparative study is made on them. Restrictions are made over the extracted syllabus to the subject “Software Engineering” for convenience. This depicts the collection and management of ontology knowledge and processing it in the right manner to get the desired insights.

Download Full-text

Modern Linguistic Technologies: Strategy for Teaching Translation Studies

Rupkatha Journal on Interdisciplinary Studies in Humanities ◽

10.21659/rupkatha.v13n4.65 ◽

2021 ◽

Vol 13 (4) ◽

Author(s):

Bilous O ◽

◽

Mishchenko A ◽

Datska T ◽

Ivanenko N ◽

...

Keyword(s):

Natural Language ◽

Machine Translation ◽

Language Processing ◽

New Technologies ◽

Large Data ◽

Student Autonomy ◽

Linguistic Resources ◽

Modern Computer ◽

Key Factor ◽

The Creation

How often students use IT resources is a key factor in the acquisition of skills associated to the new technologies. Strategies aimed at increasing student autonomy need to be developed and should offer resources that encourage them to make use of computing tools in class hours. The analysis of the modern linguistic technologies, concerning intellectual language processing necessary for the creation and function of the highly effective technologies of knowledge operation was considered in the paper under consideration. Computerization of the information sphere has triggered extensive search for solving the problem of the use of natural language mechanisms in automated systems of various types. One of them was creating Controlled languages based on a set of features which made machine translation more refined. Triggered by the economic demand, they are not artificial languages like Esperanto, but natural simplified languages, in terms of vocabulary, grammatical and syntactic structures. More than ever, the tasks of modern computer linguistics behold creating software for natural language processing, information retrieval in large data sets, support of technical authors in the process of creating professional texts and users of computer technology, hence creating new translation tools. Such powerful linguistic resources as corpora of texts, terminology databases and ontologies may facilitate more efficient use of modern multilingual information technology. Creating and improving all methods considered will help make the job of a translator more efficient. One of the programs, CLAT does not aim at producing machine translation, but allows technical editors to create flawless, sequential professional texts through integrated punctuation and spelling modules. Other programs under consideration are to be implemented in Ukrainian translation departments. Moreover, the databases considered in the paper enable studying of the dynamics of the linguistic system and developing areas of applied research such as terminography, terminology, automated data processing etc. Effective cooperation of developers, translators and declarative institutes in the creation of innovative linguistic technologies will promote further development of translation and applied linguistics.

Download Full-text

A lexicon of Albanian for natural language processing

Lexicographica - International Annual for Lexicography / Internationales Jahrbuch für Lexikographie ◽

10.1515/lex-2018-340112 ◽

2018 ◽

Vol 34 (1) ◽

pp. 239-248

Author(s):

Besim Kabashi

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Proper Names ◽

Specific Information ◽

Basic Information ◽

Parts Of Speech ◽

Work In Progress ◽

The Creation

AbstractFor many applications in the field of natural language processing, a lexicon is needed. For the Albanian language a lexicon that can be used for these purposes is presented below. The lexicon contains around 75,000 entries, including proper names such as personal, geographical and other names. Each entry includes grammatical information such as parts of speech and other specific information, e.g. inflection classes for nouns, adjectives and verbs. The lexicon is part of a morphological tool, but can also be used as an independent resource for other tasks and applications or can be adapted for them. Sources for the creation and the extension of the presented lexicon include both information from traditional dictionaries, e.g. spelling dictionaries, and a balanced linguistic corpus using corpus- driven methods and tools. The lexicon is still work in progress, but aims to cover basic information for most frequent tasks of natural language processing.

Download Full-text

Natural Language Processing for Requirements Engineering

ACM Computing Surveys ◽

10.1145/3444689 ◽

2021 ◽

Vol 54 (3) ◽

pp. 1-41

Author(s):

Liping Zhao ◽

Waad Alhoshan ◽

Alessio Ferrari ◽

Keletso J. Letsholo ◽

Muideen A. Ajagbe ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Requirements Engineering ◽

Language Processing ◽

New Technologies ◽

The State ◽

Mapping Study ◽

Pos Tagging ◽

Starting Point ◽

Holistic Understanding

Natural Language Processing for Requirements Engineering (NLP4RE) is an area of research and development that seeks to apply natural language processing (NLP) techniques, tools, and resources to the requirements engineering (RE) process, to support human analysts to carry out various linguistic analysis tasks on textual requirements documents, such as detecting language issues, identifying key domain concepts, and establishing requirements traceability links. This article reports on a mapping study that surveys the landscape of NLP4RE research to provide a holistic understanding of the field. Following the guidance of systematic review, the mapping study is directed by five research questions, cutting across five aspects of NLP4RE research, concerning the state of the literature, the state of empirical research, the research focus, the state of tool development, and the usage of NLP technologies. Our main results are as follows: (i) we identify a total of 404 primary studies relevant to NLP4RE, which were published over the past 36 years and from 170 different venues; (ii) most of these studies (67.08%) are solution proposals, assessed by a laboratory experiment or an example application, while only a small percentage (7%) are assessed in industrial settings; (iii) a large proportion of the studies (42.70%) focus on the requirements analysis phase, with quality defect detection as their central task and requirements specification as their commonly processed document type; (iv) 130 NLP4RE tools (i.e., RE specific NLP tools) are extracted from these studies, but only 17 of them (13.08%) are available for download; (v) 231 different NLP technologies are also identified, comprising 140 NLP techniques, 66 NLP tools, and 25 NLP resources, but most of them—particularly those novel NLP techniques and specialized tools—are used infrequently; by contrast, commonly used NLP technologies are traditional analysis techniques (e.g., POS tagging and tokenization), general-purpose tools (e.g., Stanford CoreNLP and GATE) and generic language lexicons (WordNet and British National Corpus). The mapping study not only provides a collection of the literature in NLP4RE but also, more importantly, establishes a structure to frame the existing literature through categorization, synthesis and conceptualization of the main theoretical concepts and relationships that encompass both RE and NLP aspects. Our work thus produces a conceptual framework of NLP4RE. The framework is used to identify research gaps and directions, highlight technology transfer needs, and encourage more synergies between the RE community, the NLP one, and the software and systems practitioners. Our results can be used as a starting point to frame future studies according to a well-defined terminology and can be expanded as new technologies and novel solutions emerge.

Download Full-text

A lexicon of Albanian for natural language processing

Lexicographica - International Annual for Lexicography / Internationales Jahrbuch für Lexikographie ◽

10.1515/lex-2018-0012 ◽

2019 ◽

Vol 34 (2018) ◽

pp. 239-248

Author(s):

Besim Kabashi

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Proper Names ◽

Specific Information ◽

Basic Information ◽

Parts Of Speech ◽

Work In Progress ◽

The Creation

AbstractFor many applications in the field of natural language processing, a lexicon is needed. For the Albanian language a lexicon that can be used for these purposes is presented below. The lexicon contains around 75,000 entries, including proper names such as personal, geographical and other names. Each entry includes grammatical information such as parts of speech and other specific information, e.g. inflection classes for nouns, adjectives and verbs. The lexicon is part of a morphological tool, but can also be used as an independent resource for other tasks and applications or can be adapted for them. Sources for the creation and the extension of the presented lexicon include both information from traditional dictionaries, e.g. spelling dictionaries, and a balanced linguistic corpus using corpus-driven methods and tools. The lexicon is still work in progress, but aims to cover basic information for most frequent tasks of natural language processing.

Download Full-text

Dictionaries of Mexican Sexual Slang for NLP

CLEI electronic journal ◽

10.19153/cleiej.20.1.7 ◽

2018 ◽

Author(s):

Roberto Villarejo-Martínez ◽

Noé Alejandro Castro-Sánchez ◽

Gerardo Sierra Martínez

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Human Computer Interaction ◽

Language Processing ◽

User Interfaces ◽

Scientific Community ◽

Recognition Problem ◽

Direct Impact ◽

Mexican Spanish ◽

The Creation

Abstract: In this paper the creation of two relevant resources for the double entendre and humour recognition problem in Mexican Spanish is described: a morphological dictionary and a semantic dictionary. These were created from two sources: a corpus of albures (drawn from “Antología del albur” book) and a Mexican slang dictionary (“El chilangonario”). The morphological dictionary consists of 410 forms of words that corresponds to 350 lemmas. The semantic dictionary consists of 27 synsets that are associated to lemmas of morphological dictionary. Since both resources are based on Freeling library, they are easy to implement for tasks in Natural Language Processing. The motivation for this work comes from the need to address problems such as double entendre and computational humour. The usefulness of these disciplines has been discussed many times and it has been shown that they have a direct impact on user interfaces and, mainly, in human-computer interaction. This work aims to promote that the scientific community generates more resources about informal language in Spanish and other languages. Spanish Abstract: En este artículo se describe la creación de dos recursos relevantes para el reconocimiento del doble sentido y el humor en el español mexicano: un diccionario morfológico y un diccionario semántico. Éstos fueron creados a partir de dos fuentes: un corpus de albures (extraídos del libro "Antología del albur") y un diccionario de argot mexicano ("El chilangonario"). El diccionario morfológico consiste en 410 formas de palabras que corresponden a 350 lemas. El diccionario semántico consiste en 27 synsets que están asociados a lemas del diccionario morfológico. Puesto que ambos recursos están basados en la biblioteca Freeling, son fáciles de implementar en tareas de Procesamiento del Lenguaje Natural. La motivación de este trabajo proviene de la necesidad de abordar problemas como el doble sentido y el humor computacional. La utilidad de estas disciplinas han sido discutidas muchas veces y se ha mostrado que tienen un impacto directo en las interfaces de usuario y, principalmente, en la interacción humano-computadora. Este trabajo tiene como objetivo promover que la comunidad científica genere más recursos sobre el lenguaje informal en español y otros lenguajes.

Download Full-text

Natural Language Processing and Enhanced Clinical Decision Making Radiology and VINCI

PsycEXTRA Dataset ◽

10.1037/e615572012-015 ◽

2012 ◽

Author(s):

Eliot Siegel

Keyword(s):

Decision Making ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Decision Making ◽

Clinical Decision

Download Full-text

Natural Language Processing in the Clinical Setting

PsycEXTRA Dataset ◽

10.1037/e615572012-013 ◽

2012 ◽

Author(s):

Thomas H. Payne

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Setting

Download Full-text

A Review and evaluation of Machine Translation methods for Lumasaaba

Journal of Digital Science ◽

10.33847/2686-8296.2.1_1 ◽

2020 ◽

pp. 3-17

Author(s):

Peter Nabende

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Research Area ◽

Data Driven ◽

East African ◽

Data Set ◽

African Languages ◽

Translation Methods

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to covering the current gap of knowledge, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba including both rule-based and data-driven methods. Then we apply a state of the art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.

Download Full-text