Dictionaries of Mexican Sexual Slang for NLP

Author(s):  
Roberto Villarejo-Martínez ◽  
Noé Alejandro Castro-Sánchez ◽  
Gerardo Sierra Martínez

Abstract: This paper describes the creation of two resources for the problem of recognizing double entendre and humour in Mexican Spanish: a morphological dictionary and a semantic dictionary. They were created from two sources: a corpus of albures (drawn from the book “Antología del albur”) and a Mexican slang dictionary (“El chilangonario”). The morphological dictionary consists of 410 word forms that correspond to 350 lemmas. The semantic dictionary consists of 27 synsets that are associated with lemmas of the morphological dictionary. Since both resources are based on the FreeLing library, they are easy to use in Natural Language Processing tasks. The motivation for this work comes from the need to address problems such as double entendre and computational humour. The usefulness of these disciplines has been discussed many times, and it has been shown that they have a direct impact on user interfaces and, mainly, on human-computer interaction. This work aims to encourage the scientific community to create more resources about informal language in Spanish and other languages.
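The relationship between the two dictionaries described above (word forms map to lemmas, and lemmas are grouped into synsets) can be sketched roughly as follows. This is only an illustration under stated assumptions: the entries, the identifiers, and the helper names `lemma_of` and `synsets_of` are hypothetical placeholders, and the code reflects neither the published resources nor FreeLing's own file format or API.

```python
from typing import Dict, List, Optional

# Hypothetical morphological dictionary: inflected form -> lemma.
# Placeholder strings, not entries from the published resource.
FORM_TO_LEMMA: Dict[str, str] = {
    "forma_a1": "lema_a",
    "forma_a2": "lema_a",
    "forma_b1": "lema_b",
}

# Hypothetical semantic dictionary: synset id -> lemmas sharing a sense.
SYNSETS: Dict[str, List[str]] = {
    "synset_01": ["lema_a", "lema_b"],
}


def lemma_of(form: str) -> Optional[str]:
    """Return the lemma for a word form, or None if the form is unknown."""
    return FORM_TO_LEMMA.get(form.lower())


def synsets_of(form: str) -> List[str]:
    """Return the ids of all synsets whose lemma list contains the form's lemma."""
    lemma = lemma_of(form)
    if lemma is None:
        return []
    return sorted(sid for sid, lemmas in SYNSETS.items() if lemma in lemmas)
```

Because several forms share one lemma, the form table is larger than the lemma set, which matches the 410 forms to 350 lemmas ratio reported in the abstract.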

2018 ◽  
Author(s):  
Khairil Anam ◽  
SEHMAN

Technology offers an additional way to support laboratory learning. Practitioners' differing needs and the relatively short duration of laboratory sessions have led to dissatisfaction with how practicums are carried out. Thus, an intelligent learning alternative is needed. This intelligent learning aims to provide high-quality, high-performance skills training that can help the practitioner solve problems related to practicum materials. An intelligent learning system is a learning system that handles some student instruction without any intervention from a teacher. One approach that can support the creation of an intelligent learning system is Natural Language Processing (NLP). This final project explains the creation and implementation of an intelligent learning system in the Object-Oriented Programming Computer Laboratory. The system consists of several stages: parsing, similarity, stemming, and a knowledge base, designed as an interactive, dialogue-based application between the lab participant (praktikan) and an agent. The success rate of this system in answering questions from session II lab participants is 88.75%.


2021 ◽  
Author(s):  
Masoom Raza ◽  
Aditee Patil ◽  
Mangesh Bedekar ◽  
Rashmi Phalnikar ◽  
Bhavana Tiple

Ontologies provide a framework or taxonomy for a particular domain, representing its shared knowledge, its concepts, and how these concepts relate to each other. This paper shows how ontologies can be used to compare the syllabus structures of universities. This is done by extracting the syllabus, creating an ontology that represents the syllabus, then parsing the ontology and applying Natural Language Processing to remove unwanted information. Once the appropriate ontologies are obtained, a comparative study is made on them. For convenience, the extracted syllabus is restricted to the subject “Software Engineering”. The work illustrates how ontology knowledge can be collected, managed, and processed to obtain the desired insights.


2015 ◽  
Author(s):  
Vijaykumar Yogesh Muley ◽  
Anne Hahn ◽  
Pravin Paikrao

Natural language processing continues to gain importance in a thriving scientific community that communicates its latest results at such a frequency that human readers alone cannot keep up with the most recent developments, even in a specific field. Here we summarize and compare the publishing activity of previous years on a distinct topic across several countries, addressing not only publishing frequency and history, but also stylistic characteristics that are accessible by means of natural language processing. Though there are no profound differences in sentence length or lexical diversity among countries, writing styles as characterized by part-of-speech tagging are similar among countries that share a history or an official language, or that are geographically close.
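The two surface measures mentioned above, sentence length and lexical diversity, can be sketched as follows. The abstract does not say how they were computed, so this is a minimal sketch under assumptions: lexical diversity is taken to be the type-token ratio (one common choice), sentences are split on terminal punctuation, and tokens are plain alphabetic words. All function names are illustrative.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[A-Za-z']+", text.lower())


def split_sentences(text: str) -> List[str]:
    """Split naively on runs of '.', '!' or '?' and drop empty pieces."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def mean_sentence_length(text: str) -> float:
    """Average number of word tokens per sentence (0.0 for empty text)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


def type_token_ratio(text: str) -> float:
    """Distinct tokens divided by total tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)
```

The type-token ratio depends on text length, so comparing corpora of different sizes would normally call for samples of equal length or a length-corrected measure.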


Author(s):  
Besim Kabashi

Abstract: For many applications in the field of natural language processing, a lexicon is needed. For the Albanian language, a lexicon that can be used for these purposes is presented below. The lexicon contains around 75,000 entries, including proper names such as personal, geographical and other names. Each entry includes grammatical information such as parts of speech and other specific information, e.g. inflection classes for nouns, adjectives and verbs. The lexicon is part of a morphological tool, but can also be used as an independent resource for other tasks and applications or can be adapted for them. Sources for the creation and extension of the presented lexicon include both information from traditional dictionaries, e.g. spelling dictionaries, and a balanced linguistic corpus, using corpus-driven methods and tools. The lexicon is still a work in progress, but aims to cover basic information for the most frequent tasks of natural language processing.



