Handbook of Research on Natural Language Processing and Smart Service Systems - Advances in Computational Intelligence and Robotics
Latest Publications


TOTAL DOCUMENTS: 22 (FIVE YEARS: 22)

H-INDEX: 0 (FIVE YEARS: 0)

Published By IGI Global

ISBN: 9781799847304, 9781799847311

Author(s):  
Karina Castro-Pérez ◽  
José Luis Sánchez-Cervantes ◽  
María del Pilar Salas-Zárate ◽  
Maritza Bustos-López ◽  
Lisbeth Rodríguez-Mazahua

In recent years, the application of opinion mining has grown alongside the boom of social media and blogs on the web. These sources generate a large volume of unstructured data, so manual review is not feasible. For this reason, it has become necessary to apply web scraping and opinion mining techniques, two primary processes that help to obtain and summarize the data. Among its various areas of application, opinion mining stands out for its essential contribution to healthcare, especially pharmacovigilance, because it allows finding adverse drug events omitted by pharmaceutical companies. This chapter proposes a hybrid approach that combines semantics and machine learning in an opinion-mining analysis system, applying natural language processing techniques to detect the polarity of opinions about drugs for chronic-degenerative diseases, as found in blogs and specialized websites in the Spanish language.
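
As an illustration only (none of the chapter's actual resources, lexicons, or models are reproduced here), a hybrid polarity classifier of the kind described above might combine a small Spanish polarity lexicon (the semantic component) with TF-IDF features and a linear model (the machine learning component). Every opinion, lexicon entry, and parameter in this sketch is hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical Spanish drug-opinion snippets with polarity labels (1 = positive, 0 = negative).
opinions = [
    "este medicamento me ayudó mucho con el dolor",
    "me provocó náuseas y mareos terribles",
    "excelente control de la glucosa sin efectos secundarios",
    "dejé de tomarlo porque empeoró mi presión",
]
labels = np.array([1, 0, 1, 0])

# Tiny illustrative polarity lexicon standing in for the semantic component.
lexicon = {"ayudó": 1, "excelente": 1, "náuseas": -1, "mareos": -1, "empeoró": -1}

def lexicon_score(text):
    # Sum of lexicon polarities for the tokens in the text.
    return sum(lexicon.get(tok, 0) for tok in text.lower().split())

vectorizer = TfidfVectorizer()
X_tfidf = vectorizer.fit_transform(opinions).toarray()
X = np.hstack([X_tfidf, [[lexicon_score(t)] for t in opinions]])

clf = LogisticRegression().fit(X, labels)

new = "me causó mareos toda la semana"
x_new = np.hstack([vectorizer.transform([new]).toarray(), [[lexicon_score(new)]]])
print(clf.predict(x_new))  # likely [0] (negative) on this toy data
```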


Author(s):  
Joaquín Pérez Ortega ◽  
Nelva Nely Almanza Ortega ◽  
Andrea Vega Villalobos ◽  
Marco A. Aguirre L. ◽  
Crispín Zavala Díaz ◽  
...  

In recent years, the amount of natural language text in digital format has increased dramatically. To obtain useful information from such a large volume of data, new specialized techniques and efficient algorithms are required. Text mining consists of extracting meaningful patterns from texts; one of its basic approaches is clustering. The most widely used clustering algorithm is k-means. This chapter proposes an improvement to the k-means algorithm in the convergence step: the process stops whenever the number of objects that change their assigned cluster in the current iteration is larger than the number that changed in the previous iteration. Experimental results showed a reduction in execution time of up to 93%. It is remarkable that, in general, better results are obtained as the volume of text increases, particularly for texts within big data environments.
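
As an illustration only (not the authors' implementation), the sketch below shows standard Euclidean k-means with the stopping rule described above: iteration halts either at classical convergence or as soon as the number of reassigned objects exceeds the number reassigned in the previous iteration. Function and parameter names are made up for this example.

```python
import numpy as np

def kmeans_early_stop(X, k, max_iter=100, seed=0):
    """k-means with the convergence rule described above: stop when the number
    of points changing cluster in the current iteration exceeds the number that
    changed in the previous iteration (or when no point changes at all)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    prev_changes = np.inf
    for _ in range(max_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        changes = int((new_labels != labels).sum())
        labels = new_labels
        # Update centers (keep the old center if a cluster becomes empty).
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
        if changes == 0 or changes > prev_changes:
            break  # classical convergence, or the proposed early-stop rule
        prev_changes = changes
    return labels, centers
```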


Author(s):  
Alexander Gelbukh ◽  
José A. Martínez F. ◽  
Andres Verastegui ◽  
Alberto Ochoa

In this chapter, an exhaustive parser is presented. The parser was developed to be used in a natural language interface to databases (NLIDB) project. The chapter includes a brief description of state-of-the-art NLIDBs, including the methods used and the performance of some interfaces. Some of the general problems of natural language interfaces to databases are also explained. The exhaustive parser was developed with the aim of improving the overall performance of the interface; therefore, the interface is also briefly described. The chapter also presents the drawbacks discovered during the experimental tests of the parser, which show that it is unsuitable for improving NLIDB performance.


Author(s):  
Juan Javier González-Barbosa ◽  
Juan Frausto Solís ◽  
Juan Paulo Sánchez-Hernández ◽  
Julia Patricia Sanchez-Solís

Databases and corpora are essential resources for evaluating the performance of natural language interfaces to databases (NLIDBs). The Geobase database and the Geoquery corpus (Geoquery250 and Geoquery880) are among the most commonly used. In this chapter, the authors analyze both resources to offer two enhanced resources: 1) N-Geobase, a relational database, and 2) the Geoquery270 corpus. The former was built by following the standard normalization procedure, so N-Geobase has a schema similar to that of enterprise databases. Geoquery270 consists of 270 queries selected from Geoquery880; it preserves the same kinds of natural language problems as Geoquery880 but poses more challenging issues for an NLIDB than Geoquery250. To evaluate the new resources, the authors compared the performance of an NLIDB using Geoquery270 and Geoquery250. The results indicate that Geoquery270 is the harder corpus and Geoquery250 the easier one. Consequently, this chapter offers a broader range of resources to NLIDB designers.


Author(s):  
Sanah Nashir Sayyed ◽  
Namrata Mahender C.

Summarization is the process of selecting representative data to produce a reduced version of the given data with minimal loss of information; it can be applied to text, images, videos, and speech data. The chapter deals not only with the concepts of text summarization (types, stages, issues, and criteria) but also with its applications. The two main categories of approaches generally used in text summarization (i.e., abstractive and extractive) are discussed. Abstractive techniques use linguistic methods to interpret the text; they produce understandable and semantically equivalent sentences of shorter length. Extractive techniques mostly rely on statistical methods for extracting essential sentences from the given text. In addition, the authors explore the SACAS model to exemplify the process of summarization. The SACAS system analyzed 50 stories, and its evaluation is presented in terms of a new measurement based on question-answering MOS, which is also introduced in this chapter.
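
As a purely illustrative sketch (not the SACAS system), the following shows the kind of statistical scoring used by extractive techniques: sentences are scored by the normalized frequencies of their words, and the top-scoring ones are returned in their original order. The stopword set and sentence splitter are deliberately simplistic placeholders.

```python
import re
from collections import Counter

def extractive_summary(text, n_sentences=3, stopwords=frozenset()):
    """Score sentences by normalized word frequency and keep the top-scoring ones."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = re.findall(r"\w+", text.lower())
    freq = Counter(w for w in words if w not in stopwords)
    if not freq:
        return text
    top = freq.most_common(1)[0][1]
    scored = []
    for i, sentence in enumerate(sentences):
        tokens = re.findall(r"\w+", sentence.lower())
        score = sum(freq[t] / top for t in tokens if t in freq)
        scored.append((score / (len(tokens) or 1), i))
    # Keep the n best sentences, then restore their original order.
    keep = sorted(sorted(scored, reverse=True)[:n_sentences], key=lambda pair: pair[1])
    return " ".join(sentences[i] for _, i in keep)
```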


Author(s):  
Rodolfo A. Pazos-Rangel ◽  
Gilberto Rivera ◽  
José A. Martínez F. ◽  
Juana Gaspar ◽  
Rogelio Florencia-Juárez

This chapter is an update of a previous publication. Specifically, it aims to describe the most significant advances in NLIDBs of this decade. Unlike many surveys on NLIDBs, the NLIDBs discussed in this chapter are selected according to three relevance criteria: performance (i.e., the percentage of correctly answered queries), the soundness of the experimental evaluation, and the number of citations. To this end, the chapter also includes a brief review of the most widely used performance measures and query corpora for testing NLIDBs.


Author(s):  
Rafael Jiménez ◽  
Vicente García ◽  
Abraham López ◽  
Alejandra Mendoza Carreón ◽  
Alan Ponce

The Autonomous University of Ciudad Juárez (UACJ) performs an instructor evaluation each semester to find strengths, weaknesses, and areas of opportunity in the teaching process. In this chapter, the authors show how opinion mining can be useful for labeling student comments as positive or negative. For this purpose, a database was created using real opinions obtained from five professors of the UACJ over the last four years, covering a total of 20 subjects. Natural language processing techniques were used on the database to normalize its data. Experimental results using 1-NN and Bagging classifiers show that it is possible to automatically label positive and negative comments with an accuracy of 80.13%.
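
As an illustration only (the chapter's actual features, preprocessing, and dataset are not reproduced here), a minimal scikit-learn sketch of labeling comments with 1-NN and Bagging over TF-IDF features could look like the following; the sample comments are hypothetical.

```python
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical normalized student comments and polarity labels (1 = positive, 0 = negative).
comments = [
    "explica muy bien los temas de la clase",
    "nunca llega a tiempo y no resuelve dudas",
    "excelente profesor, muy atento con los alumnos",
    "las clases son confusas y mal preparadas",
]
labels = [1, 0, 1, 0]

models = {
    "1-NN": make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=1)),
    "Bagging": make_pipeline(TfidfVectorizer(), BaggingClassifier(n_estimators=10, random_state=0)),
}

for name, model in models.items():
    model.fit(comments, labels)
    print(name, model.predict(["muy buen maestro, explica con paciencia"]))
```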


Author(s):  
Irvin Raul Lopez Contreras ◽  
Alejandra Mendoza Carreón ◽  
Jorge Rodas-Osollo ◽  
Martiza Concepción Varela

The quantity of information in the world is increasing rapidly every day. In some situations this becomes an obstacle, and text summarization addresses this kind of problem: it is used to minimize the time that people spend searching for information on the web and in large collections of digital documents. In this chapter, three algorithms are compared, all of them extractive text summarization algorithms. Popular libraries that influence the performance of this kind of algorithm were used. It was necessary to configure and modify these methods so that they work for the Spanish language instead of the language they were originally designed for. The authors use metrics found in the literature to evaluate the quality and performance of these algorithms.
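
The metrics themselves are not listed in this abstract, but n-gram overlap measures of the ROUGE family are a common choice in the extractive summarization literature. As a hedged illustration (not necessarily one of the metrics the authors used), a minimal ROUGE-1 recall computation could look like this:

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: fraction of reference unigrams covered by the candidate summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / max(sum(ref.values()), 1)

# Toy example with made-up Spanish summaries.
print(rouge1_recall("el resumen generado automáticamente",
                    "el resumen de referencia generado por una persona"))
```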


Author(s):  
Carlos Manuel Ramirez López ◽  
Martín Montes Rivera ◽  
Alberto Ochoa ◽  
Julio César Ponce Gallegos ◽  
José Eder Guzmán Mendoza

This research presents an application of Empirical Bayesian Kriging, a geostatistical interpolation method. The case study concerns suicide prevention. The dataset is composed of more than one million records obtained from the report database of the Emergency Service 911 of the Mexican state of Aguascalientes. The purpose is to obtain prediction surfaces, probability surfaces, and standard error of prediction surfaces for completed suicide cases. Here, the variations in the environment of suicide cases are relative to and dependent on economic, social, and cultural phenomena.
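
Empirical Bayesian Kriging automates variogram estimation through repeated simulations and is typically run in GIS software rather than written from scratch. As a rough stand-in for the interpolation step only (not the chapter's workflow), the sketch below implements plain ordinary kriging with a fixed spherical variogram, showing how a prediction surface and its kriging variance are obtained from point observations. All parameter values are placeholders.

```python
import numpy as np

def spherical_variogram(h, sill=1.0, rng=1.0, nugget=0.0):
    """Spherical semivariogram model; sill, range, and nugget are placeholders."""
    h = np.asarray(h, dtype=float)
    g = nugget + (sill - nugget) * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    g = np.where(h >= rng, sill, g)
    return np.where(h == 0.0, 0.0, g)

def ordinary_kriging(xy_obs, z_obs, xy_new, **vparams):
    """Predict z at new locations; returns predictions and kriging variances."""
    n = len(z_obs)
    d = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical_variogram(d, **vparams)
    A[-1, -1] = 0.0
    preds, variances = [], []
    for p in np.atleast_2d(xy_new):
        b = np.ones(n + 1)
        b[:n] = spherical_variogram(np.linalg.norm(xy_obs - p, axis=1), **vparams)
        w = np.linalg.solve(A, b)    # kriging weights plus Lagrange multiplier
        preds.append(w[:n] @ z_obs)  # interpolated value
        variances.append(w @ b)      # kriging variance (basis of the standard error surface)
    return np.array(preds), np.array(variances)
```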


Author(s):  
Alonso García ◽  
Martha Victoria González ◽  
Francisco López-Orozco ◽  
Lucero Zamora

Recent technological advances have allowed the development of numerous natural language processing applications with which users frequently interact. When interacting with this type of application, users often seek economy of words, which promotes the use of pronouns and thereby highlights the well-known anaphora problem. This chapter describes a proposal for handling pronominal anaphora in the Spanish language. A set of rules (based on the EAGLES standard) was designed to identify the referents of personal pronouns through the structure of the grammatical tags of the words. The proposed algorithm uses the online Freeling service to perform tokenization and tagging tasks. Its performance was compared with an online version of Freeling, and the proposed algorithm showed better performance.
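
As a hypothetical illustration of the rule-based idea (not the authors' rule set), the sketch below walks backward from each personal pronoun and picks the nearest preceding noun whose EAGLES-style tag agrees in gender and number. The tag strings and the character positions used for gender and number are assumptions loosely modeled on the Freeling tagset for common nouns (e.g., NCMS000) and personal pronouns; the actual rules and tags may differ.

```python
# Hypothetical tagged sentence: (word, simplified EAGLES-style tag) pairs.
# Assumed positions: nouns (N...) carry gender at index 2 and number at index 3;
# personal pronouns (PP...) carry gender at index 3 and number at index 4.
tagged = [
    ("María", "NP00000"),    # proper noun (placeholder tag)
    ("compró", "VMIS3S0"),
    ("un", "DI0MS0"),
    ("libro", "NCMS000"),    # common noun, masculine singular
    ("y", "CC"),
    ("lo", "PP3MSA00"),      # personal pronoun, masculine singular
    ("leyó", "VMIS3S0"),
]

def gender_number(tag):
    if tag.startswith("PP") and len(tag) >= 5:
        return tag[3], tag[4]
    if tag.startswith("N") and len(tag) >= 4:
        return tag[2], tag[3]
    return None

def resolve_pronouns(tagged_words):
    """Link each personal pronoun to the nearest preceding noun with matching gender and number."""
    links = {}
    for i, (word, tag) in enumerate(tagged_words):
        if not tag.startswith("PP"):
            continue
        target = gender_number(tag)
        for j in range(i - 1, -1, -1):
            cand_word, cand_tag = tagged_words[j]
            if cand_tag.startswith("N") and gender_number(cand_tag) == target:
                links[word] = cand_word
                break
    return links

print(resolve_pronouns(tagged))  # expected: {'lo': 'libro'}
```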

