Automatic NLP for Competitive Intelligence

Author(s):
Christian Aranha
Emmanuel Passos

This chapter integrates elements from Natural Language Processing, Information Retrieval, Data Mining and Text Mining to support competitive intelligence (CI). It shows how text mining algorithms can support three important CI functionalities: filtering, event alerts and search. Each of these can be mapped to a different pipeline of NLP tasks. The chapter goes in depth into NLP techniques such as spelling correction, stemming, augmenting, normalization, entity recognition, entity classification, acronym handling and co-reference resolution. Each technique must be applied at a specific moment to do a specific job, and all of these jobs are integrated into a whole system, 'assembled' in a manner specific to each application. A better understanding of the NLP theory provided herein will result in a better 'assembly'.
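As a hedged illustration of such an 'assembly' (not the chapter's actual code), the sketch below chains toy normalization and stemming stand-ins into one pipeline; `normalize`, `simple_stem`, and `pipeline` are invented names:

```python
import re

def normalize(text):
    # Normalization step: lowercase and collapse whitespace.
    return re.sub(r"\s+", " ", text.lower()).strip()

def simple_stem(word):
    # Toy suffix-stripping stemmer, standing in for a real stemming step.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def pipeline(text):
    # Assemble the tasks into one pipeline, as a CI filtering step might.
    tokens = normalize(text).split(" ")
    return [simple_stem(t) for t in tokens]

print(pipeline("Filtering  Alerts and Searching"))
# → ['filter', 'alert', 'and', 'search']
```

A real assembly would swap each stand-in for a production component (e.g. a dictionary-based spelling corrector before stemming), which is exactly the per-application choice the chapter discusses.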

2019
Vol 53 (2)
pp. 3-10
Author(s):
Muthu Kumar Chandrasekaran
Philipp Mayr

The 4th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated different paper sessions and the 5th edition of the CL-SciSumm Shared Task.


Author(s):
Saravanakumar Kandasamy
Aswani Kumar Cherukuri

Semantic similarity quantification between concepts is one of the inevitable parts of domains like Natural Language Processing, Information Retrieval, Question Answering, etc., needed to understand texts and their relationships better. Over the last few decades, many measures have been proposed incorporating various corpus-based and knowledge-based resources. WordNet and Wikipedia are two such knowledge-based resources. The contribution of WordNet to the aforementioned domains is enormous due to its richness in defining a word and all of its relationships with others. In this paper, we propose an approach to quantify the similarity between concepts that exploits the synsets and the gloss definitions of different concepts using WordNet. Our method considers the gloss definitions, the contextual words that help define a word, the synsets of contextual words, and the confidence of occurrence of a word in another word's definition when calculating the similarity. Evaluation on different gold-standard benchmark datasets shows the efficiency of our system in comparison with other existing taxonomical and definitional measures.
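A minimal sketch of the gloss-based idea, assuming invented glosses in place of real WordNet definitions; this Lesk-style word overlap omits the paper's other ingredients (synsets, contextual words, occurrence confidence):

```python
def gloss_overlap(gloss_a, gloss_b):
    # Jaccard overlap between the content words of two gloss definitions.
    a, b = set(gloss_a.lower().split()), set(gloss_b.lower().split())
    if not (a and b):
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical glosses standing in for WordNet definitions.
car_gloss = "a motor vehicle with four wheels"
auto_gloss = "a motor vehicle propelled by an engine"
print(round(gloss_overlap(car_gloss, auto_gloss), 3))
# → 0.3
```

In practice the glosses would come from a WordNet API (e.g. NLTK's corpus reader) and the overlap would be weighted by the paper's confidence measure.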


2020
pp. 1686-1704
Author(s):
Emna Hkiri
Souheyl Mallat
Mounir Zrigui

The event extraction task consists of detecting and classifying events within open-domain text. It is very new for the Arabic language, whereas it has attained maturity for languages such as English and French. Event extraction has also been proved to help Natural Language Processing tasks such as Information Retrieval, Question Answering, text mining, and machine translation obtain higher performance. In this article, we present an ongoing effort to build a system for event extraction from Arabic texts using the GATE platform and other tools.


2015
Vol 7 (1)
Author(s):
Carla Abreu
Jorge Teixeira
Eugénio Oliveira

This work aims at defining and evaluating different techniques to automatically build temporal news sequences. The proposed approach is composed of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. The approach is based on Natural Language Processing, Information Extraction, Named Entity Recognition and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% for news sequence creation.
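Near-duplicate document detection, the first step above, is commonly done with character shingling and Jaccard similarity; the sketch below (with made-up headlines) illustrates that general technique, not necessarily the authors' exact implementation:

```python
def shingles(text, k=3):
    # Character k-shingles; near-duplicate documents share most shingles.
    t = " ".join(text.lower().split())
    return {t[i:i + k] for i in range(len(t) - k + 1)}

def jaccard(a, b):
    # Jaccard similarity between two shingle sets.
    return len(a & b) / len(a | b) if a | b else 1.0

# Invented headlines: same story, one extra word.
d1 = "Prime minister announces new budget plan"
d2 = "Prime minister announces the new budget plan"
print(jaccard(shingles(d1), shingles(d2)) > 0.7)
```

Documents whose similarity exceeds a tuned threshold (0.7 here, an assumption) would be collapsed before keyword extraction and sequence creation.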


2021
Author(s):
Xuan Qin
Xinzhi Yao
Jingbo Xia

BACKGROUND: Natural language processing has long been applied in various applications for biomedical knowledge inference and discovery. Enrichment analysis based on named entity recognition is a classic application for inferring enriched associations in terms of specific biomedical entities such as genes, chemicals, and mutations.
OBJECTIVE: The aim of this study was to investigate the effect of pathway enrichment evaluation with respect to biomedical text-mining results and to develop a novel metric to quantify that effect.
METHODS: Four biomedical text-mining methods were selected to represent natural language processing methods for drug-related gene mining. Subsequently, a pathway enrichment experiment was performed using the mined genes, and a series of inverse pathway frequency (IPF) metrics was proposed to evaluate the effect of pathway enrichment. Thereafter, 7 IPF metrics and traditional P value metrics were compared in simulation experiments to test the robustness of the proposed metrics.
RESULTS: The IPF metrics were evaluated in a case study of a rapamycin-related gene set. By applying the best IPF metric in a pathway enrichment simulation test, a novel discovery of the drug efficacy of rapamycin for breast cancer was replicated from data chosen prior to the year 2000. Our findings show the effectiveness of the best IPF metric in support of knowledge discovery in new drug use. Further, the mechanism underlying the drug-disease association was visualized with Cytoscape.
CONCLUSIONS: The results of this study suggest the effectiveness of the proposed IPF metrics in pathway enrichment evaluation as well as their application in drug use discovery.
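The abstract does not give the seven IPF formulas, but the name suggests an IDF-style weighting; everything below (the `ipf` function, the pathway, the gene sets) is a hypothetical sketch of one plausible variant, not the authors' definition:

```python
import math

def ipf(pathway_genes, mined_gene_sets):
    # IDF-style inverse pathway frequency: a pathway hit by many mined
    # gene sets is down-weighted as uninformative. One plausible shape;
    # the paper proposes and compares seven IPF metrics.
    hits = sum(1 for genes in mined_gene_sets if pathway_genes & genes)
    return math.log(len(mined_gene_sets) / (1 + hits))

# Hypothetical pathway and mined gene sets for illustration only.
mtor_pathway = {"MTOR", "RPS6KB1", "AKT1"}
gene_sets = [{"MTOR", "TP53"}, {"BRCA1"}, {"EGFR"}, {"KRAS"}]
print(round(ipf(mtor_pathway, gene_sets), 3))
# → 0.693  (one hit out of four sets: log(4 / 2))
```

Under this shape, a ubiquitously enriched pathway (hit by every gene set) scores near zero, which is the down-weighting effect an IPF metric is meant to capture.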


Author(s):
Isabella Gagliardi
Maria Teresa Artese

Keyword/keyphrase extraction is an important research activity in text mining, natural language processing, and information retrieval. A large number of algorithms, divided into supervised and unsupervised methods, have been designed and developed to solve the problem of automatic keyphrase extraction. The aim of the chapter is to critically discuss unsupervised automatic keyphrase extraction algorithms, analyzing their characteristics in depth. The methods presented are tested on different datasets, presenting in detail the data, the algorithms, and the different options tested in the runs. Moreover, most studies and experiments have been conducted on texts in English, while there are few experiments concerning other languages, such as Italian. Particular attention is paid to evaluating the results of the methods in two different languages, English and Italian.
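As a baseline illustration of unsupervised extraction (far simpler than the graph-based or statistical methods such chapters survey, e.g. TextRank, RAKE, YAKE), a frequency-plus-stopword sketch might look like this; the stopword list and example text are invented:

```python
from collections import Counter

# Tiny invented stopword list; real systems use language-specific lists.
STOPWORDS = {"the", "of", "and", "a", "in", "to", "is", "for"}

def keyphrases(text, top=3):
    # Minimal frequency-based unsupervised extractor: count non-stopword
    # content words and return the most frequent ones as keyphrases.
    words = [w.strip(".,").lower() for w in text.split()]
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [w for w, _ in counts.most_common(top)]

text = ("keyphrase extraction ranks candidate words and "
        "keyphrase extraction is unsupervised")
print(keyphrases(text))
```

Swapping the stopword list per language is exactly where English-vs-Italian evaluation starts to differ, which motivates the chapter's bilingual comparison.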


2017
Vol 10 (13)
pp. 365
Author(s):
Prafful Nath Mathur
Abhishek Dixit
Sakkaravarthi Ramanathan

This work implements a novel approach to recommending jobs and colleges based on the résumés of freshly graduated students. Job postings are crawled from the web using a web crawler and stored in a customized database. College lists are also retrieved for post-graduation streams and stored in a database. The student's résumé is stored and parsed using natural language processing methods to form a résumé model. Text mining algorithms are applied to this model to extract useful information (i.e., degree, technical skills, extracurricular skills, current location, and hobbies). This information is used to suggest matching jobs and colleges to the candidate.
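A minimal sketch of the skill-extraction step, assuming a tiny hypothetical skill lexicon in place of the paper's full NLP-based résumé parsing; `SKILLS` and `extract_skills` are invented names:

```python
# Hypothetical skill lexicon; a real system would use a curated taxonomy.
SKILLS = {"python", "java", "sql"}

def extract_skills(resume_text):
    # Match résumé tokens against the lexicon to populate one field
    # (technical skills) of the résumé model.
    tokens = {t.strip(".,()").lower() for t in resume_text.split()}
    return sorted(SKILLS & tokens)

print(extract_skills("Experienced in Python, SQL and cloud tooling."))
# → ['python', 'sql']
```

The extracted fields would then be matched against the crawled job and college databases to produce the recommendations.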


Author(s):
Vladimir A. Kulyukin
John A. Nicholson

The advent of the World Wide Web has resulted in the creation of millions of documents containing unstructured, structured and semi-structured data. Consequently, research on structural text mining has come to the forefront of both information retrieval and natural language processing (Cardie, 1997; Freitag, 1998; Hammer, Garcia-Molina, Cho, Aranha, & Crespo, 1997; Hearst, 1992; Hsu & Chang, 1999; Jacquemin & Bush, 2000; Kushmerick, Weld, & Doorenbos, 1997). Knowledge of how information is organized and structured in texts can be of significant assistance to information systems that use documents as their knowledge bases (Appelt, 1999). In particular, such knowledge is of use to information retrieval systems (Salton & McGill, 1983) that retrieve documents in response to user queries and to systems that use texts to construct domain-specific ontologies or thesauri (Ruge, 1997).

