Structural Text Mining

Text Mining ◽

Language Processing ◽

World Wide ◽

Knowledge Bases ◽

Domain Specific ◽

Retrieval Systems ◽

The World ◽

Information Retrieval Systems

The advent of the World Wide Web has resulted in the creation of millions of documents containing unstructured, structured and semi-structured data. Consequently, research on structural text mining has come to the forefront of both information retrieval and natural language processing (Cardie, 1997; Freitag, 1998; Hammer, Garcia-Molina, Cho, Aranha, & Crespo, 1997; Hearst, 1992; Hsu & Chang, 1999; Jacquemin & Bush, 2000; Kushmerick, Weld, & Doorenbos, 1997). Knowledge of how information is organized and structured in texts can be of significant assistance to information systems that use documents as their knowledge bases (Appelt, 1999). In particular, such knowledge is of use to information retrieval systems (Salton & McGill, 1983) that retrieve documents in response to user queries and to systems that use texts to construct domain-specific ontologies or thesauri (Ruge, 1997).

Adding knowledge to information retrieval systems in the World Wide Web

Artificial Intelligence in Medicine - Lecture Notes in Computer Science ◽

10.1007/bfb0029483 ◽

1997 ◽

pp. 491-500

Author(s):

G. Mann ◽

M. Schubert ◽

V. Schaeffler

Keyword(s):

Information Retrieval ◽

World Wide Web ◽

World Wide ◽

Retrieval Systems ◽

The World ◽

Information Retrieval Systems

A Dialectical Approach to Search Engine Evaluation

Libri ◽

10.1515/libri-2019-0142 ◽

2020 ◽

Vol 70 (3) ◽

pp. 227-237

Author(s):

Mahdi Zeynali-Tazehkandi ◽

Mohsen Nowkarizi

Keyword(s):

Information Retrieval ◽

Information Science ◽

Library And Information Science ◽

Related Literature ◽

Philosophical Foundations ◽

Dialectical Approach ◽

Retrieval Systems ◽

The World ◽

Oriented Approach

Evaluation of information retrieval systems is a fundamental topic in Library and Information Science. The aim of this paper is to connect the system-oriented and the user-oriented approaches to relevant philosophical schools. By reviewing the related literature, it was found that the evaluation of information retrieval systems is successful if it benefits from both system-oriented and user-oriented approaches (composite). The system-oriented approach is rooted in Parmenides’ philosophy of stability (immovable) which Plato accepts and attributes to the world of forms; the user-oriented approach is rooted in Heraclitus’ flux philosophy (motion) which Plato defers and attributes to the tangible world. Thus, using Plato’s theory is a comprehensive approach for recognizing the concept of relevance. The theoretical and philosophical foundations determine the type of research methods and techniques. Therefore, Plato’s dialectical method is an appropriate composite method for evaluating information retrieval systems.

Geographic Information Retrieval and the World Wide Web: A Match Made in Electronic Space

Cartographic Perspectives ◽

10.14714/cp26.717 ◽

1997 ◽

pp. 13-26 ◽

Cited By ~ 2

Author(s):

David Johnson ◽

Myke Gluck

Keyword(s):

Information Retrieval ◽

World Wide Web ◽

Information Needs ◽

World Wide ◽

Data Retrieval ◽

Query Languages ◽

Geographic Information ◽

Geographic Information Retrieval ◽

Retrieval Systems ◽

The World

This article looks at access to geographic information through a review of information science theory and its application to the WWW. The two most common kinds of retrieval system are information retrieval and data retrieval. A retrieval system has seven elements: retrieval models, indexing, match and retrieval, relevance, order, query languages and query specification. The goal of information retrieval is to match the user's needs to the information that is in the system. Retrieval of geographic information is a combination of both information and data retrieval. Aids to effective retrieval of geographic information are: query languages that employ icons and natural language, automatic indexing of geographic information, and standardization of geographic information. One area that has seen an explosion of geographic information retrieval systems (GIRs) is the World Wide Web (WWW). The final section of this article discusses how seven WWW GIRs solve the problem of matching the user's information needs to the information in the system.

Events Automatic Extraction from Arabic Texts

10.4018/978-1-7998-0951-7.ch078 ◽

2020 ◽

pp. 1686-1704

Author(s):

Emna Hkiri ◽

Souheyl Mallat ◽

Mounir Zrigui

Keyword(s):

Information Retrieval ◽

Text Mining ◽

Machine Translation ◽

Language Processing ◽

Question Answering ◽

Arabic Language ◽

Event Extraction ◽

Mining Machine ◽

Open Domain

The event extraction task consists of determining and classifying events within open-domain text. It is very new for the Arabic language, whereas it has reached maturity for some languages such as English and French. Event extraction has also been shown to help Natural Language Processing tasks such as information retrieval, question answering, text mining and machine translation obtain higher performance. In this article, we present an ongoing effort to build a system for event extraction from Arabic texts using the GATE platform and other tools.

Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '88 ◽

Conceptual representation for knowledge bases and information retrieval systems

10.1145/62437.62497 ◽

1988 ◽

Author(s):

G. P. Zarri

Keyword(s):

Information Retrieval ◽

Knowledge Bases ◽

Conceptual Representation ◽

Retrieval Systems ◽

Intelligent Information Retrieval ◽

Intelligent Information

Bayesian approach to incorporating different types of biomedical knowledge bases into information retrieval systems for clinical decision support in precision medicine

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2019.103238 ◽

2019 ◽

Vol 98 ◽

pp. 103238 ◽

Author(s):

Saeid Balaneshinkordan ◽

Alexander Kotov

Keyword(s):

Information Retrieval ◽

Decision Support ◽

Clinical Decision Support ◽

Bayesian Approach ◽

Clinical Decision ◽

Knowledge Bases ◽

Biomedical Knowledge ◽

Retrieval Systems ◽

Different Types ◽

Information Retrieval Systems

Automatic Keyword Annotation System Using Newspapers

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2014.p0340 ◽

2014 ◽

Vol 18 (3) ◽

pp. 340-346 ◽

Author(s):

Tomoki Takada ◽

Mizuki Arai ◽

Tomohiro Takagi

Keyword(s):

Information Retrieval ◽

Language Processing ◽

High Speed ◽

Naive Bayes ◽

High Accuracy ◽

Naïve Bayes ◽

Annotation System ◽

Retrieval Systems ◽

Index Terms

Nowadays, an increasingly large amount of information exists on the web, and finding the necessary information quickly is becoming increasingly difficult for users. To solve this problem, information retrieval systems like Google and recommendation systems like that on Amazon are used. In this paper, we focus on information retrieval systems. These retrieval systems require index terms, which affect the precision of retrieval. Index terms are generally decided in one of two ways. One is to analyze a text using natural language processing and decide index terms from statistics computed over it. The other is to have a person choose document keywords as index terms. However, the latter method requires too much time and effort and becomes more impractical as information grows. Therefore, we propose the Nikkei annotator system, which is based on a model of the human brain, learns patterns of past keyword annotation and automatically outputs keywords that users prefer. The purposes of the proposed method are automating manual keyword annotation and achieving high-speed, high-accuracy keyword annotation. Experimental results showed that the proposed method is more accurate than TF-IDF and Naive Bayes in P@5 and P@10. Moreover, these results also showed that the proposed method could annotate about 19 times faster than Naive Bayes.
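
The P@5 and P@10 figures mentioned in this abstract are precision-at-k scores: the fraction of the top k suggested keywords that a human annotator also chose. The sketch below shows one plausible way to compute the metric; the keyword lists are invented for illustration and are not data from the paper, and `precision_at_k` is a hypothetical helper name.

```python
def precision_at_k(predicted, relevant, k):
    """Return the fraction of the top-k predicted items found in `relevant`."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = predicted[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Dividing by k (not by len(top_k)) penalizes systems that return fewer than k items.
    return hits / k


# Hypothetical ranked output of an annotator and hypothetical gold keywords.
predicted = ["economy", "election", "tokyo", "weather", "stocks",
             "sports", "bank", "yen", "trade", "policy"]
relevant = {"economy", "stocks", "yen", "trade"}

p_at_5 = precision_at_k(predicted, relevant, 5)
p_at_10 = precision_at_k(predicted, relevant, 10)
print(f"P@5 = {p_at_5:.2f}, P@10 = {p_at_10:.2f}")
```

Averaging these per-document scores over a test set gives the kind of summary number that lets two annotation methods be compared at a fixed cutoff.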

Large-scale image search with text for information retrieval

Journal of Innovations in Engineering Education ◽

10.3126/jiee.v4i1.35390 ◽

2021 ◽

Vol 4 (1) ◽

pp. 87-89

Author(s):

Janardan Bhatta

Keyword(s):

Information Retrieval ◽

Language Processing ◽

Large Scale ◽

Image Feature ◽

Image Search ◽

Search Results ◽

Retrieval Systems ◽

Text Features ◽

Text Query

Searching images in a large database is a major requirement in Information Retrieval Systems. Expecting image search results based on a text query is a challenging task. In this paper, we leverage the power of Computer Vision and Natural Language Processing in Distributed Machines to lower the latency of search results. Image pixel features are computed based on a contrastive loss function for image search. Text features are computed based on the Attention Mechanism for text search. These features are aligned together, preserving the information in each text and image feature. Previously, the approach was tested only in multilingual models. However, we have tested it on an image-text dataset, and it enabled us to search in any form of text or images with high accuracy.

Automatic NLP for Competitive Intelligence

Emerging Technologies of Text Mining ◽

10.4018/978-1-59904-373-9.ch003 ◽

2008 ◽

pp. 54-76 ◽

Author(s):

Christian Aranha ◽

Emmanuel Passos

Keyword(s):

Data Mining ◽

Information Retrieval ◽

Text Mining ◽

Language Processing ◽

Entity Recognition ◽

Competitive Intelligence ◽

Reference Process ◽

Processing Information ◽

Mining Algorithms

This chapter integrates elements from Natural Language Processing, Information Retrieval, Data Mining and Text Mining to support competitive intelligence. It shows how text mining algorithms can attend to three important functionalities of CI: Filtering, Event Alerts and Search. Each of them can be mapped as a different pipeline of NLP tasks. The chapter goes in-depth in NLP techniques like spelling correction, stemming, augmenting, normalization, entity recognition, entity classification, acronyms and co-reference process. Each of them must be used in a specific moment to do a specific job. All these jobs will be integrated in a whole system. These will be ‘assembled’ in a manner specific to each application. The reader’s better understanding of the theories of NLP provided herein will result in a better ‘assembly’.
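
The idea that each application is a differently assembled pipeline of NLP tasks can be illustrated with plain function composition. This is only a sketch under stated assumptions: the step functions, the tiny stop-word list, and the names `search_pipeline` and `alert_pipeline` are hypothetical, and the plural stripper is a crude stand-in for a real stemmer, not the chapter's method.

```python
import re
from functools import reduce

STOPWORDS = {"the", "a", "an", "of", "and"}  # deliberately tiny, illustrative list


def normalize(tokens):
    """Lower-case every token."""
    return [t.lower() for t in tokens]


def drop_stopwords(tokens):
    """Remove tokens that appear in the illustrative stop-word list."""
    return [t for t in tokens if t not in STOPWORDS]


def strip_plural(tokens):
    """Crude stand-in for stemming: drop a trailing 's' from longer tokens."""
    return [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens]


def build_pipeline(*steps):
    """Assemble the given steps, in order, into one text-to-tokens function."""
    def run(text):
        tokens = re.findall(r"[A-Za-z]+", text)
        return reduce(lambda acc, step: step(acc), steps, tokens)
    return run


# Two hypothetical applications that share steps but assemble them differently.
search_pipeline = build_pipeline(normalize, drop_stopwords, strip_plural)
alert_pipeline = build_pipeline(normalize, drop_stopwords)

print(search_pipeline("The Banks announced new mergers"))
print(alert_pipeline("The Banks announced new mergers"))
```

Keeping each task as an independent function is what makes the order and selection of steps an application-level decision.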

Discovering Ranking Functions for Information Retrieval

Encyclopedia of Data Warehousing and Mining ◽

10.4018/978-1-59140-557-3.ch072 ◽

2011 ◽

pp. 377-381

Author(s):

Weiguo Fan ◽

Praveen Pathak

Keyword(s):

Information Retrieval ◽

World Wide ◽

Adaptive Algorithms ◽

Relevant Information ◽

Ranking Function ◽

Ranking Functions ◽

Document Collections ◽

Retrieval Systems ◽

The World ◽

Document Collection

The field of information retrieval deals with finding relevant documents in a large document collection or the World Wide Web in response to a user’s query seeking relevant information. Ranking functions play a very important role in the retrieval performance of such retrieval systems and search engines. A single ranking function does not perform well across different user queries and document collections. Hence it is necessary to “discover” a ranking function for a particular context. Adaptive algorithms like genetic programming (GP) are well suited for such discovery.
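
To make "discovering" a ranking function for a context concrete, the sketch below scores a few hand-written candidate term-weighting formulas against a toy collection and keeps the one with the best average precision. This is not genetic programming: GP would evolve the formulas themselves, whereas here the candidate set is fixed and simply compared. The documents, the query, the relevance judgment and all names are invented for illustration.

```python
import math

# Toy collection and one judged query (all hypothetical).
DOCS = {
    "d1": "web web web search page".split(),
    "d2": "genetic programming ranking search".split(),
    "d3": "web page layout".split(),
}
QUERIES = [("web ranking".split(), {"d2"})]
N_DOCS = len(DOCS)


def idf(term):
    """Inverse document frequency, log(N / df); 0.0 for unseen terms."""
    df = sum(1 for tokens in DOCS.values() if term in tokens)
    return math.log(N_DOCS / df) if df else 0.0


# Candidate ranking functions: each maps (tf, idf) to a per-term score.
CANDIDATES = {
    "tf": lambda tf, idf_value: tf,
    "tf_idf": lambda tf, idf_value: tf * idf_value,
    "log_tf_idf": lambda tf, idf_value: (1 + math.log(tf)) * idf_value,
}


def rank(query_terms, weight):
    """Rank all documents by summed term scores (ties broken by document id)."""
    scores = {}
    for doc_id, tokens in DOCS.items():
        score = 0.0
        for term in query_terms:
            tf = tokens.count(term)
            if tf > 0:
                score += weight(tf, idf(term))
        scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))


def average_precision(ranking, relevant):
    """Mean of the precision values at each rank where a relevant document appears."""
    hits, total = 0, 0.0
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def fitness_of(weight):
    """Fitness of a candidate: mean average precision over the judged queries."""
    return sum(average_precision(rank(q, weight), rel) for q, rel in QUERIES) / len(QUERIES)


fitness = {name: fitness_of(fn) for name, fn in CANDIDATES.items()}
best_name = max(fitness, key=fitness.get)
print(fitness, "best:", best_name)
```

On this toy data the raw term-frequency candidates let the document that repeats "web" outrank the relevant one, while the log-damped variant does not, which is the kind of context dependence the abstract describes. A GP approach would use a fitness function of this sort to guide the search over a much larger space of formulas.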