Information Retrieval (IR) and Extracting Associative Rules

2016 ◽  
pp. 713-732
Author(s):  
Asmae Dami ◽  
Mohamed Fakir ◽  
Belaid Bouikhalene

This chapter lies at the intersection of two research themes: Information Retrieval and Knowledge Discovery from texts (Text mining). Its purpose is two-fold. First, it focuses on Information Retrieval (IR), which aims to implement models and systems for selecting a set of documents that satisfy a user's information need, expressed as a query. An information retrieval system is composed mainly of two processes: representation and retrieval. The representation process, called indexing, represents documents and queries by descriptors, or indexes, which reflect the contents of the documents. The retrieval process consists of comparing the document representations with the query representation. The second aim of this chapter is to discover the relationships between the terms (keywords) that describe documents in a document database. These correlations between terms are extracted using a text mining technique, chiefly association rules.
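As a rough illustration of the two steps described in this abstract (indexing documents by descriptors, then extracting association rules between terms), the following minimal sketch may help. It is not the authors' implementation: the function names, the toy stopword list, the restriction to term pairs, and the thresholds `min_support` and `min_confidence` are all assumptions made for this example.

```python
from itertools import combinations


def index_documents(docs, stopwords=frozenset({"the", "of", "a", "and", "in"})):
    """Toy indexing step: represent each document by its set of descriptor terms."""
    return [
        frozenset(w for w in doc.lower().split() if w not in stopwords)
        for doc in docs
    ]


def association_rules(indexed, min_support=0.5, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) rules over term pairs.

    support(X)       = fraction of documents containing every term in X
    confidence(x->y) = support({x, y}) / support({x})
    """
    n = len(indexed)

    def support(terms):
        return sum(1 for d in indexed if terms <= d) / n

    vocabulary = sorted(set().union(*indexed))
    rules = []
    for a, b in combinations(vocabulary, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue  # infrequent pair: no rule in either direction
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / support({x})
            if confidence >= min_confidence:
                rules.append((x, y, pair_support, confidence))
    return rules


docs = [
    "information retrieval system",
    "information retrieval model",
    "text mining",
]
rules = association_rules(index_documents(docs))
for antecedent, consequent, sup, conf in rules:
    print(f"{antecedent} -> {consequent} (support={sup:.2f}, confidence={conf:.2f})")
```

A full rule miner such as Apriori would also consider itemsets larger than pairs; the pairwise version is kept here only to show how support and confidence are computed over indexed documents.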

2014 ◽  
Vol 7 (4) ◽  
pp. 42-62
Author(s):  
Asmae Dami ◽  
Mohamed Fakir ◽  
Belaid Bouikhalene

This paper lies at the intersection of two research themes: Information Retrieval and Knowledge Discovery from texts (Text mining). Its purpose is two-fold. First, it focuses on Information Retrieval (IR), which aims to implement models and systems for selecting a set of documents that satisfy a user's information need, expressed as a query. An information retrieval system is composed mainly of two processes: representation and retrieval. The representation process, called indexing, represents documents and queries by descriptors, or indexes, which reflect the contents of the documents. The retrieval process consists of comparing the document representations with the query representation. The second aim of this paper is to discover the relationships between the terms (keywords) that describe documents in a document database. These correlations between terms are extracted using a text mining technique, chiefly association rules.


Author(s):  
Mounira Chkiwa ◽  
Anis Jedidi ◽  
Faiez Gargouri

In this paper, the authors present an overall description of their information retrieval system, which combines Semantic Web and Fuzzy logic in order to benefit from the advantages of both in the information retrieval domain. The system is designed for kids; for this reason, the semantic/fuzzy collaboration must remain in the background of the information retrieval process, because such users cannot be expected to handle Semantic Web technologies or fuzzy logic commands. The authors present the different services offered by their system and how they use Semantic Web and Fuzzy logic to develop them. Evaluation tests of the system using universal measures clearly show its efficiency.


2019 ◽  
Author(s):  
Thiago Ferraz ◽  
Gabriel Ferreira ◽  
Fábio Cozman ◽  
Ismael Santos

Classifying sentences in industrial, technical or scientific reports can enhance text mining and information retrieval tasks with useful machine-readable metadata. This paper describes a search engine that employs sentence classification to search for abstracts from scholarly papers in Petroleum Engineering. The sentences were classified into four classes, based on the popular IMRAD categories. We produced a dataset containing more than 2,200 manually labeled sentences from 278 scholarly articles in the field of Petroleum Engineering, to be used as training and testing data. The classifier with the best results was logistic regression, with an accuracy of 86.4%. The information retrieval system built on top of the classification system yielded a mAP of 0.80.
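The abstract above reports retrieval quality as mAP (mean average precision). For readers unfamiliar with the measure, here is a minimal sketch of how it is commonly computed. The function names and the toy data are assumptions for illustration and have no connection to the paper's dataset; the sketch also assumes each ranked list contains no repeated document identifiers.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical results for two queries.
runs = [
    (["a", "b", "c"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y"], {"y"}),            # AP = (1/2) / 1
]
map_score = mean_average_precision(runs)
print(f"mAP = {map_score:.4f}")
```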


2014 ◽  
Vol 2 (1) ◽  
pp. 73-85 ◽  
Author(s):  
Mohamed Néji ◽  
Ali Wali ◽  
Adel M. Alimi

The authors' research focuses on the problem of an Information Retrieval System (IRS) that integrates human emotion recognition. Such a system must be able to recognize how satisfied the user is with the results found, based on the user's facial expression, physiological state, gestures and voice. This paper presents an algorithm for recognizing the emotional state of a user during a search session, in order to return the relevant documents that the user needs. The authors also present the agent architecture of the envisaged system and its organizational model.


Author(s):  
Sri Wahyuni

ABSTRACT Introduction. One way to provide the best service for users is to develop innovative library services, such as a video content-based library collection. The MMTC Yogyakarta Multi Media College Library has developed a video content-based information retrieval system. It is hoped that this system will help users find the material they need more quickly, especially when searching for material in video files. Data Collection Method. In this paper the writer uses qualitative research with a library research approach, while the data analysis uses content analysis techniques. The author uses this method to observe and analyze an information system. Results and Discussions. The Content Based Video Retrieval development strategy at the MMTC Yogyakarta Multi Media College Library begins with identifying user needs, then creating a system design, evaluating the system design, implementing the system design in a programming language, testing the system, evaluating the system and putting it to use. The author also provides an overview of the development of the information retrieval system by conducting a SWOT analysis. The macro analysis is used to formulate the opportunity and threat variables, while the internal analysis is used to formulate the strength and weakness variables. The last stage is the analysis of the information retrieval system, whose steps are: complete definition, problem analysis, needs analysis, logic design and needs analysis. Conclusions. In the Content Based Video Retrieval development strategy at the MMTC Yogyakarta Multi Media College Library, several things need to be considered in the development of an information retrieval system, including: user needs, development budget, human resources, support from leaders, facilities (software and hardware) and IT infrastructure (internet network). The development of the information retrieval system should begin with identifying user needs and conducting a SWOT analysis to determine the strengths and weaknesses of the system, as well as its goal, so that the system can be used optimally by users. Keywords: Library, Information Retrieval System, Video Content


Author(s):  
Gouranga Charan Jena ◽  
Siddharth Swarup Rautaray

A stemmer is used to reduce an inflected or derived word to its stem. The technique involves removing the suffix or prefix attached to a word. It can be used in an information retrieval system to improve the overall execution of the retrieval process. Stemming is not equivalent to morphological analysis: it only finds the stem of a word. It also decreases the number of terms in an information retrieval system. Various techniques exist for stemming. In this paper, a new web-based stemmer named “Mula” is proposed for the Odia language. It uses a hybrid approach (i.e. a combination of the brute force and suffix removal approaches). The new stemmer is both computationally faster and domain independent. The results are favourable and indicate that the proposed stemmer can be used effectively in Odia Information Retrieval systems. The stemmer also handles the problems of over-stemming and under-stemming to some extent.


A stemmer is used to reduce an inflected or derived word to its stem. The technique involves removing the suffix or prefix attached to a word. It can be used in an information retrieval system to improve the overall execution of the retrieval process. Stemming is not equivalent to morphological analysis: it only finds the stem of a word. It also decreases the number of terms in an information retrieval system. Various techniques exist for stemming. Here, a new hybrid stemmer named “Mula” has been developed for the Odia language. It combines the brute force approach with an enhanced suffix stripping approach. The new stemmer is both computationally inexpensive and domain independent. We have integrated this stemmer into an existing DSpace installation for an Odia text retrieval system. The results are commendable and suggest that the new stemmer can be used effectively in an Odia search engine. The proposed stemmer also handles over-stemming and under-stemming effectively.
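The hybrid idea described in the two abstracts above (a brute force lookup combined with suffix stripping) can be sketched roughly as follows. This is not the "Mula" stemmer: the function name `hybrid_stem`, the `min_stem_len` guard, and the English lookup table and suffix list are all hypothetical and chosen only so the example is easy to follow; a stemmer for Odia would need its own table and suffix rules.

```python
def hybrid_stem(word, lookup, suffixes, min_stem_len=3):
    """Stem a word by table lookup first, then by longest-suffix stripping.

    lookup       : dict mapping known inflected forms to stems (brute force step)
    suffixes     : iterable of suffix strings to strip (suffix removal step)
    min_stem_len : shortest stem allowed, a simple guard against over-stemming
    """
    word = word.lower()
    # Brute force step: exact match in the word-to-stem table.
    if word in lookup:
        return lookup[word]
    # Suffix removal step: try longer suffixes before shorter ones.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if suffix and word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    # No rule applied: the word is treated as its own stem.
    return word


# Illustrative English data, not taken from the papers.
lookup = {"went": "go", "children": "child"}
suffixes = ["ing", "es", "s"]

for w in ["went", "reading", "boxes", "is"]:
    print(w, "->", hybrid_stem(w, lookup, suffixes))
```

The lookup table handles irregular forms that no suffix rule would catch, while the minimum stem length keeps very short words from being stripped down to fragments.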


Author(s):  
Manisha Malhotra ◽  
Aarti Singh

Information Retrieval (IR) is the activity of obtaining information relevant to an information need from a pool of information resources. Searching can be based on text indexing. Whenever a client enters a query into the system, an automated information retrieval process starts. Queries are formal statements of the information need (Rijsbergen, 1997). A given query does not necessarily return relevant information: the query is matched against the information in the database, but, unlike an SQL query, it does not yield a precise and unique result (Rocchio, 2010). Instead, the results are based on a ranking of the information retrieved from the server. This ranking-based technique is the fundamental difference from a database query. Depending on the user application, the required object can be an image, audio or video. Although these objects are not themselves stored in the IR system, they can be represented in it as metadata. An IR system computes a numeric score for how well each object matches the query and ranks the objects accordingly.
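To make the contrast with an exact-match database query concrete, the following minimal sketch scores every document against a query and returns a ranked list. The TF-IDF style weighting, the function name `rank_documents`, and the sample documents are assumptions for illustration; the abstract does not prescribe a particular scoring function.

```python
import math
from collections import Counter


def rank_documents(query, docs):
    """Return (doc_index, score) pairs sorted by decreasing score.

    Each query term found in a document contributes
    term_frequency * log(1 + N / document_frequency).
    Documents that match no query term are left out.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = query.lower().split()

    scored = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = sum(
            tf[t] * math.log(1 + n / df[t]) for t in query_terms if t in tf
        )
        if score > 0:
            scored.append((index, score))
    # Highest score first; ties broken by original document order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


docs = [
    "fuzzy logic retrieval",
    "image retrieval system",
    "cooking recipes",
]
ranking = rank_documents("image retrieval", docs)
for index, score in ranking:
    print(f"doc {index}: score {score:.3f}")
```

Unlike an SQL `WHERE` clause, which either selects a row or does not, the partially matching first document still appears in the result, only lower in the ranking.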


2018 ◽  
Vol 9 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Ram Kumar ◽  
S. C. Sharma

Information Retrieval Systems (IRS) have dramatically changed the way people acquire the information they need. Information Retrieval (IR) enables users to find relevant documents in a collection of countless resources. This article presents an overview of IRS. Its objective is to answer the basic and specific questions related to IRS. In contrast to other review papers, the authors aim to provide a complete understanding of IR in a single paper. Starting from the definition and importance of IR, the article covers the retrieval process, performance issues, and a comparison among various approaches. It also describes different models and analyzes their merits and demerits. The article then proposes a list of challenges still unanswered by existing systems. Before offering a conclusion, it lists the major applications of IR.

