Information Retrieval and Text Analysis

Author(s):  
W. JOHN HUTCHINS
Author(s):  
ELSAYED ATLAM

Conventional approaches to text analysis and information retrieval, which measure document similarity by considering all the information in a text, are relatively inefficient for processing large text collections in heterogeneous subject areas. Previous research showed that evidence from passages can improve retrieval results, but it also raised questions about how passages are defined, how they can be ranked efficiently, and what their proper role is in long, structured documents. Moreover, the frequency of the article "the" in important sentences can be used to summarize a text efficiently. We previously proposed an approach that extracts sentences containing the article "the" under a set of restriction rules to obtain effective passages. Building on that work, this paper presents a new Passage SIMilarity (P-SIM) measure between documents, computed over the effective passages extracted using the article "the". Our results show that this method is more efficient than traditional methods. Recall and Precision reach 92.6% and 97.5%, respectively, on the extracted passages, an improvement of 38.3% and 44.2% over the traditional method. The proposed methods were applied to 3,990 articles from a large tagged corpus.
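The general idea of comparing documents through extracted passages, and of scoring retrieval with Recall and Precision, can be illustrated with a minimal Python sketch. This is not the authors' P-SIM measure: the abstract does not give its formula or its restriction rules, so the sketch assumes a plain filter (keep any sentence containing "the") and swaps in ordinary bag-of-words cosine similarity. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def extract_passages(text):
    """Keep sentences containing the article 'the'.

    Assumption: a bare keyword filter. The abstract's additional
    restriction rules are not specified, so none are applied here.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if re.search(r"\bthe\b", s, re.IGNORECASE)]


def bag_of_words(passages):
    """Count lowercase word tokens across passages, ignoring 'the' itself."""
    return Counter(
        w
        for s in passages
        for w in re.findall(r"[a-z]+", s.lower())
        if w != "the"
    )


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def passage_similarity(doc_a, doc_b):
    """Stand-in similarity: cosine over words of the extracted passages only."""
    return cosine(
        bag_of_words(extract_passages(doc_a)),
        bag_of_words(extract_passages(doc_b)),
    )


def precision_recall(retrieved, relevant):
    """Standard set-based Precision and Recall for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Restricting the comparison to extracted passages shrinks the vectors being compared, which is the efficiency argument the abstract makes; how much accuracy is kept depends on the extraction rules, which this sketch does not reproduce.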


Author(s):  
D. A. Ilvovsky
B. A. Galitsky

In this paper we learn how to manage a dialogue by relying on the discourse of its utterances. We consider two complementary approaches to dialogue management, both based on discourse analysis of text, to extend the abilities of an interactive information-retrieval-based chatbot.


Bilingual text analysis is increasingly relevant as the amount of information gathered in various languages grows. Bilingual text classification remains a little-explored area, whereas text classification in a single language is well established. The concept of bilingual text has been left outside the mainstream of both theory and practice. The use of social media is increasing day by day, and so the amount of data is growing rapidly. It is therefore pressing to analyze this big data and extract useful information. In this paper, we develop a dynamic information retrieval model and extract people's sentiments about global warming from English and Italian tweets. A corresponding heat map and affinity map are generated, as the model produces its output after matching different objects that differ in their degree of relevance to the query.


2019
Author(s):  
Matthew J. Lavin

This lesson focuses on a foundational natural language processing and information retrieval method called Term Frequency - Inverse Document Frequency (tf-idf). This lesson explores the foundations of tf-idf, and will also introduce you to some of the questions and concepts of computationally oriented text analysis.
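As a rough illustration of what tf-idf computes, the following Python sketch weights each term by its count in a document multiplied by the logarithm of the inverse fraction of documents containing it. This is one common textbook variant (raw counts and an unsmoothed `log(N / df)`), not necessarily the formulation the lesson uses; libraries often apply smoothing and normalization on top. The function name `tf_idf` and the sample documents are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf is the raw term count, idf is log(N / df),
    with whitespace tokenization and lowercasing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and dogs",
]
scores = tf_idf(docs)
```

In this sample, "the" occurs in every document, so its idf is `log(3 / 3) = 0` and its weight vanishes, while "cat" occurs in only one document and outweighs "sat", which occurs in two. That is the behaviour tf-idf is designed for: terms distinctive to a document rank above terms common to the whole collection.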


Author(s):  
Richard E. Hartman ◽  
Roberta S. Hartman ◽  
Peter L. Ramos

We have long felt that some form of electronic information retrieval would be more desirable than conventional photographic methods in a high vacuum electron microscope for various reasons. The most obvious of these is the fact that with electronic data retrieval the major source of gas load is removed from the instrument. An equally important reason is that if any subsequent analysis of the data is to be made, a continuous record on magnetic tape gives a much larger quantity of data and gives it in a form far more satisfactory for subsequent processing.


Author(s):  
Hilton H. Mollenhauer

Many factors (e.g., resolution of the microscope, type of tissue, and preparation of the sample) affect electron microscopical images and alter the amount of information that can be retrieved from a specimen. Of interest in this report are those factors associated with the evaluation of epoxy-embedded tissues. In this context, information retrieval is dependent, in part, on the ability to “see” sample detail (e.g., contrast) and, in part, on the quality of sample preservation. Two aspects of this problem will be discussed: 1) epoxy resins and their effect on image contrast, information retrieval, and sample preservation; and 2) the interaction between some stains commonly used for enhancing contrast and information retrieval.

