A New Ensemble Clustering Approach for Effective Information Retrieval

Author(s):  
Archana Maruthavanan ◽  
Ayyasamy Ayyanar
2020 ◽  
pp. 016555152091159
Author(s):  
Muhammad Qasim Memon ◽  
Yu Lu ◽  
Penghe Chen ◽  
Aasma Memon ◽  
Muhammad Salman Pathan ◽  
...  

Text segmentation (TS) is the process of dividing multi-topic text collections into cohesive segments using topic boundaries. Text clustering is likewise a major concern for multi-topic text collections, because such collections are characterised by a sub-topic structure whose contents are not associated with each other. Existing clustering approaches follow a TS method that relies on word frequencies and may not be suitable for clustering multi-topic text collections. In this work, we propose a new ensemble clustering approach (ECA), a novel topic-modelling-based clustering approach that combines TS and text clustering. We devise LDA-onto (LDA-ontology), a TS-based model that decomposes a document into segments (i.e. sub-documents), wherein each sub-document is associated with exactly one sub-topic. We address the problem of clustering a document that is intrinsically related to various topics and whose topical structure is missing. ECA is tested on well-known datasets in order to provide a comprehensive presentation and validation of clustering algorithms using LDA-onto. ECA exhibits the semantic relations of keywords in sub-documents, and the resultant clusters are mapped back to the original documents that contain those sub-documents. Moreover, the present research sheds light on clustering performance and indicates that there is no difference in performance (in terms of F-measure) when the number of topics changes. Our findings give above-par results for analysing the problem of text clustering more broadly, without applying dimension reduction techniques to highly sparse data. Specifically, ECA provides a more efficient framework than the traditional and segment-based approaches, and the achieved results are statistically significant, with an average improvement of over 10.2%.
For the most part, the proposed framework can be evaluated in applications where meaningful data retrieval is useful, such as document summarization, text retrieval, and novelty and topic detection.
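
The abstract above says that each sub-document carries exactly one sub-topic and that the resulting clusters are related back to the original documents. One possible reading of that last step can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `document_clusters`, the tuple layout, and the toy labels are all hypothetical, and the segmentation and clustering steps themselves (LDA-onto, ECA) are not reproduced here.

```python
from collections import defaultdict


def document_clusters(segment_labels):
    """Map segment-level cluster labels back to their parent documents.

    segment_labels: iterable of (doc_id, segment_index, cluster_id) tuples,
    where every segment (sub-document) has exactly one cluster label.
    Returns a dict {cluster_id: sorted list of doc_ids that have at least
    one segment in that cluster}. A multi-topic document can therefore
    appear in more than one cluster.
    """
    clusters = defaultdict(set)
    for doc_id, _segment_index, cluster_id in segment_labels:
        clusters[cluster_id].add(doc_id)
    return {cluster_id: sorted(docs) for cluster_id, docs in clusters.items()}


# Hypothetical toy input: doc1 covers two sub-topics, doc3 only one.
labels = [
    ("doc1", 0, "sports"),
    ("doc1", 1, "finance"),
    ("doc2", 0, "finance"),
    ("doc3", 0, "sports"),
    ("doc3", 1, "sports"),
]
result = document_clusters(labels)
```

In this toy input, `doc1` ends up in both clusters because its two segments carry different labels, which is the behaviour a segment-based scheme would be expected to allow for multi-topic documents.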


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
XiaoYong Li ◽  
Yong Zhang ◽  
Huimin Cheng ◽  
Feifei Zhou ◽  
BaoCai Yin

Author(s):  
Banage T. G. S. Kumara ◽  
Incheon Paik ◽  
Koswatte R. C. Koswatte

With the large number of web services now available via the internet, service discovery, recommendation, and selection have become challenging and time-consuming tasks. Organizing services into clusters of similar services is a very efficient approach. A principal issue for clustering is computing semantic similarity. Current approaches use keyword-based, information-retrieval-based, or ontology-based methods. These approaches have problems that include difficulty in discovering semantic characteristics, loss of semantic information, and a shortage of high-quality ontologies. Thus, the authors present a method that first adopts ontology learning to generate ontologies via the hidden semantic patterns existing within complex terms. Then, they propose service recommendation and selection approaches based on the proposed clustering approach. Experimental results show that the term-similarity approach outperforms comparable existing clustering approaches. Further, an empirical study of the prototype recommendation and selection approaches has demonstrated the effectiveness of the proposed approaches.


Author(s):  
Richard E. Hartman ◽  
Roberta S. Hartman ◽  
Peter L. Ramos

We have long felt that, for various reasons, some form of electronic information retrieval would be more desirable than conventional photographic methods in a high-vacuum electron microscope. The most obvious of these is the fact that with electronic data retrieval the major source of gas load is removed from the instrument. An equally important reason is that if any subsequent analysis of the data is to be made, a continuous record on magnetic tape gives a much larger quantity of data and gives it in a form far more satisfactory for subsequent processing.


Author(s):  
Hilton H. Mollenhauer

Many factors (e.g., resolution of microscope, type of tissue, and preparation of sample) affect electron microscopical images and alter the amount of information that can be retrieved from a specimen. Of interest in this report are those factors associated with the evaluation of epoxy-embedded tissues. In this context, information retrieval is dependent, in part, on the ability to “see” sample detail (e.g., contrast) and, in part, on the quality of sample preservation. Two aspects of this problem will be discussed: 1) epoxy resins and their effect on image contrast, information retrieval, and sample preservation; and 2) the interaction between some stains commonly used for enhancing contrast and information retrieval.

