document cluster Latest Research Papers

DYNAMIC STOP LIST FOR THE GUJARATI LANGUAGE USING RULE BASED APPROACH

Towards Excellence ◽

10.37867/te130152 ◽

2021 ◽

pp. 594-607

Author(s):

Chandrakant D. Patel ◽

Jayeshkumar M. Patel

Keyword(s):

Text Classification ◽

Rule Based ◽

Word Usage ◽

Document Cluster ◽

Area Unit ◽

Rule Based Approach ◽

Gujarati Language

Text classification, document clustering, and similar endeavors are important research areas underpinning a wide range of problems and applications in the field of web intelligence. A common approach, when clustering or otherwise classifying documents algorithmically, is to represent a web document as a vector of numbers derived from the frequencies of words in that document, where the words are taken from the entire collection of documents under consideration. A proven approach is to use a list of “stop” words (also known as a ‘stop list’), which identifies frequent words such as ‘and’ and ‘or’ that are unlikely to help in clustering or classification. Words in the stop list are therefore not included in the vector that represents a document. Current stop lists are arguably outdated in light of fluctuations in word usage, and they also arguably miss certain generic stop-word candidates, which brings their use into question. We explore this by developing new stop lists with a rule-based method.
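
The vector representation described in this abstract can be illustrated with a minimal sketch. It is not the authors' rule-based method: the stop list, the whitespace tokenizer, and the function names below are all assumptions chosen for illustration, and English sample text stands in for Gujarati.

```python
from collections import Counter

# Hypothetical stop list; the paper derives its list by rules, not by hand.
STOP_WORDS = {"and", "or", "the", "a", "of"}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(documents, stop_words):
    """Collect every non-stop word in the whole collection, sorted."""
    vocab = set()
    for doc in documents:
        vocab.update(w for w in tokenize(doc) if w not in stop_words)
    return sorted(vocab)


def to_vector(doc, vocabulary, stop_words):
    """Represent one document as word counts over the shared vocabulary."""
    counts = Counter(w for w in tokenize(doc) if w not in stop_words)
    return [counts[w] for w in vocabulary]


docs = ["the cat and the dog", "a dog or a bird"]
vocab = build_vocabulary(docs, STOP_WORDS)
vectors = [to_vector(d, vocab, STOP_WORDS) for d in docs]
print(vocab)    # ['bird', 'cat', 'dog']
print(vectors)  # [[0, 1, 1], [1, 0, 1]]
```

Because the stop words never enter the vocabulary, they add no dimensions to the vectors, which is the effect the abstract attributes to a stop list.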

A Multibranch Search Tree-Based Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data

Security and Communication Networks ◽

10.1155/2020/7307315 ◽

2020 ◽

Vol 2020 ◽

pp. 1-15

Author(s):

Hua Dai ◽

Xuelong Dai ◽

Xiao Li ◽

Xun Yi ◽

Fu Xiao ◽

...

Keyword(s):

Large Scale ◽

Personal Data ◽

Cloud Service ◽

Search Tree ◽

The Novel ◽

Privacy Concerns ◽

Cloud Data ◽

Document Cluster ◽

Ranked Search ◽

Index Tree

In the interest of privacy, cloud service users often encrypt their personal data before outsourcing it to the cloud. However, it is difficult to search encrypted cloud data efficiently, so designing an efficient and accurate search scheme over large-scale encrypted cloud data remains a challenge. In this paper, we integrate the bisecting k-means algorithm with a multibranch tree structure and propose the α-filtering tree search scheme based on bisecting k-means clusters. The novel index tree is built bottom-up, and a greedy depth-first algorithm filters out nonrelevant document clusters by calculating the relevance score between the filtering vector and the query vector. The α-filtering tree improves efficiency without loss of search accuracy. An experiment on a real-world dataset demonstrates the effectiveness of our scheme.
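
The greedy depth-first filtering step can be sketched on plaintext vectors as follows. This is not the paper's α-filtering tree: encryption is omitted entirely, and the sketch assumes that each internal node's filtering vector is the element-wise maximum of its children's vectors and that the relevance score is an inner product with a nonnegative query, so a node's score bounds the score of every document beneath it. The `Node`, `make_parent`, and `top_k` names are hypothetical.

```python
import heapq


class Node:
    """Tree node; leaves carry a doc_id, internal nodes carry children."""

    def __init__(self, vector, children=None, doc_id=None):
        self.vector = vector
        self.children = children or []
        self.doc_id = doc_id


def score(u, v):
    """Inner-product relevance score (an assumed scoring function)."""
    return sum(a * b for a, b in zip(u, v))


def make_parent(children):
    """Assumed filtering vector: element-wise max over the children."""
    vec = [max(vals) for vals in zip(*(c.vector for c in children))]
    return Node(vec, children)


def top_k(root, query, k):
    best = []  # min-heap of (score, doc_id), holding at most k entries

    def visit(node):
        bound = score(node.vector, query)
        # Prune: nothing below this node can beat the current k-th best.
        if len(best) == k and bound <= best[0][0]:
            return
        if node.doc_id is not None:
            heapq.heappush(best, (bound, node.doc_id))
            if len(best) > k:
                heapq.heappop(best)
            return
        # Greedy order: descend into the most promising child first.
        for child in sorted(node.children,
                            key=lambda c: score(c.vector, query),
                            reverse=True):
            visit(child)

    visit(root)
    return sorted(best, reverse=True)


d1 = Node([1, 0, 2], doc_id="d1")
d2 = Node([0, 3, 1], doc_id="d2")
d3 = Node([2, 2, 0], doc_id="d3")
d4 = Node([0, 0, 1], doc_id="d4")
root = make_parent([make_parent([d1, d4]), make_parent([d2, d3])])
result = top_k(root, [1, 1, 0], 2)
print(result)  # [(4, 'd3'), (3, 'd2')]
```

In this toy run the cluster holding d1 and d4 is skipped without visiting its leaves, because its bound of 1 cannot beat the current second-best score of 3.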

A Novel Chaotic Northern Bald Ibis Optimization Algorithm for Solving Different Cluster Problems [ICCICC18 #155]

International Journal of Software Science and Computational Intelligence ◽

10.4018/ijssci.2019040101 ◽

2019 ◽

Vol 11 (2) ◽

pp. 1-25 ◽

Cited By ~ 2

Author(s):

Ravi Kumar Saidala ◽

Nagaraju Devarakonda

Keyword(s):

Clustering Algorithm ◽

Chaotic Maps ◽

Optimization Technique ◽

Solution Space ◽

Clustering Method ◽

Web Document ◽

Document Cluster ◽

Two Phases ◽

Noa Noa ◽

Numerical Complexity

This article proposes a new data clustering method that finds optimal clusters by incorporating chaotic maps into the standard NOA. NOA, a newly developed optimization technique, has been shown to generate optimal results efficiently and at the lowest solution cost. Incorporating chaotic maps into metaheuristics lets the algorithms diversify their search of the solution space across two phases, exploration and exploitation. To make the NOA more efficient and avoid premature convergence, chaotic maps are incorporated in this work, yielding variants termed CNOAs. Ten different chaotic maps are incorporated individually into the standard NOA to test optimization performance. The CNOA is first benchmarked on 23 standard functions. Second, the numerical complexity of the new clustering method that uses CNOA is tested by solving 10 UCI data cluster problems and 4 web document cluster problems. Comparisons are made using statistical and graphical results. The superiority of the proposed clustering algorithm is evident from the simulations and comparisons.
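
As a rough illustration of what a chaotic map contributes to a metaheuristic, the sketch below generates candidate positions from a logistic map instead of a uniform random generator. The abstract does not name its ten maps or say how NOA consumes them, so the choice of the logistic map, the parameter `r = 4.0`, the seed `x0 = 0.7`, and both function names are assumptions.

```python
def logistic_map(x0, n, r=4.0):
    """Return n successive values of x_{t+1} = r * x_t * (1 - x_t)."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_candidates(lower, upper, n, x0=0.7):
    """Scale chaotic values from [0, 1] into the interval [lower, upper]."""
    return [lower + c * (upper - lower) for c in logistic_map(x0, n, r=4.0)]


seq = logistic_map(0.7, 5)
points = chaotic_candidates(-10.0, 10.0, 5)
print(seq)
print(points)
```

With `r = 4.0` the sequence stays inside [0, 1] but does not settle into a short cycle for typical seeds, which is the property usually cited as the reason to use such maps to diversify a search.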

An Improved K-Lion Optimization Algorithm With Feature Selection Methods for Text Document Cluster

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i7.245251 ◽

2018 ◽

Vol 6 (7) ◽

pp. 245-251

Author(s):

Jagatheeshkumar. G ◽

S. Selva Brunda

Keyword(s):

Feature Selection ◽

Optimization Algorithm ◽

Selection Methods ◽

Text Document ◽

Document Cluster ◽

Lion Optimization Algorithm

Representative Labels Selection Technique for Document Cluster using WordNet

Journal of Internet Computing and services ◽

10.7472/jksii.2017.18.2.61 ◽

2017 ◽

Vol 18 (2) ◽

pp. 61-73 ◽

Cited By ~ 1

Author(s):

Tae-Hoon Kim ◽

Mye Sohn

Keyword(s):

Selection Technique ◽

Document Cluster

Partially Exclusive Item Partition in MMMs-Induced Fuzzy Co-Clustering and its Effects in Collaborative Filtering

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2015.p0810 ◽

2015 ◽

Vol 19 (6) ◽

pp. 810-817 ◽

Cited By ~ 7

Author(s):

Katsuhiro Honda ◽

Takaya Nakano ◽

Chi-Hyon Oh ◽

Seiki Ubukata ◽

...

Keyword(s):

Collaborative Filtering ◽

Word Class ◽

Product Analysis ◽

Keyword Analysis ◽

Document Cluster ◽

Analysis Experiment ◽

Constraint Model

The interpretability of fuzzy co-cluster partitions was shown to be improved by introducing exclusive penalties on both object and item memberships, whereas conventional fuzzy co-clustering adopted an exclusive nature only on object memberships. In real applications, however, fully exclusive constraints may influence some items inappropriately, and partially exclusive penalties should be imposed to reflect the characteristics of each item. For example, in customer-product analysis, the degree of popularity of each product may be a measure of compatibility across multiple customer groups, and exclusive penalties should be imposed only on some specific products. In this paper, the conventional exclusive constraint model is further modified by imposing exclusive penalties only on selected items, and the effects of the partially exclusive partition are demonstrated from the viewpoints of both partition quality and collaborative filtering applicability. In a document-keyword analysis experiment, word class is shown to be useful for exclusively selecting keywords so that the interpretability of document clusters is improved. In a collaborative filtering experiment, the recommendation capability is shown to improve when intrinsic differences in the popularity of each product are considered.

Text Document Cluster Analysis Through Visualization of 3D Projections

Studies in Big Data - Data Mining for Service ◽

10.1007/978-3-642-45252-9_15 ◽

2014 ◽

pp. 271-291

Author(s):

Masaki Aono ◽

Mei Kobayashi

Keyword(s):

Cluster Analysis ◽

Text Document ◽

Document Cluster

Improving hierarchical document cluster labels through candidate term selection

Intelligent Decision Technologies ◽

10.3233/idt-2012-0121 ◽

2011 ◽

Vol 6 (1) ◽

pp. 43-58

Author(s):

Fabiano Fernandes dos Santos ◽

Veronica Oliveira de Carvalho ◽

Solange Oliveira Rezende

Keyword(s):

Term Selection ◽

Candidate Term ◽

Document Cluster

A Graph-Based Biomedical Literature Clustering Approach Utilizing Term's Global and Local Importance Information

Strategic Advancements in Utilizing Data Mining and Warehousing Technologies ◽

10.4018/978-1-60566-717-1.ch008 ◽

2011 ◽

pp. 133-150

Author(s):

Zhang Xiaodan ◽

Hu Xiaohua ◽

Xia Jiali ◽

Zhou Xiaohua ◽

Achananuparp Palakorn

Keyword(s):

Clustering Algorithm ◽

Document Clustering ◽

Biomedical Literature ◽

Semantic Relationship ◽

Clustering Method ◽

Graph Representations ◽

The Core ◽

Document Cluster ◽

Clustering Approach ◽

Global And Local

In this article, we present a graph-based knowledge representation for biomedical digital library literature clustering. An efficient clustering method is developed to identify the ontology-enriched k-highest density term subgraphs that capture the core semantic relationship information about each document cluster. The distance between each document and the k term graph clusters is calculated. A document is then assigned to the closest term cluster. The extensive experimental results on two PubMed document sets (Disease10 and OHSUMED23) show that our approach is comparable to spherical k-means. The contributions of our approach are the following: (1) we provide two corpus-level graph representations to improve document clustering, a term co-occurrence graph and an abstract-title graph; (2) we develop an efficient and effective document clustering algorithm by identifying k distinguishable class-specific core term subgraphs using terms’ global and local importance information; and (3) the identified term clusters give a meaningful explanation for the document clustering results.
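
The assignment step, in which each document goes to the closest term cluster, can be sketched as below. The abstract does not state which distance is used or how term subgraphs are turned into vectors, so this sketch assumes that both documents and term clusters are already plain term-weight vectors and that cosine distance is the measure; `cosine_distance` and `assign` are hypothetical names.

```python
import math


def cosine_distance(u, v):
    """1 minus cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def assign(documents, clusters):
    """Return, for each document, the index of its closest cluster vector."""
    return [
        min(range(len(clusters)),
            key=lambda j: cosine_distance(doc, clusters[j]))
        for doc in documents
    ]


# Toy term-weight vectors over a three-term vocabulary (illustrative only).
clusters = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
documents = [[3.0, 0.5, 0.0], [0.0, 2.0, 1.0], [1.0, 1.0, 1.0]]
labels = assign(documents, clusters)
print(labels)  # [0, 1, 1]
```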

Document cluster detection on latent projections

2009 Fourth International Conference on Digital Information Management ◽

10.1109/icdim.2009.5356765 ◽

2009 ◽

Author(s):

Dora Alvarez-Medina ◽

Hugo Hidalgo-Silva

Keyword(s):

Cluster Detection ◽

Document Cluster
