Matrix-Based Method for Inferring Elements in Data Attributes Using a Vector Space Model

Boolean Retrieval (BR) and the Vector Space Model (VSM) are popular methods in information retrieval for creating an inverted index and querying terms. The BR method returns exact matches for a textual query without ranking them, whereas the VSM method both searches and ranks the results. This study empirically compares the two methods using a sample of the Reuters corpus. The experimental results show that the two methods take nearly the same time to build an inverted index. However, they differ when querying the index. The results also show that the number of generated indexes, the sizes of the generated files, and the time needed to read and search an index are proportional to the number of files in the corpus and to the file size.
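The contrast between the two methods can be illustrated with a minimal sketch. It is not the implementation evaluated in the study: the three toy documents, the whitespace tokenizer, the function names, and the plain TF-IDF weighting with cosine similarity are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to a posting dict {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index

def boolean_and(index, query):
    """Boolean retrieval: unranked set of documents containing every query term."""
    postings = [set(index.get(term, {})) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()

def vsm_rank(index, docs, query):
    """Vector space retrieval: rank documents by cosine similarity of TF-IDF vectors."""
    n_docs = len(docs)
    idf = {term: math.log(n_docs / len(post)) for term, post in index.items()}

    # Squared length of every document vector.
    doc_sq_norm = defaultdict(float)
    for term, post in index.items():
        for doc_id, tf in post.items():
            doc_sq_norm[doc_id] += (tf * idf[term]) ** 2

    dot = defaultdict(float)
    query_sq_norm = 0.0
    for term, q_tf in Counter(tokenize(query)).items():
        if term not in index:
            continue  # terms unseen in the corpus contribute nothing
        q_weight = q_tf * idf[term]
        query_sq_norm += q_weight ** 2
        for doc_id, tf in index[term].items():
            dot[doc_id] += q_weight * tf * idf[term]

    scores = {}
    for doc_id, value in dot.items():
        denom = math.sqrt(doc_sq_norm[doc_id] * query_sq_norm)
        scores[doc_id] = value / denom if denom else 0.0
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical toy corpus, not the Reuters sample used in the study.
docs = {
    1: "oil prices rise",
    2: "oil oil exports fall",
    3: "grain exports rise",
}
index = build_inverted_index(docs)
exact_matches = boolean_and(index, "oil exports")
ranked = vsm_rank(index, docs, "oil exports")
```

In this sketch both methods share one inverted index, which is consistent with the reported similar build times; the difference appears only at query time, where `boolean_and` intersects posting sets and `vsm_rank` also scores every document that shares at least one term with the query.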

The phrase-based vector space model for automatic retrieval of free-text medical documents

Data & Knowledge Engineering ◽

10.1016/j.datak.2006.02.008 ◽

2007 ◽

Vol 61 (1) ◽

pp. 76-92 ◽

Cited By ~ 42

Author(s):

Wenlei Mao ◽

Wesley W. Chu

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Free Text ◽

Space Model ◽

Automatic Retrieval ◽

Medical Documents

Modeling Virtual Footprints

International Journal of Agent Technologies and Systems ◽

10.4018/jats.2011040101 ◽

2011 ◽

Vol 3 (2) ◽

pp. 1-17

Author(s):

Rajiv Kadaba ◽

Suratna Budalakoti ◽

David DeAngelis ◽

K. Suzanne Barber

Keyword(s):

Social Network ◽

Vector Space ◽

Semantic Similarity ◽

Vector Space Model ◽

Mapping Algorithm ◽

Space Model ◽

Single Entity ◽

Characteristic Behavior ◽

On Line ◽

The Web

Entities interacting on the web establish their identity by creating virtual personas. These entities, or agents, can be human users or software-based. This research models identity using the Entity-Persona Model, a semantically annotated social network inferred from the persistent traces of interaction between personas on the web. A Persona Mapping Algorithm is proposed which compares the local views of personas in their social network referred to as their Virtual Signatures, for structural and semantic similarity. The semantics of the Entity-Persona Model are modeled by a vector space model of the text associated with the personas in the network, which allows comparison of their Virtual Signatures. This enables all the publicly accessible personas of an entity to be identified on the scale of the web. This research enables an agent to identify a single entity using multiple personas on different networks, provided that multiple personas exhibit characteristic behavior. The agent is able to increase the trustworthiness of on-line interactions by establishing the identity of entities operating under multiple personas. Consequently, reputation measures based on on-line interactions with multiple personas can be aggregated and resolved to the true singular identity.

Modeling Virtual Footprints

Theoretical and Practical Frameworks for Agent-Based Systems ◽

10.4018/978-1-4666-1565-6.ch007 ◽

2012 ◽

pp. 96-113

Author(s):

Rajiv Kadaba ◽

Suratna Budalakoti ◽

David DeAngelis ◽

K. Suzanne Barber

Keyword(s):

Social Network ◽

Vector Space ◽

Semantic Similarity ◽

Vector Space Model ◽

Mapping Algorithm ◽

Space Model ◽

Single Entity ◽

Characteristic Behavior ◽

On Line ◽

The Web

Entities interacting on the web establish their identity by creating virtual personas. These entities, or agents, can be human users or software-based. This research models identity using the Entity-Persona Model, a semantically annotated social network inferred from the persistent traces of interaction between personas on the web. A Persona Mapping Algorithm is proposed which compares the local views of personas in their social network referred to as their Virtual Signatures, for structural and semantic similarity. The semantics of the Entity-Persona Model are modeled by a vector space model of the text associated with the personas in the network, which allows comparison of their Virtual Signatures. This enables all the publicly accessible personas of an entity to be identified on the scale of the web. This research enables an agent to identify a single entity using multiple personas on different networks, provided that multiple personas exhibit characteristic behavior. The agent is able to increase the trustworthiness of on-line interactions by establishing the identity of entities operating under multiple personas. Consequently, reputation measures based on on-line interactions with multiple personas can be aggregated and resolved to the true singular identity.

An Improved Method of Judging the Theme Relativity Based on Vector Space Model in Vertical Search Engine

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.411-414.106 ◽

2013 ◽

Vol 411-414 ◽

pp. 106-109 ◽

Cited By ~ 1

Author(s):

Ya Heng Ren

Keyword(s):

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Improved Method ◽

New Model ◽

Space Model ◽

Vertical Search ◽

Vertical Search Engine ◽

The Web

A vertical search engine provides a more specialized search than a traditional search engine. All of the data it searches is related to a single theme chosen by the user. The Vector Space Model is usually used to judge how closely data on the web relates to the chosen theme. However, when elements of the theme appear repeatedly, the Vector Space Model does not consider their order. The Evolved Vector Space Model is proposed, which adds a new element to the model. Experiments show that the new model fixes this problem and performs better in judging relatedness.
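The order-insensitivity described in this abstract can be seen in a short sketch. The abstract does not say what element the Evolved Vector Space Model adds, so the bigram vector below is only a generic, assumed way of making a representation order-aware and should not be read as the author's method; the function names and example strings are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    sq_a = sum(value * value for value in a.values())
    sq_b = sum(value * value for value in b.values())
    denom = math.sqrt(sq_a * sq_b)
    return dot / denom if denom else 0.0

def unigram_vector(text):
    # Standard bag-of-words vector: word order is discarded.
    return Counter(text.lower().split())

def bigram_vector(text):
    # Assumed order-aware variant: counts of adjacent word pairs.
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))

theme = "vertical search engine"
reordered = "engine search vertical"

unigram_similarity = cosine(unigram_vector(theme), unigram_vector(reordered))
bigram_similarity = cosine(bigram_vector(theme), bigram_vector(reordered))
```

With plain term counts the two strings are indistinguishable, while the adjacent-pair counts share no entries, so the second similarity drops to zero.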

Natural Language Processing Based Question Answering Using Vector Space Model

Advances in Intelligent Systems and Computing - Proceedings of Sixth International Conference on Soft Computing for Problem Solving ◽

10.1007/978-981-10-3325-4_37 ◽

2017 ◽

pp. 368-375 ◽

Cited By ~ 1

Author(s):

R. Jayashree ◽

N. Niveditha

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Vector Space ◽

Language Processing ◽

Question Answering ◽

Vector Space Model ◽

Space Model

Contextual weighting approach to compute term weight in layered vector space model

Journal of Information Science ◽

10.1177/0165551519860043 ◽

2019 ◽

pp. 016555151986004

Author(s):

Jayant Gadge ◽

Sunil Bhirud

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Primary Concern ◽

New Approach ◽

Space Model ◽

Web Document ◽

Web Information ◽

Unique Approach ◽

The Web

The World Wide Web (WWW) is the largest available repository of information. This huge amount of information makes it challenging to retrieve trustworthy information from the WWW, and confronts researchers with new issues of diversity and complexity when retrieving web information. Information retrieval from the web demands approaches that go beyond conventional information retrieval. The heterogeneity, complexity and huge volume of web information require a unique approach to retrieval. In addition, end-users introduce difficulties into the retrieval process, since the queries they submit are sometimes subtle and ambiguous. The primary concern in information retrieval is predicting the relevance of documents. In this article, a new approach is proposed that logically separates a web document into five layers, namely, the title, header, hyperlink, meta tag and body layers. The proposed method combines the textual information and structural evidence of a web document to retrieve information from the web. In the proposed layered vector space model, each layer has an allocated priority, which is used to compute a weight factor for that layer. The proposed method derives an equation that combines the priority of the layer and the length of the layer to calculate the weight of the layer.
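The layered idea can be sketched as follows. The abstract does not give the priority values or the equation that combines priority with layer length, so both are placeholders here: the priorities in `LAYER_PRIORITY` and the formula (normalized priority multiplied by length-normalized term frequency) are assumptions for illustration, not the authors' equation.

```python
from collections import Counter, defaultdict

# Hypothetical priorities for the five layers named in the abstract.
LAYER_PRIORITY = {"title": 5, "header": 4, "hyperlink": 3, "meta": 2, "body": 1}

def layered_term_weights(layers):
    """Combine per-layer term frequencies into one weight per term.

    Assumed formula: (layer priority / total priority) * (tf / layer length).
    """
    total_priority = sum(LAYER_PRIORITY[name] for name in layers)
    weights = defaultdict(float)
    for name, text in layers.items():
        tokens = text.lower().split()
        if not tokens:
            continue  # an empty layer contributes nothing
        layer_factor = LAYER_PRIORITY[name] / total_priority
        for term, tf in Counter(tokens).items():
            weights[term] += layer_factor * tf / len(tokens)
    return dict(weights)

# Hypothetical web document already split into its layers.
page = {
    "title": "vector space model",
    "header": "",
    "hyperlink": "home",
    "meta": "retrieval",
    "body": "the vector space model ranks documents by similarity",
}
term_weights = layered_term_weights(page)
```

Under these assumed values, a term that appears in the short, high-priority title layer ends up with a larger weight than a term that appears only in the longer body layer.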

The web software mining based on vector space model

2009 First International Conference on Future Information Networks ◽

10.1109/icfin.2009.5339603 ◽

2009 ◽

Cited By ~ 1

Author(s):

Feijie Wang ◽

Zhongying

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Space Model ◽

Software Mining ◽

The Web

An Improved Shark Search Algorithm Based on Domain Ontology

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.651-653.2252 ◽

2014 ◽

Vol 651-653 ◽

pp. 2252-2257

Author(s):

Zhi Qiang Li ◽

Yuan Tan ◽

Hong Chen Guo ◽

Chong Feng

Keyword(s):

Vector Space ◽

State Of The Art ◽

Search Algorithm ◽

Vector Space Model ◽

Domain Ontology ◽

The State ◽

Experimental Results ◽

Space Model ◽

Ontology Model ◽

Evaluation Metric

In recent years, the prevailing topic crawler algorithms have concentrated on the contents of topical words. These existing approaches neglect the semantic relationships among textual concepts, which leads to low correlation between crawled webpages. To address the issue, this paper presents a deep analysis of the Shark Search algorithm and optimizes it by incorporating the characteristics of semi-structured webpages. Furthermore, we enhance the performance of the vector space model used in the Shark Search algorithm by means of a domain ontology, and propose a standardized method based on the vector space of the ontology model to improve the TF-IDF evaluation metric. The experimental results demonstrate the effectiveness of our algorithm, which significantly outperforms the state of the art in precision and recall.

Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v1i2.372 ◽

2015 ◽

Vol 1 (2) ◽

Cited By ~ 2

Author(s):

Oscar Karnalim

Keyword(s):

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Semantic Relatedness ◽

Space Model
