Ranking of Odia Text Document Relevant to User Query Using Vector Space Model

Analysis of Vector Space Method in Information Retrieval for Smart Answering System

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9099 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4468-4472

Author(s):

Deepa Yogish ◽

T. N. Manjunath ◽

Ravindra S. Hegadi

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Query Term ◽

Frequency Method ◽

Document Ranking ◽

User Intent ◽

Space Model ◽

Relevant Document ◽

User Query

On the internet, search plays a vital role in retrieving relevant answers to user-specific queries. One of the most promising applications of natural language processing and information retrieval is the question answering system, which returns an accurate answer directly instead of a set of documents. The main objective of information retrieval is to retrieve relevant documents from the huge volume of data on the internet using an appropriate model. Many models have been proposed for the retrieval process, such as the Boolean, vector space, and probabilistic models. The vector space model (VSM) is the best method in information retrieval for document ranking, with an efficient document representation that combines simplicity and clarity. VSM adopts a similarity function to measure the match between documents and user intent, and orders the scores from largest to smallest. Documents and the query are assigned weights using the term frequency and inverse document frequency (TF-IDF) method. To retrieve the documents most relevant to the user's query terms, the cosine similarity ranking function is applied to every document and the user query. Documents with higher similarity scores are considered more relevant to the query and are ranked by these scores. This paper reviews different information retrieval techniques and argues that the vector space model offers a realistic compromise in IR processing. It allows a weighting scheme that ranks the set of documents in order of relevance to the user query.
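The ranking pipeline described in this abstract (TF-IDF weighting followed by cosine similarity) can be sketched as below. This is a minimal illustration rather than the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, and all function names are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def build_idf(docs_tokens):
    # Document frequency: number of documents that contain each term.
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    # Sparse vector as a dict; terms unseen in the collection are dropped.
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    # Returns (score, document index) pairs, highest score first.
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    q_vec = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q_vec, tfidf_vector(t, idf)), i)
              for i, t in enumerate(docs_tokens)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))
```

Calling `rank("vector space ranking", docs)` on a small list of strings returns the documents ordered by descending cosine score; documents that share no term with the query score 0.0.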


Text Document Categorization using Enhanced Sentence Vector Space Model and Bi-Gram Text Representation Model Based on Novel Fusion Techniques

10.7176/nmmc/93-03 ◽

2020 ◽

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Text Representation ◽

Space Model ◽

Model Based ◽

Text Document ◽

Representation Model ◽

Document Categorization


A survey on text document categorization using enhanced sentence vector space model and bi-gram text representation model based on novel fusion techniques

2018 2nd International Conference on Inventive Systems and Control (ICISC) ◽

10.1109/icisc.2018.8399067 ◽

2018 ◽

Author(s):

Abdisa Demissie Amensisa ◽

Seema Patil ◽

Poorva Agrawal

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Text Representation ◽

Space Model ◽

Model Based ◽

Text Document ◽

Representation Model ◽

Document Categorization


Best Approximate of Vector Space Model by Using SVD

Al-Mustansiriyah Journal of Science ◽

10.23851/mjs.v28i2.509 ◽

2018 ◽

Vol 28 (2) ◽

pp. 143

Author(s):

Raghad M. Hadi

Keyword(s):

Text Mining ◽

Vector Space ◽

Document Clustering ◽

Vector Space Model ◽

Internet Technology ◽

Low Rank ◽

Space Model ◽

Text Document ◽

Space Technique ◽

Text Mining Application

The rapid growth of internet technology makes it easy to assemble a huge volume of data as text documents, e.g., journals, blogs, web pages, articles, and emails. In text mining applications, the growing text space of datasets makes it hard to pre-process documents efficiently for tasks such as document clustering. The proposed system focuses on document pre-processing and a document-space reduction technique to prepare documents for clustering. The common method for text mining problems is the vector space model (VSM), in which each term represents a feature. The proposed system therefore creates a vector space model, using pre-processing to remove trivial data from the dataset, and resolves the huge dimensionality of the VSM with a low-rank SVD. Experimental results show that the proposed system gives about 10% better document representation results than the previous approach in preparing documents for document clustering.
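The dimensionality reduction step mentioned here, a low-rank SVD of the term-document matrix, might look roughly like the following. It is a sketch under stated assumptions, not the paper's system: it assumes NumPy is available, that rows are terms and columns are documents, and the function name `low_rank_docs` is hypothetical.

```python
import numpy as np


def low_rank_docs(term_doc, k):
    """Truncate the SVD of a (terms x documents) matrix to rank k.

    Returns (doc_vectors, approx):
      doc_vectors: one k-dimensional row per document,
      approx:      the rank-k approximation of term_doc.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    approx = U_k @ np.diag(s_k) @ Vt_k
    # Documents expressed in the k-dimensional latent space.
    doc_vectors = (np.diag(s_k) @ Vt_k).T
    return doc_vectors, approx
```

The reduced `doc_vectors` can then be passed to a clustering routine in place of the full term vectors; the choice of `k` is a tuning parameter that the abstract does not specify.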


Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model

Computer and Information Science ◽

10.5539/cis.v1n4p79 ◽

2008 ◽

Vol 1 (4) ◽

Cited By ~ 4

Author(s):

Dino Isa ◽

Lee Lam Hong ◽

V. P. Kallimani ◽

R. Rajkumar

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Space Model ◽

Text Document ◽

Bayes Formula


THE INFLUENCE OF TEXT PREPROCESSING METHODS AND TOOLS ON CALCULATING TEXT SIMILARITY

Facta Universitatis Series Mathematics and Informatics ◽

10.22190/fumi1905973d ◽

2019 ◽

pp. 973

Author(s):

Đorđe Petrović ◽

Milena Stanković

Keyword(s):

Text Mining ◽

Vector Space ◽

Vector Space Model ◽

Text Similarity ◽

Text Documents ◽

Space Model ◽

Text Document ◽

The Subject ◽

Text Preprocessing ◽

Multidimensional Representation

Text mining to a great extent depends on the various text preprocessing techniques. The preprocessing methods and tools which are used to prepare texts for further mining can be divided into those which are and those which are not language-dependent. The subject matter of this research was the analysis of the influence of these methods and tools on further text mining. We first focused on the analysis of the influence on the reduction of the vector space model for the multidimensional representation of text documents. We then analyzed the influence on calculating text similarity, which is the focus of this research. The conclusion we reached is that the implementation of various text preprocessing methods in the Serbian language, which are used for the reduction of the vector space model for the multidimensional representation of text document, achieves the required results. But, the implementation of various text preprocessing methods specific to the Serbian language for the purpose of calculating text similarity can lead to great differences in the results.


Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v1i2.372 ◽

2015 ◽

Vol 1 (2) ◽

Cited By ~ 2

Author(s):

Oscar Karnalim

Keyword(s):

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Semantic Relatedness ◽

Space Model


Aplikasi Deteksi Kemiripan Tugas Paper [Application for Detecting Similarity of Paper Assignments]

Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer ◽

10.30812/matrik.v15i2.39 ◽

2017 ◽

Vol 15 (2) ◽

pp. 5

Author(s):

Anthony Anggrawan ◽

Azhari

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Mean Average Precision ◽

Average Precision ◽

Information Searching ◽

Space Model ◽

Model Method

Information searching based on a user's query, which is expected to find the documents that match the user's need, is known as Information Retrieval. This research uses the Vector Space Model method to determine the similarity percentage of each student's assignment. The application is built with PHP and a MySQL database. The results are presented by ranking the similarity of each document to the query, with a mean average precision of 0.874. This value indicates how closely the application agrees with the examination done by experts, and was obtained from an evaluation of 5 queries compared against 25 sample documents. The more assignments are compared, the more time the similarity computation needs; the running time depends on the number of assignments submitted.
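For readers unfamiliar with the metric reported above, mean average precision (MAP) can be computed as in the sketch below. The function names and the input layout (a ranked list of document ids plus a set of expert-judged relevant ids per query) are assumptions for illustration; the paper's own PHP implementation is not shown in the abstract.

```python
def average_precision(ranked_ids, relevant_ids):
    # Average of precision@k taken at each rank k where a relevant item appears.
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / position
    # Relevant items that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs):
    # runs: list of (ranked_ids, relevant_ids) pairs, one per query.
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

With 5 queries, as in the evaluation described, `runs` would hold five such pairs and the result is the mean of the five per-query average precisions.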


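Aplikasi Rekomendasi Buku Pada Katalog Perpustakaan Universitas Multimedia Nusantara Menggunakan Vector Space Model [Book Recommendation Application for the Universitas Multimedia Nusantara Library Catalogue Using the Vector Space Model]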

Jurnal ULTIMATICS ◽

10.31937/ti.v9i2.639 ◽

2018 ◽

Vol 9 (2) ◽

pp. 97-105

Author(s):

Richard Firdaus Oeyliawan ◽

Dennis Gunawan

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Vector Model ◽

Library Management ◽

Space Model ◽

Library Management System ◽

Index Terms ◽

Library Catalogue ◽

Language Sample ◽

F Measure

A library is a facility that provides information and knowledge resources and acts as an academic aid for readers. Because a library usually holds a huge number of books, readers often have difficulty finding the books they need. Universitas Multimedia Nusantara uses the Senayan Library Management System (SLiMS) as its library catalogue. SLiMS has many features that help readers, but it has no recommendation feature to help readers find books relevant to a specific book they choose. The application was developed using the Vector Space Model to represent documents as vectors. Its recommendations are based on the similarity of the books' descriptions. In the testing phase, using a one-language sample of relevant books, the F-Measure obtained was 55% with a cosine similarity threshold of 0.1. The book descriptions and the variety of languages affect the F-Measure obtained. Index Terms: Book Recommendation, Porter Stemmer, SLiMS Universitas Multimedia Nusantara, TF-IDF, Vector Space Model
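The thresholded recommendation and its F-Measure evaluation could be sketched as follows. This is an illustrative assumption, not the application's code: the similarity scores are taken as already computed, a score equal to the threshold is treated as a match, and `recommend` and `f_measure` are hypothetical names.

```python
def recommend(similarities, threshold=0.1):
    # similarities: dict mapping book id -> cosine similarity to the chosen book.
    # Assumption: a score equal to the threshold counts as a recommendation.
    return {book for book, score in similarities.items() if score >= threshold}


def f_measure(recommended, relevant):
    # Balanced F-Measure (F1): harmonic mean of precision and recall.
    true_positives = len(recommended & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(recommended)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)
```

Raising the threshold generally trades recall for precision, which is one reason a reported F-Measure depends on the chosen cutoff.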


First Movers and Follow-on Invention: Evidence from a Vector Space Model of Invention

SSRN Electronic Journal ◽

10.2139/ssrn.3354530 ◽

2019 ◽

Cited By ~ 1

Author(s):

Kenneth A. Younge ◽

Jeffrey M. Kuhn

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Space Model
