Web-Based Information Search System Development Using a Semantic Network

Finding information in a large collection of documents is a complicated task, which is why information retrieval systems are needed. Models used in such systems include the Vector Space Model (VSM), DICE Similarity, Latent Semantic Indexing (LSI), the Generalized Vector Space Model (GVSM), and semantic-based information retrieval. The purpose of this study was to develop a semantic network-based search system that finds information based on both the keywords provided by users and the semantic relationships of those keywords. Most search systems cannot do this, because they rely only on keyword matching or similarity. The Waterfall development model was used, which divides development into five stages: (1) requirements analysis and definition; (2) system and software design; (3) implementation and unit testing; (4) integration and system testing; and (5) operation and maintenance. The system was tested by searching for information with various combinations of user-provided keywords. The results showed that the system can find information that matches the keywords, as well as other relevant information based on the semantic relationships of those keywords.

Keywords: information retrieval, search system, semantic network, web-based application
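The idea of returning both keyword matches and semantically related results can be illustrated with a minimal sketch. The abstract does not describe the system's data model, so everything here is an assumption made for illustration: the `SEMANTIC_NETWORK` dictionary, the `related_terms` and `semantic_search` helpers, and the toy documents are all hypothetical.

```python
from collections import deque

# Hypothetical semantic network: each keyword maps to directly related keywords.
SEMANTIC_NETWORK = {
    "retrieval": ["search", "indexing"],
    "search": ["retrieval", "query"],
    "indexing": ["retrieval"],
    "query": ["search"],
}


def related_terms(keyword, network, max_hops=1):
    """Breadth-first walk: terms reachable from keyword within max_hops edges."""
    seen = {keyword}
    frontier = deque([(keyword, 0)])
    while frontier:
        term, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for neighbour in network.get(term, []):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, hops + 1))
    return seen - {keyword}


def semantic_search(keyword, docs, network, max_hops=1):
    """Return (direct matches, matches found only through related terms)."""
    keyword = keyword.lower()
    expanded = related_terms(keyword, network, max_hops)
    direct, related = [], []
    for doc_id, text in docs.items():
        tokens = set(text.lower().split())
        if keyword in tokens:
            direct.append(doc_id)
        elif tokens & expanded:
            related.append(doc_id)
    return direct, related


# Hypothetical toy collection.
docs = {
    "d1": "information retrieval with vectors",
    "d2": "web search engines",
    "d3": "cooking recipes",
}
direct, related = semantic_search("retrieval", docs, SEMANTIC_NETWORK)
print(direct, related)
```

A pure keyword matcher would return only `d1` for the query "retrieval"; following one edge in the network also surfaces `d2`, which never mentions the keyword itself.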


On Generalized Vector Space Model in Information Retrieval

Fundamenta Informaticae ◽

10.3233/fi-1985-8207 ◽

1985 ◽

Vol 8 (2) ◽

pp. 253-267

Author(s):

S.K.M. Wong ◽

Wojciech Ziarko

Keyword(s):

Information Retrieval ◽

Vector Space ◽

A Priori ◽

Vector Space Model ◽

Smart System ◽

Space Model ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Index Terms ◽

Minimal Modification

In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The main difficulty with this approach is that the explicit representation of term vectors is not known a priori. For this reason, the vector space model adopted by Salton for the SMART system treats the terms as a set of orthogonal vectors. In such a model it is often necessary to adopt a separate, corrective procedure to take into account the correlations between terms. In this paper, we propose a systematic method (the generalized vector space model) to compute term correlations directly from the automatic indexing scheme. We also demonstrate how such correlations can be included with minimal modification in existing vector-based information retrieval systems.
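The role of the orthogonality assumption can be made explicit with a short derivation. The notation below is an assumption chosen for illustration, not the paper's own: $\mathbf{t}_i$ are term vectors, and $d_i$ and $q_j$ are the term weights of a document and a query. Length normalization is omitted, and the paper's procedure for estimating the correlations is not reproduced here.

```latex
\mathbf{d} = \sum_{i=1}^{n} d_i \,\mathbf{t}_i, \qquad
\mathbf{q} = \sum_{j=1}^{n} q_j \,\mathbf{t}_j

\operatorname{sim}(\mathbf{q}, \mathbf{d})
  = \mathbf{q} \cdot \mathbf{d}
  = \sum_{i=1}^{n} \sum_{j=1}^{n} d_i \, q_j \,(\mathbf{t}_i \cdot \mathbf{t}_j)

% If the terms are orthonormal, t_i . t_j = \delta_{ij}, and the double sum collapses:
\operatorname{sim}(\mathbf{q}, \mathbf{d}) = \sum_{i=1}^{n} d_i \, q_i
```

The cross terms $\mathbf{t}_i \cdot \mathbf{t}_j$ with $i \neq j$ are exactly the term correlations that the orthogonal model discards.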


Temu Kembali Informasi pada Soal Ujian dengan Rencana Pembelajaran Menggunakan Vector Space Model [Information Retrieval on Exam Questions with Learning Plans Using the Vector Space Model]

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i1.2739 ◽

2021 ◽

Vol 5 (1) ◽

pp. 63-68

Author(s):

Amalia Beladinna Arifa ◽

Gita Fadila Fitriana ◽

Ananda Rifkiy Hasan

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Space Model ◽

Learning Plan ◽

Model Method ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Calculation Results ◽

The Subject

One way to assess the quality of exam questions is to check them against the rules for writing exam questions, which are based on the subjects or topics contained in the learning plan document. The exam questions must therefore be aligned with the main material of each subject's learning outcomes. This study applies information retrieval concepts using the Vector Space Model method. The Vector Space Model has an advantage in query matching because it can match documents that contain only part of the query. It is also easy to adapt by adjusting parameters, including the weighting parameters. The weight of each term that appears in a document is calculated with TF-IDF. The purpose of this study is to design an information retrieval system that finds how well an exam question, used as the query, fits the subjects contained in the learning plan document. The results are sorted by similarity value, expressed as a percentage, from largest to smallest.
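The pipeline described above (TF-IDF weighting, partial query matching, results sorted by similarity percentage) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the use of cosine similarity, and the sample topics are all choices made here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_idf(docs_tokens):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Raw term frequency times IDF; terms unseen in the corpus are dropped."""
    tf = Counter(tokens)
    return {term: freq * idf[term] for term, freq in tf.items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_id, similarity as a percentage) pairs, largest first."""
    docs_tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    idf = build_idf(list(docs_tokens.values()))
    query_vec = tfidf_vector(tokenize(query), idf)
    scores = [
        (doc_id, 100.0 * cosine(query_vec, tfidf_vector(tokens, idf)))
        for doc_id, tokens in docs_tokens.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical learning-plan topics and one exam question used as the query.
learning_plan = {
    "topic_1": "vector space model and term weighting",
    "topic_2": "relational database normalization",
    "topic_3": "boolean retrieval and inverted index",
}
ranking = rank("explain term weighting in the vector space model", learning_plan)
for doc_id, pct in ranking:
    print(f"{doc_id}: {pct:.1f}%")
```

The query shares only some of its words with `topic_1`, yet that topic still receives a non-zero score, which is the partial-matching behaviour the abstract refers to.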


Analysis of a Vector Space Model, Latent Semantic Indexing and Formal Concept Analysis for Information Retrieval

Cybernetics and Information Technologies ◽

10.2478/cait-2012-0003 ◽

2012 ◽

Vol 12 (1) ◽

pp. 34-48 ◽

Cited By ~ 11

Author(s):

Ch. Aswani Kumar ◽

M. Radvansky ◽

J. Annapurna

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Formal Concept Analysis ◽

Vector Space Model ◽

Latent Semantic Indexing ◽

Concept Analysis ◽

Formal Concept ◽

Semantic Indexing ◽

Space Model ◽

Classical Vector

Latent Semantic Indexing (LSI), a variant of the classical Vector Space Model (VSM), is an Information Retrieval (IR) model that attempts to capture the latent semantic relationship between the data items. Mathematical lattices, under the framework of Formal Concept Analysis (FCA), represent conceptual hierarchies in data and retrieve the information. However, both LSI and FCA use the data represented in the form of matrices. The objective of this paper is to systematically analyze VSM, LSI and FCA for the task of IR using standard and real-life datasets.


Optimising the Heuristics in Latent Semantic Indexing for Effective Information Retrieval

Journal of Information & Knowledge Management ◽

10.1142/s0219649206001359 ◽

2006 ◽

Vol 05 (02) ◽

pp. 97-105 ◽

Cited By ~ 3

Author(s):

S. Srinivas ◽

Ch. Aswani Kumar

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Latent Semantic Indexing ◽

Semantic Indexing ◽

Retrieval Performance ◽

Term Weighting ◽

Space Model ◽

Rank Approximation

Latent Semantic Indexing (LSI) is a well-known Information Retrieval (IR) technique that tries to overcome the problems of lexical matching by using conceptual indexing. LSI is a variant of the vector space model and has proved to be 30% more effective. Many studies have reported that good retrieval performance is related to the use of various retrieval heuristics. In this paper, we focus on optimising two LSI retrieval heuristics: term weighting and rank approximation. The results obtained demonstrate that LSI performance improves significantly with the combination of optimised term weighting and rank approximation.
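For orientation, the rank approximation heuristic can be written out using the standard textbook formulation of LSI. The symbols are assumptions introduced here, not taken from the paper: $A$ is the weighted term-document matrix (terms as rows) and $k$ is the chosen rank.

```latex
A = U \Sigma V^{\top}
\qquad \text{(singular value decomposition)}

A_k = U_k \Sigma_k V_k^{\top}, \qquad k < \operatorname{rank}(A)
\qquad \text{(keep the $k$ largest singular values)}

\hat{\mathbf{q}} = \Sigma_k^{-1} U_k^{\top} \mathbf{q}
\qquad \text{(fold a query vector into the $k$-dimensional space)}
```

Documents correspond to the rows of $V_k$ and are compared with $\hat{\mathbf{q}}$, typically by cosine similarity. Term weighting determines the entries of $A$ and rank approximation determines $k$, the two heuristics the paper optimises.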


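Aplikasi Sistem Temu Kembali Angket Mahasiswa Menggunakan Metode Generalized Vector Space Model [Student Questionnaire Retrieval System Application Using the Generalized Vector Space Model Method]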

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2019611184 ◽

2019 ◽

Vol 6 (1) ◽

pp. 33 ◽

Cited By ~ 1

Author(s):

Suprianto Suprianto ◽

Abdul Fadlil ◽

Sunardi Sunardi

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Web Based ◽

Administrative Services ◽

Space Model ◽

Existing Problems ◽

Student Assessments ◽

Student Questionnaire ◽

The Given

Many things can be done to advance a university, one of which is evaluating the student questionnaire every semester. One of the universities in Tarakan City, North Kalimantan, is STMIK PPKIA Tarakanita Rahmawati. The large amount of data in PPKIA student questionnaires makes it difficult for users to find information that matches the given keywords. Student questionnaires contain student assessments of lecturer teaching, administrative services, and campus facilities, given on a form by selecting grades from very bad to very good. There are also assessments in essay form, consisting of suggestions and comments. The questionnaire is filled in at the end of the current semester. This study aims to find questionnaire data that is relevant to the keywords. The application is web-based and built with the PHP programming language. An application that relies only on a database has the disadvantage of not being able to sort documents according to keywords, because documents are sorted only by their order in the database. By applying information retrieval (IR) techniques in the application, users are greatly helped in finding the information they need. The application can display and sort the documents most similar to the keywords. It is built with the Generalized Vector Space Model (GVSM) method as the basis for solving the existing problems. GVSM is an IR method, commonly called a retrieval system, that matches the terms or words of the keywords used. In trials with 5 keywords, a precision of 72% and a recall of 100% were obtained, with a processing time of 34.4 seconds.
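Precision and recall, the two figures reported above, can be computed as in the following sketch. The `precision_recall` helper and the sample identifiers are hypothetical; the numbers below are toy values and do not reproduce the study's data.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: five documents returned, three of them relevant,
# and no relevant document missed.
retrieved = ["q1", "q2", "q3", "q4", "q5"]
relevant = ["q1", "q2", "q3"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.0%} recall={recall:.0%}")
```

The toy case shows how recall can reach 100% while precision stays lower: every relevant document is returned, along with some that are not relevant.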


Comparing and combining the effectiveness of latent semantic indexing and the ordinary vector space model for information retrieval

Information Processing & Management ◽

10.1016/0306-4573(89)90100-3 ◽

1989 ◽

Vol 25 (6) ◽

pp. 665-676 ◽

Cited By ~ 31

Author(s):

Karen E. Lochbaum ◽

Lynn A. Streeter

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Latent Semantic Indexing ◽

Semantic Indexing ◽

Space Model


Aplikasi Deteksi Kemiripan Tugas Paper [Paper Assignment Similarity Detection Application]

Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer ◽

10.30812/matrik.v15i2.39 ◽

2017 ◽

Vol 15 (2) ◽

pp. 5

Author(s):

Anthony Anggrawan ◽

Azhari

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Mean Average Precision ◽

Average Precision ◽

Information Searching ◽

Space Model ◽

Model Method

Information searching based on a user's query, with the aim of finding documents that meet the user's need, is known as Information Retrieval. This research uses the Vector Space Model method to determine the similarity percentage of each student's assignment. The application is built with PHP and a MySQL database. The results are presented by ranking documents by their similarity to the query, with a mean average precision of 0.874. This value indicates how closely the application agrees with the assessment made by experts; it was obtained from an evaluation with 5 queries compared against 25 sample documents. The time needed to compute similarity depends on the number of assignments submitted: the more assignments there are, the longer the computation takes.
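Mean average precision, the metric reported above, can be computed as in this minimal sketch. It follows the common definition (average precision per query, then the mean over queries); the function names and the two toy queries are assumptions for illustration and are unrelated to the study's 5 queries and 25 documents.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Two hypothetical queries with expert relevance judgments.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6", "d7"], {"d6"}),              # AP = (1/2) / 1
]
print(round(mean_average_precision(runs), 3))
```

Because precision is sampled only at the ranks of relevant documents, the metric rewards rankings that place the expert-approved documents near the top.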


A relational vector-space model of information retrieval adapted to images

ACM SIGIR Forum ◽

10.1145/1067268.1067292 ◽

2005 ◽

Vol 39 (1) ◽

pp. 62-62

Author(s):

Jean Martinet

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Space Model


The Performance of Boolean Retrieval and Vector Space Model in Textual Information Retrieval

CommIT (Communication and Information Technology) Journal ◽

10.21512/commit.v11i1.2108 ◽

2017 ◽

Vol 11 (1) ◽

pp. 33 ◽

Cited By ~ 1

Author(s):

Budi Yulianto ◽

Widodo Budiharto ◽

Iman Herwidiana Kartowisastro

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Experimental Results ◽

Inverted Index ◽

Exact Results ◽

Textual Information ◽

Space Model ◽

Corpus Data

Boolean Retrieval (BR) and the Vector Space Model (VSM) are very popular methods in information retrieval for creating an inverted index and querying terms. The BR method returns exact matches for a textual query without ranking the results, whereas the VSM method both searches and ranks the results. This study empirically compares the two methods. The research uses a sample of corpus data obtained from Reuters. The experimental results show that the times required by the two methods to produce an inverted index are nearly the same. However, the two methods differ when querying the index. The results also show that the number of generated indexes, the sizes of the generated files, and the duration of reading and searching an index are proportional to the number of files in the corpus and the file size.
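The contrast between the two querying styles over a shared inverted index can be sketched as follows. This is an illustrative simplification, not the paper's implementation: the Boolean query is restricted to AND, a summed raw term frequency stands in for the full VSM weighting, and the three sample documents are hypothetical rather than taken from the Reuters corpus.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def boolean_and(index, query):
    """Unranked exact match: documents containing every query term."""
    postings = [set(index.get(term, {})) for term in query.lower().split()]
    if not postings:
        return set()
    return set.intersection(*postings)


def ranked_search(index, query):
    """Ranked partial match: score is the summed frequency of matched query terms."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical mini-corpus.
docs = {
    "a": "oil prices rise",
    "b": "oil oil exports fall",
    "c": "grain prices fall",
}
index = build_inverted_index(docs)
print(boolean_and(index, "oil prices"))
print(ranked_search(index, "oil prices"))
```

Both functions read the same index, which is consistent with the reported finding that index construction costs are similar and the difference appears at query time: the Boolean query returns only the document containing every term, while the ranked query returns every partially matching document in score order.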


Automatic natural acquisition of a semantic network for information retrieval systems

10.1117/12.56895 ◽

1992 ◽

Author(s):

Chantal Enguehard ◽

Pierre Malvache ◽

Philippe Trigano

Keyword(s):

Information Retrieval ◽

Semantic Network ◽

Retrieval Systems ◽

Information Retrieval Systems
