The Research and Implementation for Vertical Search Engine of Automobile Information

A vertical search engine provides a more specialized search than a traditional search engine. All data returned by a vertical search engine relates to a single theme, which is chosen by the user. The Vector Space Model is usually used to judge the relevance of web data to the chosen theme. However, when elements of the theme appear repeatedly, the Vector Space Model does not take their order into account. The Evolved Vector Space Model, which adds a new element to the model, is proposed to address this. Experiments show that the new model fixes the problem and performs better at judging relevance.
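The limitation described above can be illustrated with a minimal sketch. It assumes a plain bag-of-words term-frequency vector and whitespace tokenization; the function name and sample phrases are hypothetical, and the abstract does not say what element the Evolved Vector Space Model adds, so only the order-insensitivity of the basic model is shown.

```python
from collections import Counter

def term_vector(text):
    """Bag-of-words term-frequency vector; word order is discarded."""
    return Counter(text.lower().split())

# Two phrases with the same words in a different order (hypothetical data).
a = term_vector("used car price guide")
b = term_vector("guide price car used")

# The vectors are identical, so the basic model cannot tell the phrases apart.
same_vector = (a == b)
```

Any similarity measure computed from these vectors alone will therefore score the two phrases as a perfect match.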

Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v1i2.372 ◽

2015 ◽

Vol 1 (2) ◽

Cited By ~ 2

Author(s):

Oscar Karnalim

Keyword(s):

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Semantic Relatedness ◽

Space Model

A pilot study of a predicate-based vector space model for a biomedical search engine

2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW) ◽

10.1109/bibmw.2011.6112537 ◽

2011 ◽

Cited By ~ 1

Author(s):

Myungjae Kwak ◽

G. Leroy ◽

J. D. Martinez

Keyword(s):

Pilot Study ◽

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Space Model

SISTEM TEMU KEMBALI INFORMASI PADA DOKUMEN DENGAN METODE VECTOR SPACE MODEL

Jurnal Ilmiah FIFO ◽

10.22441/fifo.v9i1.1444 ◽

2017 ◽

Vol 9 (1) ◽

pp. 74

Author(s):

Irmawati Irmawati

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Space Model

Information is now very easy to obtain through the internet, anywhere and at any time. On the other hand, the information returned by a search engine covers everything related to the keywords searched for, which forces users to filter the results to find the relevant documents. A way is therefore needed to group the large amount of available information that users need, so that they can more easily obtain the documents they want. This study proposes a solution to this problem by developing a search method known as information retrieval, together with the Vector Space Model (VSM). In the VSM method, a number of online documents are indexed and ranked by the weights of the search terms that occur in them. One such weighting algorithm is tf-idf, which depends on how often a term occurs in each online document and on the number of online documents that contain the term.
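The tf-idf weighting described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes whitespace tokenization, raw term counts for tf, and log(N / df) for idf, and the function names and the simple summed-weight ranking are hypothetical.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # term frequency within this document
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

def rank(query, docs):
    """Order document indices by the summed weight of the query terms they contain."""
    vectors = tf_idf_vectors(docs)
    terms = query.lower().split()
    scores = [sum(vec.get(t, 0.0) for t in terms) for vec in vectors]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

A term that occurs in every document gets a weight of zero under this idf variant, which is why common words contribute nothing to the ranking.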

Research on Content Analysis Algorithm of Focused Crawler Based on LBTF-IDF

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.971-973.1722 ◽

2014 ◽

Vol 971-973 ◽

pp. 1722-1725

Author(s):

Jun Luo ◽

You Li Lu ◽

Chen Xi Lin

Keyword(s):

Content Analysis ◽

Correlation Analysis ◽

Vector Space ◽

Calculation Method ◽

Vector Space Model ◽

Analysis Method ◽

Analysis Algorithm ◽

Space Model ◽

Weight Calculation ◽

Correlation Analysis Method

This paper focuses on a correlation analysis method based on the vector space model. For the two-class case, it makes a joint comparison to find the most appropriate feature selection method for the focused crawler, and then analyzes and verifies the LBTF-IDF algorithm, in which the weight calculation method has been improved.

SISTEM TEMU KEMBALI INFORMASI PADA DOKUMEN DENGAN METODE VECTOR SPACE MODEL

Jurnal Ilmiah FIFO ◽

10.22441/fifo.2017.v9i1.009 ◽

2017 ◽

Vol 9 (1) ◽

pp. 74

Author(s):

Irmawati Irmawati

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Space Model


Development and evaluation of a biomedical search engine using a predicate-based vector space model

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2013.07.006 ◽

2013 ◽

Vol 46 (5) ◽

pp. 929-939 ◽

Cited By ~ 8

Author(s):

Myungjae Kwak ◽

Gondy Leroy ◽

Jesse D. Martinez ◽

Jeffrey Harwell

Keyword(s):

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Space Model

Rancang Bangun Tabloid Online Bestari dengan Fitur Pencarian berbasis Search Engine Teknologi menggunakan Metode Vector Space Model

Repositor ◽

10.22219/repositor.v2i5.62 ◽

2020 ◽

Vol 2 (5) ◽

pp. 611

Author(s):

Mentari Mas'ama Safitri ◽

Nur Hayatin ◽

Yufis Azhar

Keyword(s):

College Students ◽

Vector Space ◽

Search Engine ◽

Vector Space Model ◽

Academic Community ◽

Cosine Similarity ◽

Space Model ◽

Search Feature

Bestari is a student press organization that serves as the main medium for voicing and documenting the activities of the academic community of Universitas Muhammadiyah Malang (UMM). Bestari also has an online tabloid that students can access, but the tabloid currently has no search feature, so users have difficulty finding the information they want. To address this problem, the Tabloid Online Bestari UMM application was built with a search-engine-based search feature, with the aim of making searching easier for users, especially UMM students. In this case study, the Vector Space Model (VSM) is used to represent the collection of news articles and the user's keywords as vectors weighted with the TF-IDF method; the closeness of each document to the user's keywords is then computed using cosine similarity.
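The cosine similarity step mentioned in this abstract can be sketched as below. This is a minimal illustration under the assumption that documents and queries are stored as sparse {term: weight} dictionaries; the function name and data layout are hypothetical and not taken from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)
```

Because the score is normalized by vector length, a long article and a short one that use the query terms in the same proportions receive the same score.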

Weighted inverse document frequency and vector space model for hadith search engine

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v18.i2.pp1004-1014 ◽

2020 ◽

Vol 18 (2) ◽

pp. 1004

Author(s):

Septya Egho Pratama ◽

Wahyudin Darmalaksana ◽

Dian Sa'adillah Maylawati ◽

Hamdan Sugilar ◽

Teddy Mantoro ◽

...

Keyword(s):

Vector Space ◽

Search Engine ◽

Islamic Law ◽

Vector Space Model ◽

Vector Form ◽

Inverse Document Frequency ◽

Space Model ◽

Document Frequency ◽

Reliable Source ◽

Structured Representation

Hadith is the second source of Islamic law after the Qur’an, so many types of and references to hadith need to be studied. However, not many Muslims know about them, and many have difficulty studying hadith. This study aims to build a hadith search engine from a reliable source using information retrieval techniques. The text is represented as a bag of words (1-term), and the Weighted Inverse Document Frequency (WIDF) method is used to weight the frequency of occurrence of each term before the text is converted to vector form with the Vector Space Model (VSM). In experiments on 380 hadith texts, WIDF with VSM reached a recall of 96%, while precision was only about 35.46%. This is because the bag-of-words (1-gram) representation cannot preserve the meaning of the text well.
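The weighting and the two evaluation measures named in this abstract can be sketched as follows. The sketch assumes the commonly cited definition of WIDF, in which a term's count in a document is divided by its total count across the collection; the abstract does not give the exact variant the authors used, and the function names and whitespace tokenization are hypothetical.

```python
from collections import Counter

def widf_vectors(docs):
    """One {term: weight} dict per document: tf in the document / tf in the collection."""
    tokenized = [doc.lower().split() for doc in docs]
    # Total occurrences of each term over the whole collection.
    collection_tf = Counter()
    for tokens in tokenized:
        collection_tf.update(tokens)
    return [
        {t: count / collection_tf[t] for t, count in Counter(tokens).items()}
        for tokens in tokenized
    ]

def precision_recall(retrieved, relevant):
    """Precision and recall for a set of retrieved ids against a set of relevant ids."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

A result pattern like the one reported (high recall, low precision) corresponds to a retriever that finds almost every relevant text but also returns many irrelevant ones.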

Improving Scalability of Java Archive Search Engine through Recursion Conversion And Multithreading

CommIT (Communication and Information Technology) Journal ◽

10.21512/commit.v10i1.1653 ◽

2016 ◽

Vol 10 (1) ◽

pp. 15 ◽

Cited By ~ 1

Author(s):

Oscar Karnalim

Keyword(s):

Vector Space ◽

Search Engine ◽

Processing Time ◽

Vector Space Model ◽

Semantic Relatedness ◽

Low Rank ◽

Space Model ◽

Before And After ◽

Strongly Connected ◽

Rank Vector

Because bytecode is always present in a Java archive, a bytecode-based Java archive search engine had been developed [1, 2]. Although this system is quite effective, it still lacks scalability, since many modules use recursive calls and the system uses only one core (a single thread). In this research, the Java archive search engine architecture is redesigned to improve its scalability. All recursions are converted to iterative form, even though most of these modules are logically recursive and quite difficult to convert (e.g. Tarjan’s strongly connected components algorithm). Each recursion is converted by following its recursive pattern: it is broken down into four parts (the before and after actions of the current item and of its children) and converted to iteration with the help of a caller reference. This conversion improves scalability by avoiding the stack overflow errors caused by method calls. Scalability is also improved by multithreading, which reduces processing time; shorter processing time may enable the system to handle larger data. Multithreading is applied to the major parts: the indexer, the vector space model (VSM) retriever, the low-rank vector space model (LRVSM) retriever, and the semantic relatedness calculator (which also involves multiple processes). The correctness of both the recursion conversion and the multithreaded design is supported by the fact that all implementations yield similar results.
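The idea of replacing recursive calls with an explicit stack can be sketched as below. This is an illustration in Python rather than the paper's Java code, and it is not the authors' four-part scheme: it only shows a depth-first traversal in which hypothetical `before` and `after` callbacks play the role of the actions that a recursive version would run before and after visiting a node's children.

```python
_DONE = object()  # sentinel marking an exhausted child iterator

def dfs_iterative(graph, root, before, after):
    """Depth-first traversal of an adjacency-list graph without recursion.

    before(node) runs when a node is first reached; after(node) runs once all
    of its children are finished, as the tail of a recursive call would.
    """
    visited = {root}
    before(root)
    # Each stack frame keeps the node and an iterator over its remaining children,
    # which replaces the state a recursive call would keep on the call stack.
    stack = [(root, iter(graph.get(root, ())))]
    while stack:
        node, children = stack[-1]
        child = next(children, _DONE)
        if child is _DONE:
            stack.pop()
            after(node)
        elif child not in visited:
            visited.add(child)
            before(child)
            stack.append((child, iter(graph.get(child, ()))))
    return visited
```

Because the frames live in an ordinary list, the traversal depth is limited by available memory rather than by the call stack, which is the scalability benefit the abstract describes.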
