Broadening Vector Space Schemes for Improving the Quality of Information Retrieval

Author(s):  
Kotagiri Ramamohanarao ◽  
Laurence A. F. Park
2016 ◽  
Vol 78 (5-6) ◽  
Author(s):  
Jasman Pardede ◽  
Milda Gustiana Husada

The vector space model (VSM) is an information retrieval (IR) model that represents queries and documents as n-dimensional vectors. The generalized vector space model (GVSM) extends VSM by representing documents according to the similarity between the query and the minterm vector space of the document collection. Minterm vectors are defined by the terms in the query, so documents can be retrieved based on the meaning of the words in the query. However, documents can also contain the same information semantically. Latent semantic indexing (LSI) is a method used in IR systems to retrieve documents based on the overall meaning of the user's query rather than on each individual word. LSI uses a matrix algebra technique called singular value decomposition (SVD). This study compares the performance of VSM, GVSM, and LSI implemented in an IR system that retrieves Indonesian-language documents stored as .pdf, .doc, and .docx files, using the Nazief and Adriani stemming algorithm. Each method is implemented both with and without threads. Threads are used during preprocessing, both when reading each document from the collection and when stemming the query and the documents. Retrieval quality is evaluated by response time, recall, precision, and F-measure. The results show that, for each method, .docx files are processed fastest, followed by .doc and then .pdf. For the same document collection, LSI has the fastest response time, followed by GVSM and then VSM. The average recall for VSM, GVSM, and LSI is 82.86%, 89.68%, and 84.93%, respectively. The average precision is 64.08%, 67.51%, and 62.08%, respectively. The average F-measure is 71.95%, 76.63%, and 71.02%, respectively. Multithreaded preprocessing improves the average response time of VSM, GVSM, and LSI by about 30.422%, 26.282%, and 31.821%, respectively.
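
The recall, precision, and F-measure reported in this abstract can be illustrated with a minimal Python sketch for a single query. The document IDs and relevance judgments below are hypothetical, and the F-measure is assumed to be the balanced harmonic mean (F1); the abstract does not say which variant the authors used or how they averaged across queries.

```python
def evaluate(retrieved, relevant):
    """Return (precision, recall, f_measure) for one query.

    Assumption: f_measure is the balanced F1 score.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical example: 4 documents retrieved, 3 relevant, 2 in common.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5"]
p, r, f = evaluate(retrieved, relevant)
print(f"precision={p:.4f} recall={r:.4f} F={f:.4f}")
```

Averaging the per-query F-measure over many queries generally gives a different number than applying the formula to the averaged precision and recall, which may be why the reported averages do not satisfy the formula exactly.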


2019 ◽  
Author(s):  
Rusda Wajhillah ◽  
Agung Wibowo ◽  
Saeful Bahri

Research quality needs to be directed and classified in order to improve. A college's research roadmap must accord with the interests and expertise of its lecturers, so it is the duty of every college to create a strategic plan and pre-eminent research. Faculty at almost every college have produced many scientific publications. Scientific publications are one example of unstructured documents: their content is shaped by writing style, which is largely determined by the author's language. Generally, only the maximum number of words in a document title is prescribed. The main objective of an information retrieval system is to determine, within a group of documents, the document keywords that match the query provided by the user. The TF-IDF (term frequency-inverse document frequency) algorithm and the vector space model are among the methods that can be used in the analysis phase of text mining, as an option for classifying documents based on the words that appear most often in document titles. This paper can help decision makers determine, assess, and adapt a college's research roadmap. Depicting the long-term roadmap as a tree model makes it easier to read and understand.
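
As a rough illustration of how TF-IDF weighting and the vector space model can rank document titles against a query, here is a minimal Python sketch. The titles, the query, the whitespace tokenizer, and the smoothed IDF formula `log(N / df) + 1` are all assumptions made for this example; the abstract does not specify the authors' weighting or preprocessing.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def build_idf(docs_tokens):
    # Assumed IDF variant: log(N / df) + 1, so every known term keeps some weight.
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    # Sparse vector as a dict; terms missing from the collection are ignored.
    tf = Counter(tokens)
    return {term: tf[term] * idf[term] for term in tf if term in idf}


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, titles):
    docs = [tokenize(title) for title in titles]
    idf = build_idf(docs)
    q_vec = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q_vec, tfidf_vector(d, idf)), t) for d, t in zip(docs, titles)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical publication titles and query.
titles = [
    "text mining for document classification",
    "vector space model for information retrieval",
    "image segmentation with neural networks",
]
ranking = rank("information retrieval model", titles)
for score, title in ranking:
    print(f"{score:.3f}  {title}")
```

Titles that share no term with the query score zero, so only the second title is ranked above the rest here.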


Author(s):  
Hilton H. Mollenhauer

Many factors (e.g., resolution of the microscope, type of tissue, and preparation of the sample) affect electron microscopical images and alter the amount of information that can be retrieved from a specimen. Of interest in this report are those factors associated with the evaluation of epoxy-embedded tissues. In this context, information retrieval is dependent, in part, on the ability to “see” sample detail (e.g., contrast) and, in part, on the quality of sample preservation. Two aspects of this problem will be discussed: 1) epoxy resins and their effect on image contrast, information retrieval, and sample preservation; and 2) the interaction between some stains commonly used for enhancing contrast and information retrieval.


2019 ◽  
Vol 27 (2) ◽  
pp. 119-133
Author(s):  
Putri Aprilia Isnaini ◽  
Ida Bagus Nyoman Udayana

This study examines the effect of information quality and service quality on attitudes toward using an application system, with ease of use of the system as an intervening variable, in online transportation services (Gojek) in Yogyakarta. The sample consists of customers who use online motorcycle transportation services in Yogyakarta, selected by accidental sampling. Data were collected through online questionnaires created in Google Forms and distributed via social media such as WhatsApp and Instagram, using a 1-4 scale to measure 4 indicators. The results of this study show that 1) the quality of information affects the ease of use, 2) the quality of service affects the ease of use, 3) the quality of information influences attitudes in use, 4) the quality of services does not affect attitudes in use, and 5) ease of use attitude in use.


2020 ◽  
Vol 28 (1) ◽  
pp. 44
Author(s):  
Johar Arifin ◽  
Ilyas Husti ◽  
Khairunnas Jamal ◽  
Afriadi Putra

This article aims to explain maqâṣid al-Qur’ân according to M. Quraish Shihab and its application in interpreting verses related to the use of social media. The article addresses two main questions: what M. Quraish Shihab's perspective on maqâṣid al-Qur’ân is, and how it is applied in interpreting verses on the use of social media. The method used is the thematic method, namely discussing verses based on themes. From this study the authors conclude that, according to M. Quraish Shihab, the universal goals of the al-Qur’ân fall into six major groups, namely strengthening the faith, humans as caliphs, unifying books, law enforcement, callers to the ummah of wasathan, and mastering world civilization. The quality of information lies in the strength of the monotheistic dimension, which is the highest peak of the Qur’anic maqâṣid. M. Quraish Shihab offers six dictions that recipients of information can apply when interacting on social media. These aim to usher in the knowledge and understanding of what is conveyed in carrying out the human mission as caliph, enlightenment through oral and written means, law enforcement, unifying mankind and the universe toward the ummah of wasathan, and mastery of world civilization.


2020 ◽  
Vol 6 (3) ◽  
pp. 158-164
Author(s):  
Navruza Yakhyayeva ◽  

The quality and content of information in media texts rest on a scientific classification of linguistic features. Studying the functional styles of speech, identifying their linguistic signs, discovering the functional properties of linguistic units, and separating them on the basis of linguistic facts is one of the tasks awaiting a solution in modern linguistics. Text linguistics, which deals with the creation of texts, the modeling of their structure, and the study of that process, is of interest to journalists today as a science.

