vector space model
Recently Published Documents


TOTAL DOCUMENTS

484
(FIVE YEARS 104)

H-INDEX

24
(FIVE YEARS 3)

Author(s):  
Jiffriya Mohamed Abdul Cader ◽  
Roshan G. Ragel ◽  
Hasindu Gamaarachchi ◽  
Akmal Jahan Mohamed Abdul Cader

2021 ◽  
Vol 2 (2) ◽  
pp. 114-121
Author(s):  
Ayuni Asistyasari ◽  
Bibit Sudarsono ◽  
Umi Faddilah

News circulating in print or mainstream media shapes public opinion on an issue, whether the information is positive or negative, and today's information technology lets information spread and stay up to date every day. The more easily information spreads, the more easily it influences life in today's society. In reality, however, not all information circulating in the media is true; some of it is hoax news. This study aims to classify hoax news in an information retrieval system using the vector space model to verify whether a news item is a hoax or not. The study achieved its best classification accuracy at K-6, namely 83%, meaning the system can validate whether a news item is genuine or a hoax with 83% accuracy.
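The K-6 setting above suggests a k-nearest-neighbor vote over vector space model representations. A minimal sketch of that idea, using scikit-learn with made-up toy headlines and labels (not the study's data or pipeline):

```python
# Toy sketch: TF-IDF document vectors plus a k=6 nearest-neighbor vote,
# mirroring the K-6 configuration reported in the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical labeled headlines: 1 = hoax, 0 = legitimate.
train_docs = [
    "miracle cure discovered doctors hate it",
    "government confirms new vaccination schedule",
    "celebrity secretly replaced by clone",
    "central bank announces interest rate decision",
    "aliens built the pyramids scientists admit",
    "city council approves new transit budget",
    "drinking bleach prevents infection",
    "university publishes annual research report",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(train_docs)

# Cosine distance on TF-IDF vectors is the usual VSM similarity measure;
# distance-weighted voting lets close neighbors dominate the decision.
knn = KNeighborsClassifier(n_neighbors=6, metric="cosine", weights="distance")
knn.fit(X, labels)

query = vectorizer.transform(["miracle cure aliens admit doctors hate it"])
print(knn.predict(query)[0])
```

A real system would of course train on a labeled hoax-news corpus and tune k on held-out data.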


Author(s):  
Roman Shaptala ◽  
Gennadiy Kyselov

In this study, we explore and compare two ways of vector space model creation for Kyiv city petitions. Both models are built on top of word vectors based on the distributional hypothesis, namely Word2Vec and FastText. We train word vectors on the dataset of Kyiv city petitions, preprocess the documents, and apply averaging to create petition vectors. Visualizations of the vector spaces after dimensionality reduction via UMAP are demonstrated in an attempt to show their overall structure. We show that the resulting models can be used to effectively query semantically related petitions as well as search for clusters of related petitions. The advantages and disadvantages of both models are analyzed.
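The averaging step that turns word vectors into petition vectors can be sketched with toy embeddings (the names and 4-dimensional vectors below are hypothetical; real vectors would come from Word2Vec or FastText trained on the petition corpus):

```python
import numpy as np

# Toy 4-dimensional word vectors standing in for trained embeddings.
word_vectors = {
    "repair": np.array([0.9, 0.1, 0.0, 0.2]),
    "road":   np.array([0.8, 0.2, 0.1, 0.1]),
    "school": np.array([0.1, 0.9, 0.3, 0.0]),
    "build":  np.array([0.5, 0.5, 0.2, 0.1]),
}

def petition_vector(tokens, vectors):
    """Average the vectors of known tokens to get one document vector."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:  # no token in the vocabulary: return a zero vector
        return np.zeros(next(iter(vectors.values())).shape)
    return np.mean(known, axis=0)

# Unknown token "pothole" is simply skipped before averaging.
doc = petition_vector(["repair", "road", "pothole"], word_vectors)
print(doc)
```

FastText would handle the out-of-vocabulary token via subword vectors instead of skipping it, which is one of the trade-offs the study compares.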


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Cheng Ye ◽  
Bradley A. Malin ◽  
Daniel Fabbri

Abstract Background Information retrieval (IR) helps clinicians answer questions posed to large collections of electronic medical records (EMRs), such as how best to identify a patient’s cancer stage. One of the more promising approaches to IR for EMRs is to expand a keyword query with similar terms (e.g., augmenting cancer with mets). However, there is a large range of clinical chart review tasks, so a fixed set of similar terms is insufficient. Current language models, such as Bidirectional Encoder Representations from Transformers (BERT) embeddings, do not capture the full non-textual context of a task. In this study, we present new methods that provide similar terms dynamically by adjusting to the context of the chart review task. Methods We introduce a medical-context vector space in which each word is represented by a vector that captures the word’s usage in different medical contexts (e.g., how frequently cancer is used when ordering a prescription versus describing family history) beyond the context learned from the surrounding text. These vectors are transformed into a vector space for customizing the set of similar terms selected for different chart review tasks. We evaluate the vector space model with multiple chart review tasks, in which supervised machine learning models learn to predict the preferred terms of clinically knowledgeable reviewers. To quantify the usefulness of the predicted similar terms relative to a baseline of standard word2vec embeddings, we measure (1) the prediction performance of the medical-context vector space model using the area under the receiver operating characteristic curve (AUROC) and (2) the labeling effort required to train the models. Results The vector space outperformed the baseline word2vec embeddings in all three chart review tasks with an average AUROC of 0.80 versus 0.66, respectively.
Additionally, the medical-context vector space significantly reduced the number of labels required to learn and predict the preferred similar terms of reviewers. Specifically, the labeling effort was reduced to 10% of the entire dataset in all three tasks. Conclusions The set of preferred similar terms that are relevant to a chart review task can be learned by leveraging the medical context of the task.
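The AUROC metric used above can be computed with scikit-learn; a minimal sketch with hypothetical reviewer labels (1 = term preferred for the task) and model scores for four candidate similar terms:

```python
from sklearn.metrics import roc_auc_score

# Hypothetical ground-truth labels and model scores for four candidate
# similar terms; AUROC measures how well the scores rank positives first.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc_score(labels, scores)
print(auc)  # 0.75
```

Here 3 of the 4 positive/negative score pairs are correctly ordered, giving an AUROC of 0.75; a perfect ranker scores 1.0 and a random one about 0.5.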


2021 ◽  
Author(s):  
Sukisno Sukisno

The study in this book aims to help users categorize the documents they need quickly and accurately. With an application for document categorization that applies the Nazief-Adriani stemming algorithm and the K-Nearest Neighbor algorithm, it is expected that categorizing documents becomes easier and that users can more easily find documents based on the degree of similarity between a test document and the learning documents.
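The similarity step in such a pipeline is typically cosine similarity over term-frequency vectors. A minimal pure-Python sketch (the stemming step is omitted; a real pipeline would apply Nazief-Adriani to the tokens first, and the example tokens are made up):

```python
import math
from collections import Counter

def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between two token lists via term-frequency vectors."""
    tf_a, tf_b = Counter(doc_a), Counter(doc_b)
    shared = set(tf_a) & set(tf_b)
    dot = sum(tf_a[t] * tf_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Two of three stemmed tokens overlap, so similarity is 2/3.
test_doc = ["sistem", "informasi", "dokumen"]
learning_doc = ["sistem", "dokumen", "arsip"]
print(round(cosine_similarity(test_doc, learning_doc), 3))
```

K-Nearest Neighbor categorization then assigns the test document the majority category among the k learning documents with the highest similarity.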


Author(s):  
Yongmin Yoo ◽  
Dongjin Lim ◽  
Kyungsun Kim

Thanks to the rapid development of artificial intelligence in recent years, AI technology now contributes to many parts of society, with a very large impact on education, the environment, medical care, the military, tourism, the economy, politics, and more. In education, for example, artificial intelligence tutoring systems automatically assign tutors based on a student's level. In economics, quantitative investment methods automatically analyze large amounts of data to find investment rules, build investment models, or predict changes in financial markets. Because artificial intelligence technology is used in so many fields, it is very important to know exactly which factors have an important influence in each field and how the fields are related to each other, which requires analyzing artificial intelligence technology field by field. In this paper, we analyze patent documents related to artificial intelligence technology and propose a method for keyword analysis within factors using artificial intelligence patent data sets. The model relies on feature engineering based on the deep learning model KeyBERT and uses a vector space model. A case study of collecting and analyzing artificial intelligence patent data shows how the proposed model can be applied to real-world problems.
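KeyBERT's core idea can be sketched without the library itself: embed the document and each candidate term, then rank candidates by cosine similarity to the document vector. The toy 3-dimensional embeddings below are made up stand-ins for the BERT sentence embeddings KeyBERT actually uses:

```python
import numpy as np

# Toy embeddings standing in for BERT vectors; KeyBERT embeds the document
# and candidate terms with a sentence-transformer model instead.
embeddings = {
    "patent":  np.array([0.9, 0.2, 0.1]),
    "neural":  np.array([0.2, 0.9, 0.3]),
    "network": np.array([0.3, 0.8, 0.4]),
    "banana":  np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Document vector: average of its token embeddings. Candidates most
# similar to the document vector become the extracted keywords.
doc_vec = np.mean([embeddings[t] for t in ["neural", "network", "patent"]], axis=0)
ranked = sorted(embeddings, key=lambda t: cosine(embeddings[t], doc_vec), reverse=True)
print(ranked[0])
```

The off-topic candidate ("banana") falls to the bottom of the ranking, which is exactly the filtering behavior keyword extraction relies on.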


2021 ◽  
Vol 10 (2) ◽  
Author(s):  
Eka Sabna

The collection of stored student thesis titles is large and keeps growing, and finding information among those titles is becoming difficult. For this reason, search methods known as information retrieval were developed. Information retrieval methods have long been known; one of the most widely used, thanks to its ease of implementation, is the vector space model (VSM). The aim of this study is to describe the process of searching digital documents with the vector space model. In this model, the thesis-title data is tokenized and indexed; a search is then performed with the given keywords, which are compared against the data in the thesis-title document file, so that the best-matching titles are found and correct information is produced.
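The tokenize-and-index step described above can be sketched as a small inverted index (the thesis titles below are hypothetical placeholders for the real collection):

```python
from collections import defaultdict

# Hypothetical thesis titles standing in for the stored collection.
titles = {
    1: "sistem informasi perpustakaan berbasis web",
    2: "penerapan vector space model untuk pencarian dokumen",
    3: "analisis sentimen media sosial dengan naive bayes",
}

# Indexing step: map each token to the set of titles that contain it.
index = defaultdict(set)
for doc_id, title in titles.items():
    for token in title.lower().split():  # tokenization step
        index[token].add(doc_id)

def search(query):
    """Return IDs of titles containing every query keyword."""
    sets = [index.get(tok, set()) for tok in query.lower().split()]
    return set.intersection(*sets) if sets else set()

print(search("pencarian dokumen"))  # {2}
```

A full VSM system would additionally weight the matched terms (e.g., with TF-IDF) and rank the results by similarity to the query rather than returning an unranked set.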


SinkrOn ◽  
2021 ◽  
Vol 6 (1) ◽  
pp. 69-79
Author(s):  
Bita Parga Zen ◽  
Irwan Susanto ◽  
Dian Finaliamartha

Advances in information technology have made the internet part of everyday life for the general public. Online news sites are one of the technologies developed to disseminate the latest information in the world. In terms of numbers, newsreaders are well served in getting the information they want; however, the amount of information collected leads to an information explosion and possible information redundancy. A search system is one solution expected to help find the information relevant to an input query. The methods commonly used in this case are TF-IDF and the VSM (Vector Space Model), which weight terms statistically over a collection of documents. In this study, a search for information about the Covid-19 vaccine in kompas.com news was carried out: the text was tokenized, stopword removal (filtering) discarded unnecessary words, which usually consist of conjunctions and the like, and stemming reduced inflected words to their base form. The TF-IDF and VSM calculations were then carried out, and the final ranking is: news document 3 (DOC 3) with a weight of 5.914226424; news document 2 (DOC 2) with a weight of 1.767692186; news document 5 (DOC 5) with a weight of 1.550165096; news document 4 (DOC 4) with a weight of 1.17141223; and last, news document 1 (DOC 1) with a weight of 0.5244103739.
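The TF-IDF weighting behind such a ranking can be sketched in a few lines (the toy token lists below are made up; the real study uses five preprocessed kompas.com articles):

```python
import math

# Toy corpus of five token lists standing in for the preprocessed articles.
docs = {
    "DOC1": ["vaksin", "covid", "program"],
    "DOC2": ["vaksin", "covid", "vaksin", "distribusi"],
    "DOC3": ["vaksin", "vaksin", "vaksin", "covid", "efektivitas"],
    "DOC4": ["ekonomi", "pandemi", "covid"],
    "DOC5": ["vaksin", "harga", "pasar"],
}
query = ["vaksin", "covid"]

N = len(docs)
# Document frequency of each query term across the corpus.
df = {t: sum(t in toks for toks in docs.values()) for t in query}

def score(tokens):
    # VSM-style document weight: sum over query terms of tf * idf,
    # with idf = log10(N / df) as in the classic TF-IDF scheme.
    return sum(tokens.count(t) * math.log10(N / df[t]) for t in query)

ranking = sorted(docs, key=lambda d: score(docs[d]), reverse=True)
print(ranking[0])
```

Documents that repeat the query terms most often accumulate the highest summed weight, which is why one document can dominate the ranking the way DOC 3 does in the study's results.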

