ASPECT-BASED SENTIMENT ANALYSIS OF QUESTIONNAIRE DATA AT RUMAH SAKIT MUHAMMADIYAH LAMONGAN USING THE K-NN ALGORITHM

This study was motivated by the difficulty of organizing conventional, paper-based questionnaire data. A system was therefore built to group questionnaire data automatically, together with the sentiment each response contains. The dataset used is questionnaire data from Rumah Sakit Muhammadiyah Lamongan, and the study handles only responses in text form. Responses on paper were recapitulated and entered into a database, each labeled with a work-unit category and a sentiment. The dataset was then pre-processed with negation handling, case folding, tokenizing, filtering, and stemming. Questionnaire comments used as test data went through the same pre-processing, after which their similarity to the dataset documents was computed using the K-Nearest Neighbor method and the Vector Space Model. The amount of data handled affects system performance, particularly accuracy and speed during classification. The system outputs a ranking of the most similar documents in the dataset, ordered by cosine similarity. Classification tests by category class yielded an accuracy of 91%, tests by sentiment class yielded 94%, and for the combination of the two the system achieved an accuracy of 86%.
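The pipeline described in this abstract (pre-processing, a Vector Space Model representation, cosine-similarity ranking, and a K-Nearest Neighbor vote) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stopword list, the English training comments, and the function names `preprocess`, `cosine`, and `knn_classify` are all hypothetical, raw term counts are assumed as weights, and negation handling and stemming are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; the study's actual filtering list is not given.
STOPWORDS = {"the", "is", "was", "and", "very", "at"}

def preprocess(text):
    """Case folding, tokenizing, and stopword filtering (no stemming here)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(comment, training, k=3):
    """Rank training documents by similarity, then vote among the top k."""
    query = Counter(preprocess(comment))
    ranked = sorted(
        ((cosine(query, Counter(preprocess(text))), label)
         for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0], ranked

# Hypothetical labeled comments standing in for the questionnaire dataset.
TRAINING = [
    ("The nurse was friendly and helpful", "positive"),
    ("Friendly doctor and clean room", "positive"),
    ("The waiting time was very long", "negative"),
    ("Long queue and slow registration", "negative"),
    ("Slow service at the pharmacy", "negative"),
]

label, ranking = knn_classify("the nurse was friendly", TRAINING, k=3)
```

The same function could be called twice, once with work-unit labels and once with sentiment labels, to mirror the two classification tasks the abstract reports; that pairing is an assumption about how the two results might be combined.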

A Comprehensive Comparative Study Using Vector Space Model with K-Nearest Neighbor on Text Categorization Data

Asian Journal of Information Management ◽

10.3923/ajim.2008.14.22 ◽

2007 ◽

Vol 2 (1) ◽

pp. 14-22 ◽

Cited By ~ 1

Author(s):

Wa`el Musa Hadi ◽

Fadi Thabtah ◽

Salahideen Mousa ◽

Samer Al Hawari ◽

Ghassan Kanaan ◽

...

Keyword(s):

Comparative Study ◽

Vector Space ◽

Text Categorization ◽

Nearest Neighbor ◽

Vector Space Model ◽

K Nearest Neighbor ◽

Space Model

UTILIZATION OF THE VECTOR SPACE MODEL (PEMANFAATAN VECTOR SPACE MODEL)

10.31237/osf.io/bjrce ◽

2021 ◽

Author(s):

Sukisno Sukisno

Keyword(s):

Vector Space ◽

Nearest Neighbor ◽

Vector Space Model ◽

K Nearest Neighbor ◽

Space Model

The study in this book aims to help users categorize the documents they need quickly and accurately. An application for document categorization that applies the Nazief-Adriani stemming algorithm and the K-Nearest Neighbor algorithm is expected to make categorizing documents easier and to help users find documents based on the degree of similarity between test documents and learning documents.

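Thesis Title Analysis to Determine Student Specialization Using the Vector Space Model and the K-Nearest Neighbor Method (Analisa Judul Skripsi untuk Menentukan Peminatan Mahasiswa Menggunakan Vector Space Model dan Metode K-Nearest Neighbor)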

IT for Society ◽

10.33021/itfs.v4i2.1182 ◽

2020 ◽

Vol 4 (2) ◽

Author(s):

Dewi Marini Umi Atmaja ◽

Rila Mandala

Keyword(s):

Vector Space ◽

Nearest Neighbor ◽

Vector Space Model ◽

K Nearest Neighbor ◽

Space Model

The difficulty of classifying thesis titles according to the specialization chosen by informatics students at Unjani is one of the important problems faced by the Department. The aim of this study is to provide decision support for the Department so that every thesis title proposed by a student matches the student's specialization. Based on the results, the model built with the KNN algorithm achieved higher accuracy than the model built with the VSM algorithm. The highest accuracy obtained in testing was 96.85%.

Myanmar News Retrieval in Vector Space Model using Cosine Similarity Measure

2020 IEEE Conference on Computer Applications(ICCA) ◽

10.1109/icca49400.2020.9022845 ◽

2020 ◽

Author(s):

Hay Man Oo ◽

Win Pa Pa

Keyword(s):

Vector Space ◽

Similarity Measure ◽

Vector Space Model ◽

Cosine Similarity ◽

Space Model ◽

Cosine Similarity Measure ◽

News Retrieval

Automating case definitions using literature-based reasoning

Applied Clinical Informatics ◽

10.4338/aci-2013-04-ra-0028 ◽

2013 ◽

Vol 04 (04) ◽

pp. 515-527 ◽

Cited By ~ 4

Author(s):

R. Ball ◽

T. Botsis

Keyword(s):

Vector Space ◽

Semantic Network ◽

Vector Space Model ◽

Classification Performance ◽

Case Definition ◽

Cosine Similarity ◽

Case Definitions ◽

Space Model ◽

Research Activities ◽

Clinical Surveillance

Summary. Background: Establishing a Case Definition (CDef) is a first step in many epidemiological, clinical, surveillance, and research activities. The application of CDefs still relies on manual steps and this is a major source of inefficiency in surveillance and research. Objective: Describe the need and propose an approach for automating the useful representation of CDefs for medical conditions. Methods: We translated the existing Brighton Collaboration CDef for anaphylaxis by mostly relying on the identification of synonyms for the criteria of the CDef using the NLM MetaMap tool. We also generated a CDef for the same condition using all the related PubMed abstracts, processing them with a text mining tool, and further treating the synonyms with the above strategy. The co-occurrence of the anaphylaxis and any other medical term within the same sentence of the abstracts supported the construction of a large semantic network. The ‘islands’ algorithm reduced the network and revealed its densest region including the nodes that were used to represent the key criteria of the CDef. We evaluated the ability of the “translated” and the “generated” CDef to classify a set of 6034 H1N1 reports for anaphylaxis using two similarity approaches and comparing them with our previous semi-automated classification approach. Results: Overall classification performance across approaches to producing CDefs was similar, with the generated CDef and vector space model with cosine similarity having the highest accuracy (0.825±0.003) and the semi-automated approach and vector space model with cosine similarity having the highest recall (0.809±0.042). Precision was low for all approaches. Conclusion: The useful representation of CDefs is a complicated task but potentially offers substantial gains in efficiency to support safety and clinical surveillance.
Citation: Botsis T, Ball R. Automating case definitions using literature-based reasoning. Appl Clin Inf 2013; 4: 515–527. http://dx.doi.org/10.4338/ACI-2013-04-RA-0028

Information retrieval from heterogeneous data sets using moderated IDF-cosine similarity in vector space model

2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS) ◽

10.1109/icecds.2017.8390174 ◽

2017 ◽

Cited By ~ 1

Author(s):

Bhagyashree Pathak ◽

Niranjan Lal

Keyword(s):

Information Retrieval ◽

Vector Space ◽

Vector Space Model ◽

Heterogeneous Data ◽

Cosine Similarity ◽

Data Sets ◽

Space Model

Sentiment analysis of Chinese micro-blog using vector space model

Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2014 Asia-Pacific ◽

10.1109/apsipa.2014.7041745 ◽

2014 ◽

Author(s):

Zhi-Qiang Xiang ◽

Y. X. Zou ◽

Xin Wang

Keyword(s):

Vector Space ◽

Sentiment Analysis ◽

Vector Space Model ◽

Space Model

Measuring the Level of Plagiarism of Thesis using Vector Space Model and Cosine Similarity Methods

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/662/2/022111 ◽

2019 ◽

Vol 662 ◽

pp. 022111

Author(s):

I Indriyanto ◽

I D Sumitra

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Cosine Similarity ◽

Space Model

Sentiment Analysis about Large-Scale Social Restrictions in Social Media Twitter Using Algoritm K-Nearest Neighbor

Jurnal Online Informatika ◽

10.15575/join.v6i1.670 ◽

2021 ◽

Vol 6 (1) ◽

pp. 96

Author(s):

Ikhsan Romli ◽

Shanti Prameswari R ◽

Antika Zahrotul Kamalia

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Large Scale ◽

Nearest Neighbor ◽

Cosine Similarity ◽

Manhattan Distance ◽

K Nearest Neighbor ◽

Distance Calculation ◽

K Nearest Neighbor Algorithm ◽

Similarity Distance

Sentiment analysis is a data-processing task that identifies the topics people talk about and their sentiments toward those topics; the topic in this study is large-scale social restrictions (PSBB). The study aims to classify negative and positive sentiments by applying the K-Nearest Neighbor algorithm, and to compare the accuracy of three distance calculations (cosine similarity, Euclidean distance, and Manhattan distance) on Indonesian-language tweets about PSBB from Twitter. K-Nearest Neighbor achieved an accuracy of 82% with cosine similarity at k = 3, 81% with Euclidean distance at k = 11, and 80% with Manhattan distance at k = 5, 7, 9, 11, and 13. In this study, therefore, the K-Nearest Neighbor algorithm with the cosine similarity calculation achieved the highest accuracy.
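To make the comparison of distance calculations concrete, the following Python sketch runs the same K-Nearest Neighbor vote under cosine, Euclidean, and Manhattan distance. It is only an illustration under stated assumptions: the three-term count vectors, their labels, and the names `knn_predict`, `SAMPLES`, and `METRICS` are hypothetical and are not taken from the study's tweets or code.

```python
import math
from collections import Counter

def cosine_distance(a, b):
    """1 minus cosine similarity, so that smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b) if norm_a and norm_b else 1.0

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(query, samples, distance, k):
    """Majority label among the k samples closest to the query."""
    nearest = sorted(samples, key=lambda s: distance(query, s[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical term-count vectors over a three-term vocabulary.
SAMPLES = [
    ([2, 0, 1], "positive"),
    ([3, 1, 0], "positive"),
    ([0, 2, 1], "negative"),
    ([0, 3, 0], "negative"),
    ([1, 2, 2], "negative"),
]

METRICS = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
}

query = [2, 1, 0]
predictions = {
    name: knn_predict(query, SAMPLES, fn, k=3) for name, fn in METRICS.items()
}
```

On real data, accuracy would be measured per metric and per value of k on a held-out test set, which is how differences such as those reported above would surface; on this toy data all three metrics agree.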

DETERMINING MULTIPLE MEMBERSHIP OF DOCUMENTS (PENENTUAN MULTIPLE MEMBERSHIP DOKUMEN)

Majalah Ilmiah UNIKOM ◽

10.34010/miu.v15i2.560 ◽

2017 ◽

Vol 15 (2) ◽

Author(s):

Stephanie Betha R.H

Keyword(s):

Vector Space ◽

Vector Space Model ◽

Cosine Similarity ◽

Space Model ◽

Multiple Membership

Multiple membership refers to a person's membership in several communities. For documents, multiple membership means that a single document can contain content from several categories. A document's categories can be determined by measuring its similarity to the existing categories. The Vector Space Model is a model used to measure the similarity between a document and a query by representing each document in a collection as a point in a vector space. The result of this measurement is the cosine similarity between the document's query vector and a category vector. The problem is that, for a given document query vector, the cosine similarity values for one category vector and another may differ only slightly, which makes both category vectors mutually dominant in the document. A threshold value is therefore needed to determine when category vectors can be declared mutually dominant. This threshold is set using K-Means clustering, based on grouping the distances between a document's cosine similarity percentages. Multiple membership is determined from the title and keyword attributes of scientific publication documents.
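The idea of declaring several categories dominant when their cosine similarities lie close together can be sketched in Python as follows. This is a simplified illustration, not the paper's method: the abstract derives the threshold with K-Means clustering, whereas this sketch assumes a fixed, hypothetical threshold of 0.05, and the category vectors and the name `dominant_categories` are invented for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def dominant_categories(doc_vec, category_vecs, threshold=0.05):
    """Return every category whose score is within `threshold` of the best.

    The fixed threshold is an assumption; the paper sets it with K-Means.
    """
    scores = {name: cosine(doc_vec, vec) for name, vec in category_vecs.items()}
    best = max(scores.values())
    members = sorted(n for n, s in scores.items() if best - s <= threshold)
    return members, scores

# Hypothetical category vectors over a three-term vocabulary.
CATEGORIES = {
    "information_retrieval": [3, 1, 0],
    "machine_learning": [1, 3, 0],
    "networking": [0, 0, 4],
}

# A document balanced between two categories receives both memberships.
shared, shared_scores = dominant_categories([2, 2, 0], CATEGORIES)

# A document that clearly favors one category receives only that one.
single, single_scores = dominant_categories([4, 1, 0], CATEGORIES)
```

A fixed cut-off keeps the sketch short; a threshold learned from the distribution of score gaps, as the abstract describes, would adapt to the collection instead.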
