ANALISA TESTIMONIAL DENGAN MENGGUNAKAN ALGORITMA TEXT MINING DAN TERM FREQUENCY- INVERSE DOCUMENT FREQUENCE  (TF-IDF) PADA TOKO ALLMEEART

E-commerce, often referred to as online shopping, is the latest trend in how people shop. Before the rise of today's e-commerce companies, people still relied on distros (local shops) near where they lived, or went to a shopping center, to meet their needs; now they have switched to shopping online. The advantages offered by online shops are relatively low prices, no need to visit a physical store, and guarantees on goods, which has left retail shops increasingly quiet. Testimonials are one of the techniques used to convince customers to shop at an e-commerce store. A testimonial is a buyer's response to the experience of shopping in an e-commerce application, from the payment process until the goods are received. The more positive experiences the testimonials convey, the more convinced customers who have not yet shopped on that application will be. Testimonials on an e-commerce application are not always positive; at times buyers leave negative ones. The problem for customers is that no percentage or count of buyers with positive and negative shopping experiences is available, because testimonials are generally presented only as a list.

Keywords: Testimonial Analysis, Text Mining Algorithm, Term Frequency-Inverse Document Frequency (TF-IDF)

Analisis Sentimen Opini Pemindahan Ibu Kota Pada Twitter Dengan Metode Support Vector Machine

Jurnal Ilmu Komputer ◽

10.24843/jik.2021.v14.i01.p06 ◽

2021 ◽

Vol 14 (1) ◽

pp. 49

Author(s):

Tezza Fazar Tri Hidayat ◽

Garno Garno ◽

Azhari Ali Ridha

Keyword(s):

Support Vector Machine ◽

Text Mining ◽

Support Vector ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency

The relocation of Indonesia's capital to Kalimantan was made official by President Joko Widodo on 26 August 2019. It is unprecedented in Indonesian history and has therefore drawn many opinions and responses from the public. Sentiment analysis is used to analyze a person's opinion on a topic. Twitter is a social medium where users express their opinions and gather them around a topic. Support Vector Machine (SVM) is a text mining method for classification, and Term Frequency-Inverse Document Frequency (TF-IDF) is a term weighting method. SVM and TF-IDF can be used to analyze public sentiment on the relocation of Indonesia's capital. The aim of this study is to classify public opinion on the relocation of the capital from thousands of tweets that were collected and filtered. Tweets from 22-29 March 2020 were processed into 992 tweets, consisting of 221 labeled positive and 771 labeled negative. SVM alone achieved an accuracy of 77.72%; combining it with TF-IDF raised the accuracy to 78.33%.
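The TF-IDF weighting step named in this abstract can be illustrated with a minimal sketch. It assumes raw counts for term frequency and log(N / df) for inverse document frequency, which is only one of several common variants; the abstract does not say which one the study used. The function name `tfidf` and the sample tweets are hypothetical, and the classifier (SVM) that would consume these weights is left out.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: TF is the raw count, IDF is log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical, already tokenized tweets (not taken from the study's data).
tweets = [
    ["capital", "move", "good"],
    ["capital", "move", "bad", "bad"],
    ["capital", "cost", "bad"],
]
vectors = tfidf(tweets)
```

A term that occurs in every tweet ("capital" here) gets weight zero under this variant, while a term concentrated in few tweets is weighted up, which is what makes the weights useful as classifier input.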

Improve the Accuracy of Support Vector Machine Using Chi Square Statistic and Term Frequency Inverse Document Frequency on Movie Review Sentiment Analysis

Scientific Journal of Informatics ◽

10.15294/sji.v6i1.14244 ◽

2019 ◽

Vol 6 (1) ◽

pp. 138-149

Author(s):

Ukhti Ikhsani Larasati ◽

Much Aziz Muslim ◽

Riza Arifudin ◽

Alamsyah Alamsyah

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Text Mining ◽

Sentiment Analysis ◽

Feature Weighting ◽

Support Vector ◽

Chi Square ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency

Data processing can be done with text mining techniques. Processing large amounts of text requires a machine to explore opinions, both positive and negative. Sentiment analysis applies text mining methods to determine whether the text in a dataset is positive or negative. Support vector machine is one of the classification algorithms that can be used for sentiment analysis. However, support vector machine performs less well on large data. In addition, one constraint in text mining is the number of attributes used: a large number of attributes reduces classifier performance and leads to low accuracy. The purpose of this research is to increase support vector machine accuracy by applying feature selection and feature weighting. Feature selection removes a large number of irrelevant attributes. In this study, the top K = 500 features are selected. Feature weighting is then applied to calculate the weight of each selected attribute. The feature selection method is the chi-square statistic, and feature weighting uses Term Frequency Inverse Document Frequency (TFIDF). In experiments using Matlab R2017b with 10-fold cross-validation, integrating support vector machine with the chi-square statistic and TFIDF increased accuracy by 11.5 percentage points: support vector machine without the chi-square statistic and TFIDF reached an accuracy of 68.7%, and with them reached 80.2%.
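The chi-square feature selection described in this abstract can be sketched as follows. This is an illustration in Python, not the authors' Matlab code: it assumes binary labels, term presence or absence as the feature, and the standard 2x2 contingency form of the statistic. The function names, the toy reviews, and the value K = 2 (the study used K = 500) are all hypothetical.

```python
def chi_square_scores(docs, labels, target="pos"):
    """Score each term by the chi-square statistic of its 2x2 table
    (term present/absent versus document in target class or not)."""
    n = len(docs)
    n_target = sum(1 for y in labels if y == target)
    vocab = {term for doc in docs for term in doc}
    scores = {}
    for term in vocab:
        # a: target docs with the term, b: other docs with the term.
        a = sum(1 for doc, y in zip(docs, labels) if term in doc and y == target)
        b = sum(1 for doc, y in zip(docs, labels) if term in doc and y != target)
        c = n_target - a          # target docs without the term
        d = (n - n_target) - b    # other docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Hypothetical tokenized movie reviews and labels.
reviews = [["great", "film"], ["great", "acting"],
           ["boring", "film"], ["boring", "plot"]]
labels = ["pos", "pos", "neg", "neg"]
scores = chi_square_scores(reviews, labels)
selected = select_top_k(scores, k=2)
```

In this toy data "film" appears equally in both classes and scores zero, so it is dropped, while "great" and "boring" are kept; only the kept terms would then be weighted with TFIDF.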

Peningkatan Akurasi pada Prediksi Kepribadian Mbti Pengguna Twitter Menggunakan Augmentasi Data

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2020743622 ◽

2020 ◽

Vol 7 (4) ◽

pp. 815

Author(s):

Rizki Nurhaliza Harahap ◽

Kemas Muslim

Keyword(s):

Random Forest ◽

Text Mining ◽

Data Augmentation ◽

Myers Briggs Type Indicator ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency ◽

Type Indicator ◽

Augmentation Techniques ◽

Personality Prediction

An individual's personality needs to be known to help people weigh certain decisions, one of which is career recruitment. In general, personality can be assessed through interviews, observation, or questionnaire surveys. However, these conventional methods are considered impractical in terms of time and cost, because processing the data takes a long time and is fairly expensive. Conventional methods can also introduce bias because a third person is involved in processing the data. This research tries to provide a solution by building a model that predicts a person's personality based on analysis of data and information from the social media platform Twitter. The data and information are processed to obtain the personality prediction. The personality classification theory used is the Myers-Briggs Type Indicator (MBTI). The research also applies data augmentation techniques to improve the performance of text mining tasks that have a small dataset. The best results are obtained with the Random Forest method using Term Frequency-Inverse Document Frequency (TF-IDF) weighting and the features available on Twitter. Augmentation can increase accuracy by up to 30% from the initial accuracy, so the results show that data augmentation techniques can improve the performance of MBTI personality prediction models.

Sistem Rekomendasi Produk Pena Eksklusif Menggunakan Metode Content-Based Filtering dan TF-IDF

JOINTECS (Journal of Information Technology and Computer Science) ◽

10.31328/jointecs.v5i3.1563 ◽

2020 ◽

Vol 5 (3) ◽

pp. 229

Author(s):

Mariani Widia Putri ◽

Achmad Muchayan ◽

Made Kamisutara

Keyword(s):

Information Retrieval ◽

Customer Relationship Management ◽

Relationship Management ◽

Customer Relationship ◽

Brand Awareness ◽

Product Knowledge ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency ◽

Content Based Filtering

Recommendation systems are currently a trend. People now rely more on online transactions, for various personal reasons. A recommendation system offers an easier and faster way, so users do not need to spend much time finding the item they want. Competition between businesses has also changed, so they must change their approach to reach potential customers. A system that supports this is therefore needed. In this study, the authors build a product recommendation system using Content-Based Filtering and Term Frequency Inverse Document Frequency (TF-IDF) from the Information Retrieval (IR) model, to obtain results that are efficient and meet the needs of a solution for improving Customer Relationship Management (CRM). The recommendation system is built and deployed as a solution to increase customers' brand awareness and to minimize failed transactions caused by a lack of information that can be conveyed directly or offline. The data consist of 258 product codes, each with eight categories and 33 constituent keywords in line with the company's product knowledge. The TF-IDF calculation shows a weight of 13.854 for the top-ranked product recommendation, and the system is 96.5% accurate in recommending pens.
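Content-based filtering over TF-IDF keyword vectors can be sketched as below. The abstract does not state how the system compares products, so cosine similarity is an assumption here, as are the log(N / df) IDF variant, the function names, and the four-item catalog, which is invented for illustration and unrelated to the study's 258 product codes.

```python
import math
from collections import Counter

def tfidf_vectors(items):
    """Map each item name to a {keyword: tf-idf weight} dict."""
    n = len(items)
    df = Counter(t for terms in items.values() for t in set(terms))
    return {
        name: {t: c * math.log(n / df[t]) for t, c in Counter(terms).items()}
        for name, terms in items.items()
    }

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(vectors, liked, top_n=2):
    """Rank the other items by similarity to the item the user liked."""
    scores = {name: cosine(vectors[liked], vec)
              for name, vec in vectors.items() if name != liked}
    return sorted(scores, key=lambda name: (-scores[name], name))[:top_n]

# Hypothetical pen catalog: product code -> descriptive keywords.
pens = {
    "pen_a": ["fountain", "gold", "black"],
    "pen_b": ["fountain", "gold", "blue"],
    "pen_c": ["ballpoint", "steel", "black"],
    "pen_d": ["rollerball", "steel", "blue"],
}
vectors = tfidf_vectors(pens)
suggestions = recommend(vectors, "pen_a")
```

Here `pen_b` ranks first because it shares two keywords with `pen_a`, and `pen_c` second because it shares one; `pen_d` shares none and has similarity zero.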

Application of Customized Term Frequency-Inverse Document Frequency for Vietnamese Document Classification in Place of Lemmatization

Advances in Intelligent Systems and Computing - Intelligent Computing and Optimization ◽

10.1007/978-3-030-68154-8_37 ◽

2021 ◽

pp. 406-417

Author(s):

Do Viet Quan ◽

Phan Duy Hung

Keyword(s):

Document Classification ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency

Term Frequency by Inverse Document Frequency

10.1007/springerreference_65918 ◽

2011 ◽

Keyword(s):

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency

Hoax News Detection on Twitter using Term Frequency Inverse Document Frequency and Support Vector Machine Method

Journal of Physics Conference Series ◽

10.1088/1742-6596/1192/1/012025 ◽

2019 ◽

Vol 1192 ◽

pp. 012025

Author(s):

A Fauzi ◽

E B Setiawan ◽

Z K A Baizal

Keyword(s):

Support Vector Machine ◽

Support Vector ◽

Machine Method ◽

Inverse Document Frequency ◽

Support Vector Machine Method ◽

Term Frequency ◽

Document Frequency

Seleksi Fitur Bobot Kata dengan Metode TFIDF untuk Ringkasan Bahasa Indonesia

Jurnal Ilmiah Merpati (Menara Penelitian Akademika Teknologi Informasi) ◽

10.24843/jim.2018.v06.i02.p06 ◽

2018 ◽

pp. 119

Author(s):

Ni Komang Widyasanti ◽

I Ketut Gede Darma Putra ◽

Ni Kadek Dwi Rusjayanthi

Keyword(s):

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency ◽

Bahasa Indonesia

The spread of information in digital text form has become increasingly unstoppable over time. The need to read information has also never declined: research conducted by okezone.com in five major Indonesian cities throughout 2015 put online news consumption at 96%. One way to make finding relevant information easier and faster is to summarize the content. TFIDF (Term Frequency Inverse Document Frequency) is a weighting method that integrates term frequency with inverse document frequency. In this study, TFIDF is used to select features for the summary, applied as word-weight feature selection. Reader satisfaction was 61.94%. The average summarization time was 68.25 seconds, with an average of 31.875 sentences and 387.375 words. The study used fiction and non-fiction documents and performed feature selection in each paragraph, which distinguishes it from earlier related work.

Keywords: Automatic Text Summarization, TFIDF Weighting, Bahasa Indonesia
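One simple way TFIDF word weights can drive an extractive summary is sketched below. It is not the procedure from the study, whose details the abstract does not give: the sketch assumes each sentence is treated as a document, scores a sentence by the sum of its word weights, and keeps the highest-scoring sentences in their original order. The function name `summarize` and the sample sentences are hypothetical.

```python
import math
from collections import Counter

def summarize(sentences, n_keep=1):
    """Keep the n_keep sentences with the highest summed TF-IDF weight."""
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency, with each sentence counted as one document.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append(sum(c * math.log(n / df[t]) for t, c in tf.items()))
    # Pick the best-scoring sentence indices, then restore document order.
    best = sorted(range(n), key=lambda i: (-scores[i], i))[:n_keep]
    return [sentences[i] for i in sorted(best)]

# Hypothetical input text, already split into sentences.
text = [
    "the storm hit the coast",
    "the storm flooded roads and closed schools",
    "the coast is calm",
]
summary = summarize(text, n_keep=2)
```

Summing weights favors longer sentences; dividing each score by sentence length is a common alternative when that bias is unwanted.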
