Webpage Recommendation System Using Interesting Subgraphs and Laplace Based k-Nearest Neighbor

Author(s):  
N. Jayalakshmi ◽  
P. Padmaja ◽  
G. Jaya Suma

Graph-Based Data Mining (GBDM) is an active research area that lets users mine significant information in the form of frequent subgraphs. One well-known algorithm for extracting frequent patterns is the GASTON algorithm. Retrieving interesting webpages from log files is valuable in various applications. This work proposes a webpage recommendation system that introduces the Chronological Cuckoo Search (Chronological-CS) algorithm and the Laplace correction based k-Nearest Neighbor (LKNN) to retrieve useful webpages from the interesting webpages. First, the W-Gaston algorithm extracts interesting subgraphs from the log files and passes them to the proposed webpage recommendation system. The interesting subgraphs are then clustered with the proposed Chronological-CS algorithm, which integrates the chronological concept into the Cuckoo Search (CS) algorithm, yielding various cluster groups. Next, the proposed LKNN algorithm recommends webpages from the clusters. The proposed algorithm is simulated using data from the MSNBC and weblog databases. The results are compared with various existing webpage recommendation models and analyzed in terms of precision, recall, and F-measure. The proposed model outperformed the existing models, achieving values of 0.9194, 0.8947, and 0.86736 for precision, recall, and F-measure, respectively.
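The abstract does not spell out how the Laplace correction enters the k-Nearest Neighbor step. One common reading, assumed here, is add-one (Laplace) smoothing of the neighbor vote counts, so that a page with no votes among the k neighbors still receives a small nonzero score. The sketch below is illustrative only and is not the authors' LKNN; the function name `laplace_knn_scores`, the `alpha` parameter, and the toy feature vectors are all hypothetical.

```python
import math
from collections import Counter

def laplace_knn_scores(query, samples, labels, k, alpha=1.0):
    """Return Laplace-smoothed vote shares per label among the k nearest samples.

    samples: list of (feature_vector, label) pairs
    labels:  every candidate label, including ones that may get zero votes
    alpha:   smoothing constant (1.0 = add-one smoothing); an assumption
    """
    # Rank samples by Euclidean distance to the query and keep the k closest.
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    votes = Counter(label for _, label in nearest)
    # Smoothed share: (votes + alpha) / (neighbors + alpha * number_of_labels).
    total = len(nearest) + alpha * len(labels)
    return {label: (votes[label] + alpha) / total for label in labels}

# Hypothetical toy data: 2-D feature vectors tagged with page labels.
toy_samples = [
    ([0.0, 0.0], "page_a"),
    ([0.1, 0.0], "page_a"),
    ([0.0, 0.2], "page_b"),
    ([5.0, 5.0], "page_c"),
]
scores = laplace_knn_scores(
    [0.0, 0.1], toy_samples, ["page_a", "page_b", "page_c"], k=3
)
recommended = max(scores, key=scores.get)
```

With smoothing, `page_c` keeps a nonzero score even though none of the three nearest neighbors carries that label, which is the usual motivation for a Laplace correction.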

2020 ◽  
Vol 8 (4) ◽  
pp. 367
Author(s):  
Muhammad Arief Budiman ◽  
Gst. Ayu Vida Mastrika Giri

The music industry is growing rapidly, and music artists continue to release millions of works. Technology has followed these developments, for example in mobile phone applications that offer music subscription services, such as Spotify, Joox, GrooveShark, and others. Users increasingly demand application-based services for streaming music, whether free or paid. This paper proposes a music recommendation system that recommends songs based on the similarity of artists that the user likes or has listened to. The research uses the Collaborative Filtering method with Cosine Similarity and the K-Nearest Neighbor algorithm. The result is a system that can recommend songs based on artists who are related to one another.
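As a rough illustration of combining Cosine Similarity with K-Nearest Neighbor for artist-based recommendation, the sketch below ranks artists by the cosine similarity of their listening vectors. The abstract gives no implementation details, so the data layout (one play-count vector per artist, one entry per user), the function names, and the toy numbers are assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)

def nearest_artists(target, vectors, k):
    """Return the k artists most similar to `target`, best match first."""
    scored = [
        (name, cosine_similarity(vectors[target], vec))
        for name, vec in vectors.items()
        if name != target
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical play counts: each position is one user.
artist_vectors = {
    "artist_a": [5, 3, 0, 1],
    "artist_b": [4, 3, 0, 1],
    "artist_c": [0, 0, 5, 4],
}
neighbors = nearest_artists("artist_a", artist_vectors, k=2)
```

Songs by the top-ranked neighbors would then be candidates to recommend to a user who likes `artist_a`.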


2020 ◽  
Vol 9 (2) ◽  
pp. 267
Author(s):  
I Gede Teguh Mahardika ◽  
I Wayan Supriana

Culinary businesses are among the most popular today. The many considerations involved in choosing a restaurant make it difficult to decide where to eat. A recommendation system can help users find a suitable place to eat, and its decisions can serve as a reference when choosing a restaurant. One method for building a recommendation system is Case Based Reasoning (CBR), which mimics the human ability to solve a problem or case. Retrieval is the most important stage, because this is where the search for a solution to a new case is carried out. The study used the K-Nearest Neighbor method to measure the closeness between new cases and the case base. With the selection of features used as domains in the system, the recommendations presented can be more suggestive and accurate. The system successfully provides complex recommendations based on the type and kind of food entered by the user. Based on blackbox testing, the system's features are usable and function properly in line with the purpose for which the system was created.


Author(s):  
Mohamed Loey ◽  
Mukdad Rasheed Naman ◽  
Hala Helmy Zayed

Blood disease detection and diagnosis using blood cell images is an interesting and active research area in both the computer and medical fields. Many techniques have been developed to examine blood samples for leukemia; they fall into traditional techniques and deep learning (DL) techniques. This article surveys the traditional techniques and DL approaches that have been employed in blood disease diagnosis based on blood cell images, and compares the two approaches in terms of quality of assessment, accuracy, cost, and speed. It covers 19 studies: 11 on traditional techniques, which used image processing and machine learning (ML) algorithms such as K-means, K-nearest neighbor (KNN), Naïve Bayes, and Support Vector Machine (SVM), and 8 on advanced techniques, which used DL, particularly Convolutional Neural Networks (CNNs), the most widely used approach in blood image disease detection because it is highly accurate, fast, and has the lowest cost. In addition, the article analyzes a number of recent works in the field, including the size of the dataset, the methodologies used, and the results obtained. Finally, based on the study conducted, it concludes that CNNs have achieved great success in the field in feature extraction, classification, time, and accuracy, and at a lower cost in the detection of leukemia.


2005 ◽  
Vol 02 (02) ◽  
pp. 167-180
Author(s):  
SEUNG-JOON OH ◽  
JAE-YEARN KIM

Clustering of sequences is relatively less explored but it is becoming increasingly important in data mining applications such as web usage mining and bioinformatics. The web user segmentation problem uses web access log files to partition a set of users into clusters such that users within one cluster are more similar to one another than to the users in other clusters. Similarly, grouping protein sequences that share a similar structure can help to identify sequences with similar functions. However, few clustering algorithms consider sequentiality. In this paper, we study how to cluster sequence datasets. Due to the high computational complexity of hierarchical clustering algorithms for clustering large datasets, a new clustering method is required. Therefore, we propose a new scalable clustering method using sampling and a k-nearest-neighbor method. Using a splice dataset and a synthetic dataset, we show that the quality of clusters generated by our proposed approach is better than that of clusters produced by traditional algorithms.


2019 ◽  
Vol 2 (1) ◽  
pp. 41-48
Author(s):  
Rimbun Siringoringo ◽  
Jamaludin Jamaludin

The growth of social media and e-commerce has changed the way people interact and convey views, opinions, and moods. Product reviews are one form in which consumers express opinions and sentiment about a product online. Product reviews now play a very important role in influencing consumer interest in a product. Sentiment analysis is a widely used approach for extracting information and mining opinions related to product reviews. Sentiment analysis faces several challenges: first, the sentiment produced by prediction models often differs from the actual sentiment; second, the way consumers express sentiment and mood always differs from one situation to the next. This study performs sentiment analysis on reviews of Denim-brand Trendy Shoes products. The sentiment analysis stages consist of data collection, preprocessing, data transformation, feature selection, and classification using a Support Vector Machine. Preprocessing applies the text mining steps of case folding, non-alphanumeric removal, stop-word removal, and stemming. The results are measured using Accuracy, G-Mean, and F-Measure. Tests on three types of sentiment data show that the Support Vector Machine classifies sentiment well. The performance of the Support Vector Machine was compared with the K-Nearest Neighbor method. Sentiment classification with the Support Vector Machine outperformed K-Nearest Neighbor.
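The four preprocessing steps named in the abstract (case folding, non-alphanumeric removal, stop-word removal, and stemming) can be sketched as a small pipeline. This is a minimal illustration, not the authors' code: the stop-word list is a tiny hypothetical sample, and `naive_stem` is a placeholder that strips a few common Indonesian suffixes rather than a real stemmer.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}

def naive_stem(token):
    """Placeholder stemmer: strips a few suffixes. Not linguistically complete."""
    for suffix in ("nya", "kan", "lah"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Apply case folding, non-alphanumeric removal, stop-word removal, stemming."""
    text = text.lower()                       # case folding
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # non-alphanumeric removal
    tokens = text.split()
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop-word removal
    return [naive_stem(t) for t in tokens]    # stemming (placeholder)

tokens = preprocess("Sepatunya BAGUS, dan nyaman!")
```

The resulting token list would then feed the data transformation and feature selection stages before classification.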


Author(s):  
Dicki Pajri ◽  
Yuyun Umaidah ◽  
Tesa Nur Padilah

Tokopedia is a popular e-commerce marketplace in Indonesia. Customers’ perceptions of Tokopedia expressed on Twitter can serve as an important source of information and can be processed into useful insights. Sentiment analysis using K-Nearest Neighbor based on Particle Swarm Optimization is one way to process these perceptions. The purpose of this study is to classify customers’ perceptions into positive, neutral, and negative classes. Testing is carried out with four different scenarios and k values, which are evaluated using a confusion matrix. The best evaluation result, 88.11%, was obtained with a 90:10 dataset split and k = 1. Particle Swarm Optimization was then used for feature selection, with 20 iterations and 10 particles. It produced the best evaluation results: 97.9% accuracy, 96.17% precision, 96.62% recall, and 96.39% f-measure.
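The evaluation measures mentioned above can be derived from a confusion matrix as follows. This is a minimal sketch using the standard definitions; the function names and the toy three-class matrix are hypothetical, and the abstract does not say how the per-class values were averaged.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def per_class_metrics(cm, cls):
    """Precision and recall for one class; rows are actual, columns predicted."""
    true_positive = cm[cls][cls]
    predicted_as_cls = sum(row[cls] for row in cm)  # column sum
    actually_cls = sum(cm[cls])                     # row sum
    precision = true_positive / predicted_as_cls if predicted_as_cls else 0.0
    recall = true_positive / actually_cls if actually_cls else 0.0
    return precision, recall

def accuracy(cm):
    """Share of all samples that lie on the diagonal of the matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

# Hypothetical counts for classes 0 = positive, 1 = neutral, 2 = negative.
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
toy_precision, toy_recall = per_class_metrics(toy_cm, 0)
toy_accuracy = accuracy(toy_cm)

# Consistency check against the figures reported in the abstract.
reported_f = f_measure(96.17, 96.62)
```

Plugging the reported precision (96.17%) and recall (96.62%) into `f_measure` gives roughly 96.39, which agrees with the reported f-measure.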

