Chameleon Clustering Algorithm with Semantic Analysis Algorithm for Efficient Web Usage Mining

2015 ◽  
Vol 10 (6) ◽  
pp. 580
Author(s):  
Anupama Prasanth ◽  
M. Hemalatha
Author(s):  
Guandong Xu

Web users today face information overload as the volume of online information and the number of users grow rapidly. Delivering the information that users actually need has therefore become a critical issue in Web-based information retrieval and data management. Web mining was proposed to address this difficulty by discovering the intrinsic relationships among Web data. Web usage mining, in particular, discovers Web usage patterns and uses that knowledge to construct interest-oriented user communities, which can in turn be used to present more personalized content to Web users, i.e. Web recommendation. Latent Semantic Analysis (LSA) is a family of approaches for revealing the inherent correlations in co-occurrence activity such as Web usage data. LSA can also capture hidden knowledge at the semantic level that traditional methods cannot reach. In this chapter, we build user communities of interest by combining Web usage mining with latent semantic analysis, and we present the application of these user communities to Web recommendation.
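As a rough illustration of how LSA can expose shared interests in usage data, the following sketch factorizes a small session-by-page visit matrix and compares sessions in the reduced space. Everything here is an assumption made for illustration: the `visits` matrix is invented, the truncated SVD is computed by power iteration with deflation to stay within the Python standard library, and the helper names (`truncated_svd`, `session_embeddings`, `cosine`) are hypothetical. It is not the chapter's own procedure.

```python
import math
import random


def truncated_svd(matrix, k, iterations=200, seed=0):
    """Top-k singular triplets (sigma, u, v) via power iteration with deflation."""
    rng = random.Random(seed)
    a = [[float(x) for x in row] for row in matrix]
    n_rows, n_cols = len(a), len(a[0])
    triplets = []
    for _ in range(k):
        # Positive random start; adequate for non-negative visit counts.
        v = [rng.random() for _ in range(n_cols)]
        for _ in range(iterations):
            u = [sum(a[r][c] * v[c] for c in range(n_cols)) for r in range(n_rows)]
            w = [sum(a[r][c] * u[r] for r in range(n_rows)) for c in range(n_cols)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        u = [sum(a[r][c] * v[c] for c in range(n_cols)) for r in range(n_rows)]
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            break
        u = [x / sigma for x in u]
        triplets.append((sigma, u, v))
        # Deflate: remove the component just found before looking for the next one.
        for r in range(n_rows):
            for c in range(n_cols):
                a[r][c] -= sigma * u[r] * v[c]
    return triplets


def session_embeddings(matrix, k):
    """Coordinates of each session (row) in the k-dimensional latent space."""
    triplets = truncated_svd(matrix, k)
    return [[sigma * u[s] for sigma, u, _ in triplets] for s in range(len(matrix))]


def cosine(x, y):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 or ny == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / (nx * ny)


# Hypothetical data: rows are sessions, columns are pages, values are visit counts.
visits = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
]
emb = session_embeddings(visits, k=2)
```

Sessions whose embeddings have high cosine similarity would be candidates for the same user community. A production system would more likely use a library SVD routine than this hand-rolled one.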


2011 ◽  
Vol 63-64 ◽  
pp. 863-867 ◽  
Author(s):  
Bin Li ◽  
Jin Yang ◽  
Cai Ming Liu ◽  
Jian Dong Zhang ◽  
Yan Zhang

Clustering analysis is an important method for studying Web users' browsing behavior and identifying potential customers in Web usage mining. Traditional user clustering algorithms are not very accurate. In this paper, we present two improved user clustering algorithms based on the associated matrix of users' hits recorded while they browse a website. From this matrix, an improved Hamming distance matrix is generated by defining either the minimum norm or the generalized relative Hamming distance between any two vectors. Clusters of similar users are then obtained by setting a threshold value. In the last step of the algorithm, the clustering results are confirmed by defining a Similar Index for the clustering and applying a sub-algorithm. Test examples show that the new algorithms are more accurate than the old one, and real log data shows that the improved algorithms are practical.
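The distance-then-threshold idea can be sketched as follows. This is only an approximation under stated assumptions: the abstract does not define the "generalized relative Hamming distance" or the Similar Index, so the code uses the plain relative Hamming distance (fraction of differing positions) on binary hit vectors and merges users by single-link grouping. The `hits` data and the function names are hypothetical.

```python
from itertools import combinations


def relative_hamming(u, v):
    """Fraction of positions at which two equal-length hit vectors differ."""
    if len(u) != len(v):
        raise ValueError("hit vectors must have the same length")
    return sum(a != b for a, b in zip(u, v)) / len(u)


def threshold_clusters(hits, threshold):
    """Group users linked by a pairwise distance <= threshold (single-link)."""
    users = list(hits)
    parent = {u: u for u in users}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in combinations(users, 2):
        if relative_hamming(hits[a], hits[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for u in users:
        groups.setdefault(find(u), []).append(u)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical user-by-page hit matrix: 1 means the user visited the page.
hits = {
    "u1": [1, 1, 0, 0, 1],
    "u2": [1, 1, 0, 0, 0],
    "u3": [0, 0, 1, 1, 0],
    "u4": [0, 0, 1, 1, 1],
}
clusters = threshold_clusters(hits, threshold=0.25)
```

With this data, u1 and u2 differ in one of five positions, as do u3 and u4, so a threshold of 0.25 yields two clusters. Raising the threshold merges more users, which is why the choice of threshold matters in this family of methods.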


2018 ◽  
Vol 8 (2) ◽  
pp. 141-153
Author(s):  
Sutrisno Heru Sukoco ◽  
Imas Sukaesih Sitanggang ◽  
Heru Sukoco

Measuring how employees use internet services can be carried out as part of performance appraisal. Web usage mining, applied to the internet access records stored on a proxy server, is one way to understand user behavior. This study aims to obtain an overview of how employees of Pusbindiklat Peneliti LIPI use internet services, to measure employee productivity levels based on the time spent accessing sites that do not support their work, and to map whether the categories of sites accessed support their job functions. The K-Means clustering algorithm is used to make user access patterns easier to understand. The data used are proxy server logs and employee behavior scores at Pusbindiklat Peneliti LIPI for the period August-December 2016. The results show that the pattern of internet use by employees of Pusbindiklat Peneliti LIPI does not yet fully support their job functions. About 83% of employees fall in the low level (0-4 hours per week) of access to sites that do not support their work. Based on these results, it can be concluded that internet use by employees of Pusbindiklat Peneliti LIPI does not significantly affect their productivity.
Keywords: clustering, K-Means, log proxy server, performance of employees, web usage mining
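To make the clustering step concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) on a single feature. The abstract does not state which features were clustered, so this assumes one hypothetical feature, weekly hours spent on non-work sites. The `hours` values, the choice of three clusters, and the function name `kmeans_1d` are all invented for illustration.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic initialisation: spread starting centroids over the sorted data.
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


# Hypothetical weekly hours of non-work browsing, one value per employee.
hours = [0.5, 1.0, 2.0, 3.5, 1.5, 9.0, 10.5, 20.0, 22.0]
centroids, labels = kmeans_1d(hours, k=3)
```

The resulting groups could then be read as low, medium, and high usage levels. Real proxy-log analysis would first need the hours to be aggregated per employee from the raw log entries.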


2011 ◽  
Vol 219-220 ◽  
pp. 887-891
Author(s):  
Jiang Zhong ◽  
Yi Feng Cheng ◽  
Shi Tao Deng

Web usage mining is widely used for Web recommendation, which customizes Web content to each user's preferences. Traditional Web usage mining techniques can only discover explicit usage patterns. To exploit user features and Web page attributes for more accurate recommendation, we propose a unified collaborative filtering model for Web recommendation that combines the latent and external features of users and Web pages through back-propagation neural networks. In the algorithm, we employ Probabilistic Latent Semantic Analysis (PLSA) to obtain the latent features. The main advantages of this technique over standard memory-based methods are higher accuracy, constant-time prediction, and an explicit and compact model representation. A preliminary experimental evaluation shows that substantial improvements in accuracy over existing methods can be obtained.
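The latent-feature step can be illustrated with a small PLSA fit by expectation-maximization. This sketch covers only the PLSA part and not the neural network that the abstract combines it with. It assumes the symmetric formulation P(s, p) = sum over z of P(z) P(s|z) P(p|z) on a session-by-page count matrix in which every session has at least one visit. The `counts` data, the two-topic setting, and the function names are hypothetical.

```python
import math
import random


def plsa(counts, n_topics, iterations=50, seed=0):
    """Fit PLSA by EM; returns (p_z, p_s_given_z, p_p_given_z, log_likelihoods)."""
    rng = random.Random(seed)
    n_s, n_p = len(counts), len(counts[0])

    def normalise(row):
        total = sum(row)
        return [x / total for x in row]

    p_z = [1.0 / n_topics] * n_topics
    p_s_z = [normalise([rng.random() + 0.1 for _ in range(n_s)]) for _ in range(n_topics)]
    p_p_z = [normalise([rng.random() + 0.1 for _ in range(n_p)]) for _ in range(n_topics)]
    history = []

    for _ in range(iterations):
        new_z = [0.0] * n_topics
        new_s = [[0.0] * n_s for _ in range(n_topics)]
        new_p = [[0.0] * n_p for _ in range(n_topics)]
        log_lik = 0.0
        for s in range(n_s):
            for p in range(n_p):
                n = counts[s][p]
                if n == 0:
                    continue
                joint = [p_z[z] * p_s_z[z][s] * p_p_z[z][p] for z in range(n_topics)]
                total = sum(joint)
                log_lik += n * math.log(total)
                for z in range(n_topics):
                    # E-step: posterior P(z | s, p), weighted by the observed count.
                    r = n * joint[z] / total
                    new_z[z] += r
                    new_s[z][s] += r
                    new_p[z][p] += r
        history.append(log_lik)
        # M-step: renormalise the expected counts into probabilities.
        grand = sum(new_z)
        p_s_z = [[x / new_z[z] for x in new_s[z]] for z in range(n_topics)]
        p_p_z = [[x / new_z[z] for x in new_p[z]] for z in range(n_topics)]
        p_z = [x / grand for x in new_z]
    return p_z, p_s_z, p_p_z, history


def session_features(p_z, p_s_z, s):
    """Latent feature vector P(z | s) for session s."""
    weights = [p_z[z] * p_s_z[z][s] for z in range(len(p_z))]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical session-by-page visit counts.
counts = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 6, 3],
    [0, 0, 3, 6],
]
p_z, p_s_z, p_p_z, history = plsa(counts, n_topics=2)
features = session_features(p_z, p_s_z, 0)
```

The vector returned by `session_features` is the kind of latent representation that could be concatenated with external user or page attributes before being passed to a downstream model.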



