Building User Communities of Interests by Using Latent Semantic Analysis

Advances in Social Networking and Online Communities - Collaborative Search and Communities of Interest ◽

10.4018/978-1-61520-841-8.ch004 ◽

2011 ◽

pp. 38-68

Author(s):

Guandong Xu

Keyword(s):

Latent Semantic Analysis ◽

Web Mining ◽

Semantic Analysis ◽

Information Overload ◽

Critical Issue ◽

Web Usage Mining ◽

Semantic Level ◽

Web Usage ◽

User Communities ◽

Usage Patterns

Web users today face information overload because of the rapid growth in both the amount of information and the number of users. As a result, providing Web users with exactly the information they need is becoming a critical issue in Web-based information retrieval and data management. To address these difficulties, Web mining was proposed as an efficient means of discovering the intrinsic relationships among Web data. In particular, Web usage mining discovers Web usage patterns and uses the discovered usage knowledge to construct interest-oriented user communities, which can in turn be used to present Web users with more personalized Web content, i.e., Web recommendation. Latent Semantic Analysis (LSA), for its part, is an approach used to reveal the inherent correlations residing in co-occurrence activities, such as Web usage data. Moreover, LSA can capture hidden knowledge at the semantic level that traditional methods cannot. In this chapter, we address building user communities of interest by combining Web usage mining and latent semantic analysis. We also present the application of user communities to Web recommendation.
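
As a rough illustration of the idea in this abstract, the sketch below applies a truncated SVD (the linear-algebra core of LSA) to a toy user-by-page visit matrix and groups users whose latent vectors point in similar directions. It is not the chapter's algorithm: the matrix `USAGE`, the function name `user_communities`, the rank `k`, the cosine threshold, and the greedy grouping rule are all assumptions made for this example.

```python
import numpy as np

# Hypothetical visit counts: rows are users, columns are pages.
USAGE = np.array([
    [3, 2, 1, 0, 0],
    [2, 3, 1, 0, 0],
    [3, 3, 2, 0, 0],
    [0, 0, 0, 2, 3],
    [0, 0, 0, 3, 2],
])

def user_communities(usage, k=2, threshold=0.8):
    """Group users whose rank-k latent vectors have cosine >= threshold."""
    # Truncated SVD: usage is approximated by U_k * diag(s_k) * Vt_k.
    u, s, _ = np.linalg.svd(np.asarray(usage, dtype=float), full_matrices=False)
    latent = u[:, :k] * s[:k]  # one k-dimensional row per user
    norms = np.linalg.norm(latent, axis=1, keepdims=True)
    unit = latent / np.where(norms == 0, 1.0, norms)  # guard all-zero rows
    cosine = unit @ unit.T

    communities, assigned = [], set()
    n_users = len(unit)
    for i in range(n_users):
        if i in assigned:
            continue
        # Greedy rule (an assumption): user i seeds a community and pulls in
        # every later, still unassigned user that is similar enough to it.
        members = [i] + [j for j in range(i + 1, n_users)
                         if j not in assigned and cosine[i, j] >= threshold]
        assigned.update(members)
        communities.append(members)
    return communities

print(user_communities(USAGE))
```

On this toy matrix the first three users share one set of pages and the last two share another, so the grouping separates them. Real usage data would call for a proper clustering step and a principled choice of `k`.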

Web usage mining based on probabilistic latent semantic analysis

Proceedings of the 2004 ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '04 ◽

10.1145/1014052.1014076 ◽

2004 ◽

Cited By ~ 67

Author(s):

Xin Jin ◽

Yanzan Zhou ◽

Bamshad Mobasher

Keyword(s):

Latent Semantic Analysis ◽

Semantic Analysis ◽

Web Usage Mining ◽

Probabilistic Latent Semantic Analysis ◽

Web Usage

Web Usage Mining and the Challenge of Big Data

Big Data ◽

10.4018/978-1-4666-9840-6.ch042 ◽

2016 ◽

pp. 899-928

Author(s):

Abubakr Gafar Abdalla ◽

Tarig Mohamed Ahmed ◽

Mohamed Elhassan Seliaman

Keyword(s):

Data Mining ◽

Pattern Discovery ◽

Web Usage Mining ◽

Data Mining Techniques ◽

Web Log ◽

Web Usage ◽

Web Logs ◽

Usage Patterns ◽

Rich Data ◽

The Web

The web is a rich, dynamic, and fast-growing source for data mining, providing great opportunities that are often not exploited. Web data represent a real challenge to traditional data mining techniques because of their huge volume and unstructured nature. Web logs contain information about the interactions between visitors and the website. Analyzing these logs provides insights into visitors' behavior, usage patterns, and trends. Web usage mining, also known as web log mining, is the process of applying data mining techniques to discover useful information hidden in web server logs. Web logs are primarily used by Web administrators to learn how much traffic they get and to detect broken links and other types of errors. Web usage mining extracts useful information that can benefit a number of application areas, such as web personalization, website restructuring, system performance improvement, and business intelligence. The Web usage mining process involves three main phases: pre-processing, pattern discovery, and pattern analysis. Various preprocessing techniques have been proposed to extract information from log files and group primitive data items into meaningful, lighter level abstractions that are suitable for mining, usually in the form of visitors' sessions. The major data mining techniques in web usage mining pattern discovery are clustering, association analysis, classification, and sequential pattern discovery. This chapter discusses the process of web usage mining, its procedure, methods, and pattern discovery techniques. The chapter also presents a practical example using real web log data.
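
The preprocessing phase described above, grouping raw log hits into visitor sessions, can be sketched as follows. This is a minimal illustration, not the chapter's procedure: the `(visitor, timestamp, url)` tuple layout, the function name `sessionize`, and the 30-minute inactivity timeout are assumptions chosen for the example.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity; an assumed heuristic

def sessionize(hits, timeout=SESSION_TIMEOUT):
    """Split (visitor, timestamp_seconds, url) hits into per-visitor sessions."""
    by_visitor = defaultdict(list)
    for visitor, ts, url in sorted(hits, key=lambda h: (h[0], h[1])):
        by_visitor[visitor].append((ts, url))

    sessions = []
    for visitor, visits in by_visitor.items():
        current = [visits[0][1]]
        for (prev_ts, _), (ts, url) in zip(visits, visits[1:]):
            if ts - prev_ts > timeout:
                # A long gap closes the current session and starts a new one.
                sessions.append((visitor, current))
                current = []
            current.append(url)
        sessions.append((visitor, current))
    return sessions

# Hypothetical hits: visitor id, seconds since some reference time, URL.
HITS = [
    ("10.0.0.1", 0, "/"),
    ("10.0.0.1", 60, "/courses"),
    ("10.0.0.2", 30, "/"),
    ("10.0.0.1", 4000, "/login"),
]

for visitor, pages in sessionize(HITS):
    print(visitor, pages)
```

The resulting sessions are the unit on which the pattern discovery techniques listed in the abstract (clustering, association analysis, and so on) would then operate.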

Methodologies and Techniques of Web Usage Mining

Advances in Data Mining and Database Management - Web Usage Mining Techniques and Applications Across Industries ◽

10.4018/978-1-5225-0613-3.ch011 ◽

2017 ◽

pp. 275-296

Author(s):

T. Venkat Narayana Rao ◽

D. Hiranmayi

Keyword(s):

Web Mining ◽

Pattern Analysis ◽

Pattern Discovery ◽

Secondary Data ◽

Web Usage Mining ◽

Sequential Patterns ◽

Useful Knowledge ◽

Web Usage ◽

Automatic Discovery ◽

Collection Data

Web usage mining attempts to discover useful knowledge from the secondary data obtained from users' interactions with the Web. It is the type of Web mining activity that involves the automatic discovery of what users are looking for on the Internet. This chapter explains the methodology of web usage mining in detail: data collection, data preprocessing, knowledge discovery, and pattern analysis. It describes the different Web usage mining techniques used for knowledge and pattern discovery: statistical analysis, sequential patterns, classification, association rule mining, clustering, and dependency modeling. Pattern analysis is needed to filter out uninteresting rules or patterns from the set found in the pattern discovery phase.

Analysis of Click Stream Patterns using Soft Biclustering Approaches

International Journal of Information Technologies and Systems Approach ◽

10.4018/jitsa.2011010104 ◽

2011 ◽

Vol 4 (1) ◽

pp. 53-66 ◽

Cited By ~ 1

Author(s):

P. K. Nizar Banu ◽

H. Inbarani

Keyword(s):

Machine Learning ◽

Data Mining ◽

Web Mining ◽

Web Usage Mining ◽

Web Personalization ◽

Partial Matching ◽

Web Usage ◽

Needed Information ◽

Highly Correlated ◽

Web Server Logs

As websites increase in complexity, locating needed information becomes a difficult task. Such difficulty is often related to the websites’ design but also to ineffective and inefficient navigation processes. Research in web mining addresses this problem by applying techniques from data mining and machine learning to web data and documents. In this study, the authors examine web usage mining, applying data mining techniques to web server logs. Web usage mining has gained much attention as a potential approach to fulfilling the requirements of web personalization. In this paper, the authors propose K-means biclustering, rough biclustering, and fuzzy biclustering approaches to disclose the duality between users and pages by grouping them in both dimensions simultaneously. The simultaneous clustering of users and pages discovers biclusters that correspond to groups of users that exhibit highly correlated ratings on groups of pages. The results indicate that the fuzzy C-means biclustering algorithm performs best and is able to detect partial matching of preferences.

Extracting Knowledge from Web Data

Journal of Information Technology Research ◽

10.4018/jitr.2014100103 ◽

2014 ◽

Vol 7 (4) ◽

pp. 27-41

Author(s):

Hanane Ezzikouri ◽

Mohamed Fakir ◽

Cherki Daoui ◽

Mohamed Erritali

Keyword(s):

Web Mining ◽

User Behavior ◽

Data Extraction ◽

Research Area ◽

Web Usage Mining ◽

Web Data ◽

Main Research ◽

Web Log ◽

Web Usage ◽

Log File

User behavior on a website triggers a sequence of queries whose result is the display of certain pages. Information about these queries (including the names of the resources requested and the responses from the Web server) is stored in a text file called a log file. Analysis of the server log file can provide significant and useful information. Web mining is the extraction of interesting and potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web. Web usage mining is a main research area in Web mining focused on learning about Web users and their interactions with Web sites. The aim of mining is to find users' access models automatically and quickly from the vast Web log file, such as frequent access paths, frequent access page groups, and user clustering. Through Web usage mining, information left behind by user access can be mined to provide a foundation for organizational decision making. The Web mining process, defined as the set of techniques designed to explore, process, and analyze large masses of consecutive information activities on the Internet, has three main steps: data preprocessing, extraction of usage patterns, and interpretation of results. This paper starts by presenting the different formats of web log files, then presents the different preprocessing methods that have been used, and finally presents a system for "Web Content and Usage Mining" for web data extraction and web site analysis using the data mining algorithms Apriori, FPGrowth, K-Means, KNN, and ID3.
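
To make the log-file description concrete, here is a minimal sketch that parses one line into the fields the abstract mentions (requested resource and server response). It assumes the widely used Common Log Format; the paper covers several formats and does not say this one is its focus. The regular expression, the function name `parse_line`, and the sample line are illustrative assumptions.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [time] "method path protocol" status size
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # The size field is "-" when no body was returned.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record

# Illustrative sample line, not taken from the paper's data.
SAMPLE = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
          '"GET /apache_pb.gif HTTP/1.0" 200 2326')

print(parse_line(SAMPLE))
```

Records parsed this way would feed the preprocessing step; lines that fail to match return `None` so that malformed entries can be counted or discarded explicitly.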

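Using a Graph-Based Method for Mining Frequent Sequential Access Patterns: A Case Study of the iGracias Website of Universitas Telkom [Penggunaan Metode berbasis Graph untuk Mining Frequent Sequential Access Pattern Pada Studi Kasus : Website iGracias Universitas Telkom]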

Indonesian Journal on Computing (Indo-JC) ◽

10.21108/indojc.2017.2.1.146 ◽

2017 ◽

Vol 2 (1) ◽

pp. 91 ◽

Cited By ~ 1

Author(s):

Rahmi Rohdiniyah ◽

Ibnu Asror ◽

Gede Agung Ary Wisudawan

Keyword(s):

Web Mining ◽

Web Usage Mining ◽

Access Pattern ◽

Web Usage ◽

Access Patterns

The use of websites in education, particularly at a university, aims to store the various information available within the university environment. For this reason, the structure needs to be improved in order to maintain the quality of the website. One technique that can be used is web usage mining. Web usage mining is a branch of web mining used to discover useful information or knowledge from users' navigation patterns on a website. This study uses a graph-based method for mining frequent sequential access patterns, with iGracias Universitas Telkom as the case study, because iGracias is used by all entities at Universitas Telkom. The method has the advantage of uncovering the behavior behind users' access patterns. Implementing this method yields the sequential access patterns of user groups.
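
One simple way to picture a graph-based view of sequential access is to treat pages as nodes and consecutive page visits as directed edges, then keep the edges that occur in enough sessions. The sketch below does only that; it is not the method used in the paper, and the function name `frequent_transitions`, the `min_support` value, and the page paths are hypothetical.

```python
from collections import Counter

def frequent_transitions(sessions, min_support=2):
    """Return directed page-to-page edges seen in at least min_support sessions."""
    edges = Counter()
    for pages in sessions:
        # Count each directed edge at most once per session.
        edges.update(set(zip(pages, pages[1:])))
    return {edge: count for edge, count in edges.items() if count >= min_support}

# Hypothetical sessions, each an ordered list of visited pages.
SESSIONS = [
    ["/home", "/grades", "/schedule"],
    ["/home", "/grades"],
    ["/home", "/schedule"],
]

print(frequent_transitions(SESSIONS))
```

Longer frequent sequences could be found by walking paths through the retained edges, though doing that correctly requires re-checking support for each candidate path.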

Statistical Methods for User Profiling in Web Usage Mining

Handbook of Research on Text and Web Mining Technologies ◽

10.4018/978-1-59904-990-8.ch022 ◽

2010 ◽

pp. 359-368 ◽

Cited By ~ 1

Author(s):

Marcello Pecoraro

Keyword(s):

Statistical Methods ◽

Web Mining ◽

Web Usage Mining ◽

User Profiling ◽

Web Usage ◽

Data Object ◽

Segmentation Methods ◽

Binary Segmentation ◽

Usage Analysis ◽

The Web

This chapter provides an overview of the use of statistical methods in support of Web Usage Mining. The first part describes the framework of Web Usage Mining as a branch of Web Mining devoted to the study of how a Website is used. Then the data (the object of the analysis) are detailed, together with the problems linked to pre-processing. Once the origin of the data and their treatment for a correct Web Usage analysis have been clarified, the focus shifts to the statistical techniques that can be applied to the analysis, with reference to binary segmentation methods. The latter discriminate by means of a response variable that determines the affiliation of users to a group by considering some characteristics detected on those same users.

Semantic Analysis for Data Preparation of Web Usage Mining

Innovations in Applied Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-540-24677-0_128 ◽

2004 ◽

pp. 1249-1258 ◽

Cited By ~ 1

Author(s):

Jason J. Jung ◽

Geun-Sik Jo

Keyword(s):

Semantic Analysis ◽

Web Usage Mining ◽

Data Preparation ◽

Web Usage

Latent Semantic Analysis for Text Mining and Beyond

Intelligent Multimedia Databases and Information Retrieval ◽

10.4018/978-1-61350-126-9.ch015 ◽

2013 ◽

pp. 253-280 ◽

Cited By ~ 2

Author(s):

Anne Kao ◽

Steve Poteet ◽

Jason Wu ◽

William Ferng ◽

Rod Tjoelker ◽

...

Keyword(s):

Information Retrieval ◽

Text Mining ◽

Latent Semantic Analysis ◽

Web Mining ◽

Semantic Analysis ◽

Search Space ◽

Latent Semantic Indexing ◽

Cross Language Information Retrieval ◽

Text Information ◽

Cross Language

Latent Semantic Analysis (LSA) or Latent Semantic Indexing (LSI), when applied to information retrieval, has been a major analysis approach in text mining. It is an extension of the vector space method in information retrieval, representing documents as numerical vectors but using a more sophisticated mathematical approach to characterize the essential features of the documents and reduce the number of features in the search space. This chapter summarizes several major approaches to this dimensionality reduction, each of which has strengths and weaknesses, and it describes recent breakthroughs and advances. It shows how the constructs and products of LSA applications can be made user-interpretable and reviews applications of LSA beyond information retrieval, in particular, to text information visualization. While the major application of LSA is for text mining, it is also highly applicable to cross-language information retrieval, Web mining, and analysis of text transcribed from speech and textual information in video.
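
The dimensionality reduction this abstract refers to is most commonly realized with a truncated singular value decomposition. The notation below is an assumption introduced for illustration, not taken from the chapter: $A$ is a term-by-document matrix, $a_j$ its $j$-th column, $q$ a query vector in term space, and $k$ the number of retained dimensions.

```latex
\begin{aligned}
A &= U \Sigma V^{\top}, \\
A_k &= U_k \Sigma_k V_k^{\top}
     = \operatorname*{arg\,min}_{\operatorname{rank}(B) \le k} \lVert A - B \rVert_F, \\
\hat{a}_j &= \Sigma_k^{-1} U_k^{\top} a_j, \qquad
\hat{q} = \Sigma_k^{-1} U_k^{\top} q .
\end{aligned}
```

Here $U_k$, $\Sigma_k$, and $V_k$ keep only the $k$ largest singular values and their vectors, $\hat{a}_j$ is the $k$-dimensional representation of document $j$ (row $j$ of $V_k$), and a query is folded into the same space as $\hat{q}$ so that it can be compared with documents, for example by cosine similarity.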

Performance Comparison of Data Mining Classifiers on Web Log Data

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9349 ◽

2020 ◽

Vol 17 (11) ◽

pp. 5113-5116

Author(s):

Varun Malik ◽

Vikas Rattan ◽

Jaiteg Singh ◽

Ruchi Mittal ◽

Urvashi Tandon

Keyword(s):

Web Mining ◽

Performance Comparison ◽

Web Usage Mining ◽

Classification Algorithms ◽

Web Content ◽

Web Usage ◽

Web Structure ◽

Web Structure Mining ◽

Content Mining ◽

The Web

Web usage mining is the branch of web mining that deals with mining of data over the web. Web mining can be categorized into web content mining, web structure mining, and web usage mining. In this paper, we summarize the web usage mining results obtained with WMOT (web mining optimized tool), a user tool based on WEKA that was used to apply various classification algorithms such as Naïve Bayes, KNN, SVM, and tree-based algorithms. We summarize the results of the classification algorithms on WMOT, compare them on the basis of classified instances, and identify the algorithms that give better instance accuracy.
