Adaptive and Optimization of Personalized Information Retrieval Model in Semantic Web

Recognizing the set of web pages a user has visited, in order to predict the next page, is a key challenge. This work therefore uses the web access log files stored on the server. The log files, which record user behavior, are mined to understand user interest patterns. Various applications can use such predictions of user behavior while serving the web. The proposed framework analyzes user usage, reinforces the content, and retrieves content in a semantic manner. For semantic information retrieval, the pages accessed by the user are preprocessed, and the web log data of that user is analyzed to identify the user profile. The retrieved information is then ranked by clustering the semantic content-based results. Finally, the ranked content is analyzed against the user profile to produce optimized search results based on the user classification.


A method of query expansion based on topic models and user profile for search in folksonomy

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210508 ◽

2021 ◽

pp. 1-11

Author(s):

Zhinan Gou ◽

Yan Li

Keyword(s):

Information Retrieval ◽

Query Expansion ◽

Information Overload ◽

Topic Model ◽

User Profile ◽

Expansion Method ◽

Collaborative Tagging ◽

Search Query ◽

Tagging System ◽

The Web

With the development of Web 2.0 communities, information retrieval based on collaborative tagging systems has been widely applied. However, users often issue brief queries of only one or two keywords, which leads to problems such as inaccurate query words, information overload, and information disorientation. Query expansion addresses this issue by reformulating each search query with additional words. After analyzing the limitations of existing query expansion methods in folksonomy, this paper proposes a novel query expansion method, based on user profile and topic model, for search in folksonomy. In detail, the topic model is first constructed by a variational autoencoder with Word2Vec. Then, query expansion is conducted using the user profile and the topic model. Finally, the proposed method is evaluated on a real dataset. Evaluation results show that the proposed method outperforms the baseline methods.
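
The idea of expanding a short query with terms that are both close to the query and weighted by the user profile can be sketched as follows. This is a minimal illustration, not the authors' method: it replaces the variational autoencoder topic model with plain cosine similarity over toy term vectors, and the names `expand_query`, `VECTORS`, `PROFILE`, and the mixing weight `alpha` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(query_terms, vectors, profile, k=2, alpha=0.7):
    """Return query_terms plus the k best-scoring candidate terms.

    Score = alpha * similarity to the query centroid
          + (1 - alpha) * the user's profile weight for the term.
    """
    known = [vectors[t] for t in query_terms if t in vectors]
    if not known:
        return list(query_terms)  # nothing to compare against
    dim = len(known[0])
    centroid = [sum(v[i] for v in known) / len(known) for i in range(dim)]
    scores = {}
    for term, vec in vectors.items():
        if term in query_terms:
            continue
        scores[term] = (alpha * cosine(centroid, vec)
                        + (1 - alpha) * profile.get(term, 0.0))
    # Sort by descending score, then alphabetically for a stable result.
    best = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return list(query_terms) + best


# Toy term vectors and a toy user profile (assumed data, for illustration only).
VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "programming": [0.8, 0.2, 0.1],
    "snake": [0.1, 0.9, 0.0],
    "reptile": [0.0, 0.95, 0.1],
    "tutorial": [0.7, 0.0, 0.3],
}
PROFILE = {"programming": 0.9, "tutorial": 0.5, "snake": 0.1}

print(expand_query(["python"], VECTORS, PROFILE, k=2))
```

With this toy data, the ambiguous query "python" is expanded toward programming-related terms because both the vectors and the profile favor them; a user whose profile weighted "snake" heavily could receive a different expansion.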


Application of Data Mining on Web Usage Data for Security: WebSecuDMiner

10.20944/preprints201909.0040.v1 ◽

2019 ◽

Author(s):

Muhammad Zia Aftab Khan ◽

Jihyun Park

Keyword(s):

Design Methodology ◽

Access Pattern ◽

User Research ◽

Web Log ◽

Web Access ◽

User Access ◽

Log File ◽

Access Patterns ◽

Web Access Pattern ◽

The Web

The purpose of this paper is to develop the WebSecuDMiner algorithm to discover unusual web access patterns by analysing the potential rules hidden in the web server log and user navigation history. Design/methodology/approach: WebSecuDMiner uses the equivalence class transformation (ECLAT) algorithm to extract user access patterns from the web log data, which are then used to identify user access behaviour patterns and detect unusual ones. Data extracted from the web server log and user browsing behaviour is exploited to retrieve the web access patterns produced by the same user. Findings: WebSecuDMiner is used to detect whether any unauthorized access has occurred and to take appropriate decisions regarding the review of the original rights of the suspicious user. Research limitations/implications: The present work uses a database extracted from the web server log file and user browsing behaviour. A page viewed by the user may not be recorded in the server log file, since it can be served from the browser's cache.
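
The ECLAT step mentioned above can be sketched as follows. This is a generic, minimal version of ECLAT (vertical transaction-id sets intersected depth-first), not the WebSecuDMiner implementation; the session data and the `min_support` threshold are assumptions made for illustration.

```python
from collections import defaultdict


def eclat(transactions, min_support):
    """Return {frozenset(items): support} for every frequent itemset."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs that are frequent together with prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # support of itemset + other
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


# Hypothetical sessions: each is the set of pages one user visited.
SESSIONS = [
    {"/home", "/login", "/account"},
    {"/home", "/login", "/account"},
    {"/home", "/products"},
    {"/home", "/login", "/admin"},
]

patterns = eclat(SESSIONS, min_support=2)
for pages, support in sorted(patterns.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(pages), support)
```

Pages such as "/admin" that appear in no frequent itemset are candidates for closer inspection; how such sessions are actually judged unusual is specific to the paper and is not reproduced here.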


User Models for Adaptive Information Retrieval on the Web

International Journal of Adaptive Resilient and Autonomic Systems ◽

10.4018/jaras.2012070101 ◽

2012 ◽

Vol 3 (3) ◽

pp. 1-19

Author(s):

Max Chevalier ◽

Christine Julien ◽

Chantal Soulé-Dupuy

Keyword(s):

Information Retrieval ◽

Search Engines ◽

User Profile ◽

User Model ◽

User Models ◽

Search Results ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Adaptive Information Retrieval ◽

The Web

Information search is carried out with specific tools called Information Retrieval Systems (IRS), also known as “search engines”. To provide more accurate results to users, most such systems offer personalization features. To do this, each system models a user in order to adapt the search results that will be displayed. In a multi-application context (e.g., when using several search engines for a single query), personalization techniques are limited because the user model (also called profile) is incomplete: it does not exploit actions and queries coming from other search engines. Sharing user models between several search engines is therefore a challenge that must be met to provide more efficient personalization techniques. A semantic architecture for user profile interoperability is proposed to reach this goal. This architecture is also important because it can be used in many other contexts to share various resource models, for instance a document model, between applications. It also ensures that every system can keep its own representation of each resource while providing a solution to easily share it.


They Know What You Will Do Next Click

Interdisciplinary Approaches to Digital Transformation and Innovation - Advances in E-Business Research ◽

10.4018/978-1-7998-1879-3.ch005 ◽

2020 ◽

pp. 100-122

Author(s):

Serra Çelik

Keyword(s):

Focus Group ◽

User Behavior ◽

Web Usage Mining ◽

Web Log ◽

Web Usage ◽

User Behaviors ◽

Log Files ◽

The Web

This chapter focuses on predicting web user behaviors. When web users enter a website, every move they make on that website is stored in web log files. Unlike focus groups or questionnaires, the log files reflect real user behavior, and access to actual user behavior is highly valuable to organizations. This chapter examines ways of extracting user patterns (user behavior) from the log files. In this context, the web usage mining process is explained and some web usage mining techniques are described.
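
A typical first step of web usage mining, turning raw log lines into per-user sessions, can be sketched as follows. The chapter does not specify a log format or a session rule, so this sketch assumes the Common Log Format, identifies users by host address, and splits sessions after 30 minutes of inactivity; the sample lines are invented.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Common Log Format: host ident user [time] "method path protocol" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Return (host, timestamp, path), or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), stamp, match.group("path")


def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group hits by host and split them into sessions on inactivity gaps."""
    by_host = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is not None:
            host, stamp, path = record
            by_host[host].append((stamp, path))

    sessions = []
    for host, hits in by_host.items():
        hits.sort()  # chronological order
        current = [hits[0][1]]
        for (prev_stamp, _), (stamp, path) in zip(hits, hits[1:]):
            if stamp - prev_stamp > timeout:
                sessions.append((host, current))
                current = []
            current.append(path)
        sessions.append((host, current))
    return sessions


# Invented sample lines, for illustration only.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2020:13:57:02 +0000] "GET /products.html HTTP/1.1" 200 1045',
    '10.0.0.1 - - [10/Oct/2020:15:10:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2020:13:56:00 +0000] "GET /about.html HTTP/1.1" 200 512',
]

for host, pages in sessionize(SAMPLE):
    print(host, pages)
```

The resulting page sequences are the usual input to pattern extraction techniques such as frequent itemset or sequence mining.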


Improving Webpage Access Predictions Based on Sequence Prediction and PageRank Algorithm

Interdisciplinary Journal of Information Knowledge and Management ◽

10.28945/4176 ◽

2019 ◽

Vol 14 ◽

pp. 027-044 ◽

Cited By ~ 1

Author(s):

Da Thon Nguyen ◽

Hanh T Tan ◽

Duy Hoang Pham

Keyword(s):

Web Mining ◽

User Behavior ◽

User Profile ◽

Experimental Results ◽

Prediction Algorithm ◽

Future Research ◽

Pagerank Algorithm ◽

Product Recommendation ◽

Redundant Data ◽

The Web

Aim/Purpose: In this article, we provide a better solution to webpage access prediction. In particular, our core approach is to increase accuracy and efficiency by reducing the sequence space through the integration of PageRank into CPT+. Background: The problem of predicting the next page on a website has become significant because of the continuous growth of the Internet in terms of the volume of content and the number of users. Webpage prediction is complex because multiple kinds of information must be considered, such as the webpage name, the contents of the webpage, the user profile, the time between webpage visits, differences among users, and the time spent on a page or on each part of the page. Webpage access prediction therefore draws substantial effort from the web mining research community in order to obtain valuable information and improve user experience. Methodology: CPT+ is a complex prediction algorithm that offers markedly more accurate predictions than other state-of-the-art models. Integrating the importance of each page on a website (i.e., its PageRank), with regard to its associations with other pages, into the CPT+ model can improve the performance of the existing model. Contribution: In this paper, we propose an approach that reduces the prediction space while improving accuracy by combining the CPT+ and PageRank algorithms. Experimental results on several real datasets indicate that the space is reduced by between 15% and 30%. As a result, the run-time is shorter and the prediction accuracy is improved. Researchers can therefore continue to use CPT+ to predict webpage access. Findings: Our experimental results indicate that the PageRank algorithm is a good solution for improving CPT+ prediction. Approximately 15% to 30% of redundant data is removed from the datasets while accuracy improves. Recommendations for Practitioners: The results of the article could be used in developing relevant applications such as webpage and product recommendation systems. Recommendation for Researchers: The paper provides a prediction model that integrates the CPT+ and PageRank algorithms to tackle the problem of complexity and accuracy. The model has been tested against several real datasets in order to show its performance. Impact on Society: Given an improved model for predicting webpage access that can be used in fields such as e-learning, product recommendation, link prediction, and user behavior prediction, society can enjoy a better experience and a more efficient environment while surfing the Web. Future Research: We intend to further improve the accuracy of webpage access prediction by combining CPT+ with other algorithms.
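
The page-importance component described above can be sketched with a standard power-iteration PageRank over the page-to-page transitions observed in access sequences. This covers only the PageRank side; CPT+ itself and the way the authors use the scores to reduce the sequence space are not reproduced. The helper `links_from_sequences`, the damping factor, and the sample sequences are assumptions.

```python
def links_from_sequences(sequences):
    """Build a transition graph: page -> set of pages visited right after it."""
    links = {}
    for seq in sequences:
        for src, dst in zip(seq, seq[1:]):
            if src != dst:
                links.setdefault(src, set()).add(dst)
    return links


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; returns {page: score} with scores summing to 1."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, ())
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical access sequences.
SEQUENCES = [["A", "B", "C"], ["A", "C"], ["B", "C"], ["C", "A"]]

ranks = pagerank(links_from_sequences(SEQUENCES))
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

One plausible use of such scores, stated here only as an assumption, is to drop low-ranked pages from training sequences before building the prediction model, which would shrink the sequence space.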


Indexing Techniques for Web Access Logs

Web Information Systems ◽

10.4018/978-1-59140-208-4.ch009 ◽

2004 ◽

pp. 305-334 ◽

Cited By ~ 2

Author(s):

Yannis Manolopoulos ◽

Mikolaj Morzy ◽

Tadeusz Morzy ◽

Alexandros Nanopoulos ◽

Marek Wojciechowski ◽

...

Keyword(s):

Log Data ◽

Web Log ◽

Indexing Techniques ◽

Web Access ◽

User Access ◽

Single User ◽

Efficient Processing ◽

Access Logs ◽

Web Access Logs ◽

The Web

Access histories of users visiting a web server are automatically recorded in web access logs. Conceptually, the web-log data can be regarded as a collection of clients’ access-sequences, where each sequence is a list of pages accessed by a single user in a single session. This chapter presents novel indexing techniques that support efficient processing of so-called pattern queries, which consist of finding all access sequences that contain a given subsequence. Pattern queries are a key element of advanced analyses of web-log data, especially those concerning typical navigation schemes. In this chapter, we discuss the particularities of efficiently processing user access-sequences with pattern queries, compared to the case of searching unordered sets. Extensive experimental results are given, which examine a variety of factors and illustrate the superiority of the proposed methods over indexing techniques for unordered data adapted to access sequences.
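
A pattern query as defined above can be sketched with a simple baseline. This is not one of the chapter's indexing techniques: it uses an inverted index over pages (the kind of unordered-set index the chapter compares against) to prune candidates, followed by an order check. It also assumes that the subsequence need not be contiguous; the function names and data are hypothetical.

```python
from collections import defaultdict


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the matched page.
    return all(page in remaining for page in pattern)


def build_index(sequences):
    """Inverted index: page -> ids of the sequences that contain it."""
    index = defaultdict(set)
    for sid, seq in enumerate(sequences):
        for page in seq:
            index[page].add(sid)
    return index


def pattern_query(pattern, sequences, index):
    """Return sorted ids of the sequences that contain pattern as a subsequence."""
    if not pattern:
        return list(range(len(sequences)))
    # Step 1: keep only sequences containing every page (order ignored).
    candidates = set.intersection(*(index.get(page, set()) for page in pattern))
    # Step 2: verify the order on the surviving candidates.
    return sorted(sid for sid in candidates
                  if is_subsequence(pattern, sequences[sid]))


# Hypothetical access sequences, one per session.
SEQUENCES = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
    ["A", "D"],
]

INDEX = build_index(SEQUENCES)
print(pattern_query(["A", "B"], SEQUENCES, INDEX))
```

Sequence 1 survives the unordered filter but fails the order check, which illustrates why indexes built for unordered sets leave extra verification work for ordered data.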


An Intelligent Web Caching System for Improving the Performance of a Web-Based Information Retrieval System

International Journal on Semantic Web and Information Systems ◽

10.4018/ijswis.2020100102 ◽

2020 ◽

Vol 16 (4) ◽

pp. 26-44

Author(s):

Sathiyamoorthi V. ◽

Suresh P. ◽

Jayapandian N. ◽

Kanmani P. ◽

Deva Priya M. ◽

...

Keyword(s):

Information Retrieval ◽

Web Caching ◽

Web Pages ◽

Data Sets ◽

Data Traffic ◽

Content Delivery Network ◽

Web Based ◽

Long Time ◽

User Access ◽

The Web

As the number of web users increases, the traffic they generate places a heavy load on the network, so that connecting to the web server takes a long time. The main reason is the distance between the clients making requests and the servers responding to those requests. Using a CDN (content delivery network) is one strategy for minimizing latency, but it incurs additional cost. Alternatively, web caching and preloading are the most viable approaches to this issue. This work therefore introduces a novel web caching strategy called the optimized popularity-aware modified least frequently used (PMLFU) policy for information retrieval, based on users' past access history and trend analysis. It enhances the proxy-driven web caching system by analyzing user access requests and caching the most popular web pages based on user preferences. Experimental results show that the proposed system can significantly reduce the user delay in accessing a web page. The performance of the proposed system is measured using IRCACHE data sets in real time.
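
The frequency-based eviction idea behind such a policy can be sketched with a plain least-frequently-used (LFU) cache. This is the textbook baseline, not the PMLFU policy itself, whose popularity and trend analysis is not described in the abstract; the class name and the recency tie-break are assumptions.

```python
from collections import OrderedDict


class LFUCache:
    """Plain LFU cache; ties are broken by evicting the least recently used."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # url -> [hit_count, content]; insertion order tracks recency.
        self._store = OrderedDict()

    def get(self, url):
        """Return cached content, or None on a miss."""
        entry = self._store.get(url)
        if entry is None:
            return None
        entry[0] += 1
        self._store.move_to_end(url)  # mark as most recently used
        return entry[1]

    def put(self, url, content):
        """Insert or update a page, evicting the least frequently used if full."""
        if url in self._store:
            entry = self._store[url]
            entry[0] += 1
            entry[1] = content
            self._store.move_to_end(url)
            return
        if len(self._store) >= self.capacity:
            # min() returns the first minimal key in iteration order, which is
            # the least recently used among the least frequently used pages.
            victim = min(self._store, key=lambda u: self._store[u][0])
            del self._store[victim]
        self._store[url] = [1, content]


cache = LFUCache(capacity=2)
cache.put("/a", "page A")
cache.put("/b", "page B")
cache.get("/a")            # "/a" now has two hits
cache.put("/c", "page C")  # evicts "/b", the least frequently used page
print(cache.get("/b"), cache.get("/a"), cache.get("/c"))
```

Eviction here scans all entries, which is acceptable for a sketch; a production proxy cache would normally keep frequency buckets so that eviction runs in constant time.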


Web Usage Mining: A Survey on Pattern Extraction from Web Logs

International Journal of Instrumentation Control and Automation ◽

10.47893/ijica.2011.1004 ◽

2011 ◽

pp. 15-23

Author(s):

S. K. Pani ◽

L. Panigrahy ◽

V.H. Sankar ◽

A.K. Manda ◽

S.K. Padhi ◽

...

Keyword(s):

Web Usage Mining ◽

Pattern Extraction ◽

User Behaviour ◽

Web Usage ◽

Log Files ◽

Web Logs ◽

Interesting Pattern ◽

Web Access ◽

The Web

As the size of the web and the number of its users grow, it is essential for website owners to better understand their customers so that they can provide better service and enhance the quality of their websites. To achieve this, they depend on web access log files, which can be mined to extract interesting patterns that reveal user behaviour. This paper presents an overview of web usage mining and provides a survey of the pattern extraction algorithms used for web usage mining.


Ontology-based effective information retrieval from the web using concept aware user profile construction

International Journal of Enterprise Network Management ◽

10.1504/ijenm.2018.10015852 ◽

2018 ◽

Vol 9 (3/4) ◽

pp. 376

Author(s):

M. Mohamed Iqbal ◽

P. Senthil Kumar ◽

J. Abdul Samath

Keyword(s):

Information Retrieval ◽

User Profile ◽

The Web


Online Analytical Mining for Web Access Patterns

Advances in Database Research - Advanced Topics in Database Research, Volume 3 ◽

10.4018/978-1-59140-255-8.ch015 ◽

2011 ◽

pp. 294-326

Author(s):

Joseph Fong ◽

Hing K. Wong ◽

Anthony Fong

Keyword(s):

Data Warehouse ◽

Information Services ◽

Web Pages ◽

Distributed Information ◽

Web Page Design ◽

Web Access ◽

User Access ◽

Highly Correlated ◽

Access Patterns ◽

The Web

The WWW and its associated distributed information services provide rich world-wide online information services, where objects are linked together to facilitate interactive access. Users seeking information on the Internet traverse from one object to another via links. It is important to analyze user access patterns: doing so helps improve web page design by providing efficient access between highly correlated objects, and also supports better marketing decisions by placing advertisements in frequently visited documents. User surfing behavior needs to be studied by examining the web access log, the browsing frequency of web pages, and the average duration of visits. This chapter offers an architecture to store the derived web user access paths in a data warehouse, and facilitates its view maintainability by use of metadata. The system updates the user access path patterns in the data warehouse through the data operation functions in the metadata. Whenever a new user access path occurs, view maintenance is triggered by a constraint class in the metadata. The data warehouse can be analyzed on the frequent pattern tree of user access paths on the web site within a given period and duration. The result is an online analytical mining path traversal pattern. Performance studies have been done to demonstrate the effectiveness and efficiency of the system with the following contributions: an architecture of online analytical mining using frame model metadata, a methodology of implementing the online analytical mining, and the resultant cluster of web pages frequently visited by users for marketing use.
