Building Concept Network-Based User Profile for Personalized Web Search

With advances in ICT, web search engines (WSEs) have become a preferred source for finding health-related information published on the Internet. Google alone receives more than one billion health-related queries daily. To provide the results most relevant to the user, WSEs maintain user profiles. These profiles may contain private and sensitive information such as the user’s health condition and disease status. Health-related queries therefore carry privacy-sensitive information, and this raises serious concerns because the identity of the user is exposed and may be misused by the WSE and third parties. One well-known solution to preserve privacy involves issuing the queries via a peer-to-peer private information retrieval protocol, such as useless user profile (UUP), thereby hiding the user’s identity from the WSE. This paper investigates the level of protection offered by UUP. For this purpose, we present the QuPiD (query profile distance) attack: a machine learning-based attack that evaluates the effectiveness of UUP in privacy protection. The QuPiD attack determines the distance between the user’s profile (web search history) and an upcoming query using our proposed novel feature vector. For the sake of comparison, the experiments were conducted using ten classification algorithms belonging to the tree-based, rule-based, lazy learner, metaheuristic, and Bayesian families. Furthermore, two subsets of an America Online dataset (one noisy and one clean) were used for experimentation. The results show that the proposed QuPiD attack associates more than 70% of queries to the correct user with a precision of over 72% for the clean dataset, while for the noisy dataset it associates more than 40% of queries to the correct user with 70% precision.
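The idea of measuring the distance between a search-history profile and an incoming query can be illustrated with a minimal sketch. It is not the QuPiD feature vector or any of the ten classifiers from the abstract: it assumes a plain bag-of-words profile and cosine similarity as a stand-in, and the function names and toy queries are hypothetical.

```python
import math
from collections import Counter


def term_vector(texts):
    """Build bag-of-words term counts from a list of query strings."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_profile(query, profiles):
    """Return the user whose history is most similar to the query, plus all scores."""
    query_vec = term_vector([query])
    scores = {
        user: cosine_similarity(query_vec, term_vector(history))
        for user, history in profiles.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy search histories (assumed data, for illustration only).
profiles = {
    "user_a": ["diabetes diet plan", "insulin dosage chart", "low sugar recipes"],
    "user_b": ["cheap flights tokyo", "tokyo hotel deals", "japan rail pass"],
}
best, scores = closest_profile("insulin pump cost", profiles)
print(best, scores)
```

A query that shares vocabulary with one history is attributed to that user. A real attack would need richer features and a trained classifier, as the abstract describes.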


Web Personalization for Chinese Travel Websites

Key Engineering Materials ◽

10.4028/www.scientific.net/kem.474-476.1470 ◽

2011 ◽

Vol 474-476 ◽

pp. 1470-1474

Author(s):

Shih Ming Pi

Keyword(s):

Web Sites ◽

User Satisfaction ◽

Web Search ◽

User Profile ◽

Prototype System ◽

Web Personalization ◽

Travel Information ◽

Textual Data ◽

Single User ◽

Browsing Behavior

In this study, we propose a conceptual architecture for web personalization based on a subject taxonomy tree and click-through analysis in order to improve browsing efficiency and user satisfaction. To construct the user profile, a hierarchical subject taxonomy tree of travel information was built. This tree has five attributes, which represent the interests of a single user. Each user has a profile that is used to generate personal categories while searching. The system then adjusts each profile according to the user’s browsing behavior in order to learn that user’s interests. Textual data from Chinese travel web sites are used as experimental data, and a prototype system is implemented to evaluate the proposed architecture. The results show that personal classification improves browsing efficiency and user satisfaction in web search.
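A profile built on a taxonomy tree and adjusted by clicks might be sketched as follows. The abstract does not specify the five attributes or the update rule, so everything here is an assumption: each node carries a single hypothetical `weight`, and a click adds weight to the clicked category and a decayed share to each ancestor.

```python
class TaxonomyNode:
    """One category in a hypothetical subject taxonomy tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.weight = 0.0  # assumed interest score for this user

    def add_child(self, name):
        node = TaxonomyNode(name, parent=self)
        self.children[name] = node
        return node


def record_click(node, amount=1.0, decay=0.5):
    """Add interest to the clicked category and a decayed share to each ancestor."""
    while node is not None:
        node.weight += amount
        amount *= decay
        node = node.parent


def ranked_children(node):
    """Child categories ordered by learned interest, highest first."""
    return sorted(node.children.values(), key=lambda c: c.weight, reverse=True)


# Toy travel taxonomy (assumed categories, for illustration only).
root = TaxonomyNode("travel")
hotels = root.add_child("hotels")
transport = root.add_child("transport")
trains = transport.add_child("trains")

# Simulated click-through: three clicks on train pages, one on a hotel page.
for _ in range(3):
    record_click(trains)
record_click(hotels)

top = ranked_children(root)[0]
print(top.name, top.weight)
```

Propagating a decayed share upward lets interest in a leaf category raise its parent in the ranking, which is one simple way to produce personal categories from browsing behavior.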


Personalization of Web Search based on privacy protected and auto-constructed user profile

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI) ◽

10.1109/icacci.2015.7275711 ◽

2015 ◽

Author(s):

Rasika M. Kaingade ◽

Hemant A. Tirmare

Keyword(s):

Web Search ◽

User Profile


Are Topics Interesting or Not? An LDA-based Topic-graph Probabilistic Model for Web Search Personalization

ACM Transactions on Information Systems ◽

10.1145/3476106 ◽

2022 ◽

Vol 40 (3) ◽

pp. 1-24

Author(s):

Jiashu Zhao ◽

Jimmy Xiangji Huang ◽

Hongbo Deng ◽

Yi Chang ◽

Long Xia

Keyword(s):

Probabilistic Model ◽

Large Scale ◽

Web Search ◽

Latent Dirichlet Allocation ◽

State Of The Art ◽

User Profile ◽

New Approach ◽

Latent Topic ◽

Search History ◽

Search Logs

In this article, we propose a Latent Dirichlet Allocation (LDA)-based topic-graph probabilistic personalization model for Web search. This model represents a user graph in a latent topic graph and simultaneously estimates the probabilities that the user is interested in the topics, as well as the probabilities that the user is not interested in the topics. For a given query issued by the user, the webpages that are more relevant to the interesting topics are promoted, and the webpages that are more relevant to the non-interesting topics are penalized. In particular, we simulate a user’s search intent by building two profiles: a positive user profile for the probabilities that the user is interested in the topics and a corresponding negative user profile for the probabilities that the user is not interested in the topics. The profiles are estimated based on the user’s search logs. A clicked webpage is assumed to include interesting topics. A skipped (viewed but not clicked) webpage is assumed to cover some topics that are not interesting to the user. Such estimations are performed in the latent topic space generated by LDA. Moreover, a new approach is proposed to estimate the correlation between a given query and the user’s search history so as to determine how much personalization should be considered for the query. We compare our proposed models with several strong baselines, including state-of-the-art personalization approaches. Experiments conducted on a large-scale real user search log collection illustrate the effectiveness of the proposed models.
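The promote-and-penalize idea can be sketched in a few lines. This is not the article's probabilistic model: it assumes each page already has a topic distribution (as an LDA model would provide), averages clicked and skipped pages into the two profiles, and uses a fixed hypothetical weight `alpha` where the article estimates the degree of personalization per query.

```python
def mean_vector(vectors, size):
    """Element-wise mean of topic vectors (all zeros if there are none)."""
    if not vectors:
        return [0.0] * size
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(size)]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_profiles(log, size):
    """Split a search log into a positive (clicked) and a negative (skipped) profile."""
    clicked = [topics for topics, was_clicked in log if was_clicked]
    skipped = [topics for topics, was_clicked in log if not was_clicked]
    return mean_vector(clicked, size), mean_vector(skipped, size)


def rerank(results, positive, negative, alpha=1.0):
    """Promote pages near the positive profile and penalize pages near the negative one."""
    def score(item):
        _, base, topics = item
        return base + alpha * (dot(positive, topics) - dot(negative, topics))
    return sorted(results, key=score, reverse=True)


# Toy log over two assumed topics: (topic distribution, was the page clicked?).
log = [
    ([0.9, 0.1], True),
    ([0.8, 0.2], True),
    ([0.1, 0.9], False),
]
positive, negative = build_profiles(log, size=2)

# Candidate results: (page id, base relevance score, topic distribution).
results = [
    ("finance-page", 0.6, [0.1, 0.9]),
    ("sports-page", 0.5, [0.9, 0.1]),
]
reranked = rerank(results, positive, negative)
print([page for page, _, _ in reranked])
```

In this toy log the page with the lower base score moves to the top because it matches the clicked topics and avoids the skipped ones.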


User profile for personalized web search

2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD) ◽

10.1109/fskd.2011.6019913 ◽

2011 ◽

Cited By ~ 7

Author(s):

Chunyan Liang

Keyword(s):

Web Search ◽

User Profile
