search logs
Recently Published Documents

TOTAL DOCUMENTS: 80 (five years: 17)
H-INDEX: 14 (five years: 2)

2022 ◽  
Vol 40 (3) ◽  
pp. 1-24
Author(s):  
Jiashu Zhao ◽  
Jimmy Xiangji Huang ◽  
Hongbo Deng ◽  
Yi Chang ◽  
Long Xia

In this article, we propose a Latent Dirichlet Allocation (LDA)–based topic-graph probabilistic personalization model for Web search. This model represents a user in a latent topic graph and simultaneously estimates the probabilities that the user is interested in the topics, as well as the probabilities that the user is not interested in the topics. For a given query issued by the user, webpages more relevant to the interesting topics are promoted, and webpages more relevant to the non-interesting topics are penalized. In particular, we simulate a user’s search intent by building two profiles: a positive user profile for the probabilities that the user is interested in the topics, and a corresponding negative user profile for the probabilities that the user is not interested in them. The profiles are estimated from the user’s search logs: a clicked webpage is assumed to include interesting topics, while a skipped (viewed but not clicked) webpage is assumed to cover topics the user finds non-interesting. These estimations are performed in the latent topic space generated by LDA. Moreover, we propose a new approach to estimate the correlation between a given query and the user’s search history, so as to determine how much personalization should be applied to the query. We compare our proposed models with several strong baselines, including state-of-the-art personalization approaches. Experiments conducted on a large-scale collection of real user search logs illustrate the effectiveness of the proposed models.
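As a rough illustration of the promote/penalize idea described in this abstract (not the authors' implementation; all function and variable names are hypothetical), a re-ranking step combining a base relevance score with positive and negative topic profiles might look like this, where the topic vectors would come from LDA inference:

```python
# Hypothetical sketch: promote documents matching the positive (interested)
# profile and penalize documents matching the negative (not-interested) one.

def personalized_score(base_relevance, doc_topics, pos_profile, neg_profile, weight=0.5):
    """Combine query relevance with a topic-profile adjustment."""
    promote = sum(p * t for p, t in zip(pos_profile, doc_topics))
    penalize = sum(n * t for n, t in zip(neg_profile, doc_topics))
    return base_relevance + weight * (promote - penalize)

def rerank(results, pos_profile, neg_profile):
    """results: list of (doc_id, base_relevance, doc_topic_vector)."""
    scored = [(personalized_score(rel, topics, pos_profile, neg_profile), doc)
              for doc, rel, topics in results]
    return [doc for _, doc in sorted(scored, reverse=True)]
```

In this sketch the positive profile would be estimated from clicked pages and the negative profile from skipped ones, as the abstract describes.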


Author(s):  
Markus Fischer ◽  
Kristof Komlossy ◽  
Benno Stein ◽  
Martin Potthast ◽  
Matthias Hagen
Keyword(s):  

2021 ◽  
pp. 016555152198953
Author(s):  
Paul H Cleverley ◽  
Fionnuala Cousins ◽  
Simon Burnett

COVID-19 has created unprecedented organisational challenges, yet no study has examined its impact on information search. A case study in a knowledge-intensive organisation examined 2.5 million search queries made during the pandemic. A surge of unique users and COVID-19 search queries in March 2020 may equate to ‘peak uncertainty and activity’, demonstrating the importance of corporate search engines in times of crisis. Search volumes dropped 24% after lockdowns; an ‘L-shaped’ recovery may be a surrogate for business activity. COVID-19 search queries transitioned from awareness to impact, strategy, response, and ways of working, which may influence future search design. Low click-through rates imply that some information needs were not met, and searches on mental health increased. In extreme situations (i.e. a pandemic), companies may need to move faster, monitoring and exploiting their enterprise search logs in real time, as these reflect uncertainty and anxiety within the enterprise.
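The click-through-rate signal mentioned in this abstract is straightforward to compute from raw query logs; the following is a minimal sketch (the event format and names are hypothetical, not the study's actual pipeline):

```python
from collections import defaultdict

# Hypothetical sketch: per-query click-through rate from a search log where
# each event records a query string and whether any result was clicked.

def click_through_rates(events):
    """events: iterable of (query, clicked) pairs; returns query -> CTR."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for query, clicked in events:
        impressions[query] += 1
        clicks[query] += int(clicked)
    return {q: clicks[q] / impressions[q] for q in impressions}
```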


2021 ◽  
Vol 36 (1) ◽  
pp. WI2-C_1-10
Author(s):  
Yusei Nakata ◽  
Naoki Muramoto ◽  
Takehiro Yamamoto ◽  
Sumio Fujita ◽  
Hiroaki Ohshima

2021 ◽  
Vol 48 (3) ◽  
pp. 219-230
Author(s):  
Mingfang Wu ◽  
Ying-Hsang Liu ◽  
Rowan Brownlee ◽  
Xiuzhen Zhang

In this paper, we present a case study of how well subject metadata (comprising headings from an international classification scheme) has been deployed in a national data catalogue, and how often data seekers use subject metadata when searching for data. Through an analysis of user search behaviour as recorded in search logs, we find evidence that users utilise subject metadata for data discovery. Since approximately half of the records ingested by the catalogue did not include subject metadata at the time of harvest, we experimented with automatic subject classification approaches to enrich these records and provide additional support for user search and data discovery. Our results show that automatic methods work well for well-represented subject categories, which tend to have features that distinguish them from other categories. Our findings have implications for data catalogue providers: they should invest more effort in enhancing the quality of data records by adequately describing records in under-represented subject categories.
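The abstract does not specify which automatic classification method was used; as a generic illustration of assigning a subject category to a record from its text, here is a minimal multinomial Naive Bayes sketch over bag-of-words features (all names and the toy categories are hypothetical):

```python
import math
from collections import Counter

# Hypothetical sketch: assign a subject category to a metadata record's text
# using multinomial Naive Bayes with add-alpha smoothing and uniform priors.

def train(labeled_docs):
    """labeled_docs: list of (category, text); returns per-category word counts."""
    counts = {}
    for category, text in labeled_docs:
        counts.setdefault(category, Counter()).update(text.lower().split())
    return counts

def classify(text, counts, alpha=1.0):
    """Pick the category maximizing the smoothed log-likelihood."""
    vocab = {w for wc in counts.values() for w in wc}
    best, best_lp = None, float("-inf")
    for category, wc in counts.items():
        total = sum(wc.values())
        lp = sum(math.log((wc[w] + alpha) / (total + alpha * len(vocab)))
                 for w in text.lower().split())
        if lp > best_lp:
            best, best_lp = category, lp
    return best
```

Such a model performs best when a category's vocabulary is distinctive and well represented in training data, which matches the finding reported above.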


Author(s):  
Foyzul Hassan ◽  
Chetan Bansal ◽  
Nachiappan Nagappan ◽  
Thomas Zimmermann ◽  
Ahmed Hassan Awadallah
Keyword(s):  

2020 ◽  
Vol 1 ◽  
pp. 1-21
Author(s):  
Haiqi Xu ◽  
Ehsan Hamzei ◽  
Enkhbold Nyamsuren ◽  
Han Kruiger ◽  
Stephan Winter ◽  
...  

Abstract. Understanding the syntactic and semantic structure of geographic questions is a necessary step towards true geographic question-answering (GeoQA) machines. Geographic question corpora form the empirical basis for understanding the capabilities expected of GeoQA systems. Available corpora in English have mostly been drawn from generic Web search logs or limited user studies, supporting the focus of GeoQA systems on retrieving factoids: factual knowledge about particular places and everyday processes. Yet the majority of questions enquired about in the spatial sciences go beyond simple place facts, with more complex analytical intents informing the questions. In this paper, we introduce a new corpus of geo-analytic questions drawn from English textbooks and scientific articles. We analyse and compare this corpus with two general-purpose GeoQA corpora in terms of grammatical complexity and semantic concepts, using a new parsing method that allows us to differentiate and quantify patterns of a question’s intent.


2020 ◽  
Vol 9 (1) ◽  
pp. 2046-2048

One of the major challenges a developer may face is security threats in labelled data, which comprises system logs, network traffic, or other enriched data with a threat/non-threat classification. A few earlier studies categorised URLs into specific categories such as Arts or Technology. In this paper, the main research focus is the classification of users based on their search logs (URLs). Because it is difficult to differentiate users manually from search logs, we train a machine learning model that takes raw data as input and classifies each user as genuine or malign; this model helps with intrusion detection and suspicious-activity detection. We first gather data on past malicious URLs as a training set for a Naïve Bayes algorithm to detect malicious users. An effectively implemented KNN algorithm detects malign users with an accuracy of up to 94.28%. With machine learning classifiers such as Naïve Bayes, KNN, and Random Forest, we can classify malign and genuine users.
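As a rough illustration of the KNN step described in this abstract (the feature set here is invented for illustration and is not the feature set, data, or accuracy reported by the paper), a nearest-neighbour vote over simple URL-derived features might look like this:

```python
# Hypothetical sketch: k-nearest-neighbour classification of users as genuine
# or malign from toy URL features; all names and features are illustrative.

def url_features(url):
    """Toy features: URL length, digit count, count of '-' and '@' characters."""
    return [len(url), sum(ch.isdigit() for ch in url), url.count("-") + url.count("@")]

def knn_classify(url, labeled_urls, k=3):
    """labeled_urls: list of (url, label); majority vote among the k nearest."""
    feats = url_features(url)
    nearest = sorted(labeled_urls,
                     key=lambda item: sum((a - b) ** 2
                                          for a, b in zip(url_features(item[0]), feats)))
    votes = [label for _, label in nearest[:k]]
    return max(set(votes), key=votes.count)
```

A real system would use richer lexical and host-based features and a labelled corpus of known-malicious URLs, as the abstract describes.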

