ACM SIGIR Forum | ScienceGate

Neural methods for effective, efficient, and exposure-aware information retrieval

ACM SIGIR Forum ◽ 10.1145/3476415.3476434 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-2

Author(s): Bhaskar Mitra

Keyword(s): Information Retrieval ◽ Language Processing ◽ Large Scale ◽ Web Search ◽ Real Life ◽ Inverted Index ◽ Information Need ◽ Product Model ◽ Performance Improvements ◽ Deep Model

Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking documents, or short passages, in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms not seen during training, such as a person's name or a product model number, and should avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, retrieval involves extremely large collections, such as the document index of a commercial Web search engine, containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to retrieve efficiently from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks. We ground our contributions with a detailed survey of the growing body of neural IR literature [Mitra and Craswell, 2018].
Our key contribution towards improving the effectiveness of deep ranking models is developing the Duet principle [Mitra et al., 2017], which emphasizes the importance of incorporating evidence based on both patterns of exact term matches and similarities between learned latent representations of query and document. To efficiently retrieve from large collections, we develop a framework to incorporate query term independence [Mitra et al., 2019] into an arbitrary deep model, which enables large-scale precomputation and the use of an inverted index for fast retrieval. In the context of stochastic ranking, we further develop optimization strategies for exposure-based objectives [Diaz et al., 2020]. Finally, this dissertation also summarizes our contributions towards benchmarking neural IR models in the presence of large training datasets [Craswell et al., 2019] and explores the application of neural methods to other IR tasks, such as query auto-completion.
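The query term independence idea mentioned in the abstract can be illustrated with a small sketch. It assumes that a document's score for a query is the sum of per-term scores, so each term-document score can be computed offline and stored in an inverted index. The function names (build_index, rank, tf_score) are hypothetical, and tf_score is a simple relative term frequency standing in for a learned model; it is not the thesis's model. As a further simplification, only terms that occur in a document are scored.

```python
from collections import defaultdict


def tf_score(term, tokens):
    # Stand-in for a learned per-term scorer (an assumption for illustration):
    # relative frequency of the term in the document.
    return tokens.count(term) / len(tokens)


def build_index(docs, term_score):
    # Offline step: precompute one score per (term, document) pair and
    # store it in an inverted index mapping term -> {doc_id: score}.
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for term in set(tokens):
            index[term][doc_id] = term_score(term, tokens)
    return index


def rank(query, index, k=10):
    # Query time: sum the precomputed scores over the query terms.
    # No model is evaluated here, only posting-list lookups.
    scores = defaultdict(float)
    for term in query.lower().split():
        for doc_id, score in index.get(term, {}).items():
            scores[doc_id] += score
    # Sort by descending score, breaking ties by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


docs = {
    "d1": "neural ranking models",
    "d2": "inverted index retrieval index",
}
index = build_index(docs, tf_score)
print(rank("index retrieval", index))
```

Because the per-term scores are additive, the expensive scoring happens once at indexing time, and query-time cost depends only on the length of the posting lists touched by the query.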

Report on the CyCAT winter school on fairness, accountability, transparency and ethics (FATE) in AI

ACM SIGIR Forum ◽ 10.1145/3476415.3476419 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-9

Author(s): Styliani Kleanthous ◽ Jahna Otterbacher ◽ Jo Bates ◽ Fausto Giunchiglia ◽ Frank Hopfgartner ◽ ...

Keyword(s): Search Engines ◽ Interdisciplinary Collaboration ◽ Hands On ◽ Winter School ◽ Common Understanding ◽ Holistic Manner

The first FATE Winter School, organized by the Cyprus Center for Algorithmic Transparency (CyCAT), provided a forum for both students and senior researchers to examine the complex topic of Fairness, Accountability, Transparency and Ethics (FATE). Through a program that included two invited keynotes, as well as sessions led by CyCAT partners across Europe and Israel, participants were exposed to a range of approaches to FATE in a holistic manner. During the Winter School, the team also organized a hands-on activity to evaluate a tool-based intervention in which participants interacted with eight prototypes of bias-aware search engines. Finally, participants were invited to join one of four collaborative projects coordinated by CyCAT, thus furthering common understanding and interdisciplinary collaboration on this emerging topic.

Report on the 11th bibliometric-enhanced information retrieval workshop (BIR 2021)

ACM SIGIR Forum ◽ 10.1145/3476415.3476426 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-9

Author(s): Ingo Frommholz ◽ Guillaume Cabanac ◽ Philipp Mayr ◽ Suzan Verberne

Keyword(s): Information Retrieval ◽ Lessons Learned ◽ Future Research ◽ Research Questions ◽ Important Branch

The 11th Bibliometric-enhanced Information Retrieval Workshop (BIR 2021) was held on April 1st, 2021, at ECIR 2021 as a virtual event. The interdisciplinary BIR workshop series aims to bring together researchers from different communities, especially Scientometrics/Bibliometrics and Information Retrieval. We report on the 11th BIR, its invited talks and accepted papers. Lessons learned from BIR 2021 are discussed, and potential future research questions are identified that position Bibliometric-enhanced IR as an exciting, special, yet important branch of IR research.

Report on the ECIR 2021 discussion panel on open access

ACM SIGIR Forum ◽ 10.1145/3476415.3476425 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-4

Author(s): Djoerd Hiemstra

Keyword(s): Information Retrieval ◽ Open Access ◽ Computational Linguistics ◽ Business Models ◽ Open Access Publishing ◽ Wednesday Morning ◽ Discussion Panel

On 31 March 2021, the Wednesday morning of ECIR 2021, the conference participants joined with seven panellists in a discussion on Open Access and Information Retrieval (IR), or more accurately, on the lack of open access publishing in IR. Discussion topics included the experience of researchers with open access in Africa; business models for open access, in particular how to run a sustainable open access conference like ECIR; open access plans at Springer, the BCS and the ACM; and finally, experience with open access publishing in related fields, notably in Computational Linguistics.

Toward a fairer information retrieval system

ACM SIGIR Forum ◽ 10.1145/3476415.3476429 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-2

Author(s): Ruoyuan Gao

Keyword(s): Information Retrieval ◽ Optimization Problems ◽ Solution Space ◽ Post Processing ◽ Unified Framework ◽ Real World Datasets ◽ New Perspective ◽ Evaluation Metric ◽ Google Search ◽ The Relationship

With the increasing popularity and social influence of information retrieval (IR) systems, various studies have raised concerns about the presence of bias in IR and the social responsibilities of IR systems. Techniques for addressing these issues can be classified into pre-processing, in-processing and post-processing. Pre-processing reduces bias in the data that is fed into machine learning models. In-processing encodes fairness constraints as a part of the objective function or learning process. Post-processing operates as a top layer over the trained model to reduce the presentation bias exposed to users. This dissertation explored ways to bring the pre-processing and post-processing approaches, together with fairness-aware evaluation metrics, into a unified framework as an attempt to break the vicious cycle of bias and improve fairness in IR. We first investigated the existing bias present in search engine results. Specifically, we focused on top-k fairness ranking in terms of the statistical parity and disparate impact fairness definitions. With Google search and a general-purpose text cluster as a lens, we explored several topical diversity fairness ranking strategies to understand the relationship between relevance and fairness in search results. Our experimental results showed that different fairness ranking strategies resulted in distinct utility scores and performed differently on distinct datasets. Second, to further investigate the relationship between data and fairness algorithms, we developed a statistical framework that was able to facilitate various analyses and decision making. Our framework could effectively and efficiently estimate the domain of data and solution space. We derived theoretical expressions to identify the fairness and relevance bounds for data of different distributions, and applied them to both synthetic and real-world datasets.
We presented a series of use cases to demonstrate how our framework was applied to associate data and provide insights into fairness optimization problems. Third, we proposed an evaluation metric, FAIR, for ranking results that encoded fairness, diversity, novelty and relevance. This metric offered a new perspective on evaluating fairness-aware ranking results. Based on this metric, we developed an effective ranking algorithm that jointly optimized for fairness and utility. Our experiments showed that our new metric was able to highlight results that achieved good user utility and fair information exposure at the same time. We showed how the FAIR metric related to existing metrics through correlation analysis and case studies, and demonstrated the effectiveness of our FAIR-based algorithm.
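As a rough illustration of the two fairness notions named in the abstract, the sketch below measures them on a top-k cut of a ranking. It uses common textbook formulations (the difference and the ratio of per-group selection rates), which are an assumption here and not necessarily the dissertation's exact definitions. The function names and the toy ranking are hypothetical.

```python
def selection_rates(ranking, groups, k):
    # Fraction of each group's items that appear in the top k positions.
    top = set(ranking[:k])
    rates = {}
    for group in set(groups.values()):
        members = [item for item in ranking if groups[item] == group]
        rates[group] = sum(item in top for item in members) / len(members)
    return rates


def statistical_parity_difference(rates, a, b):
    # 0 means both groups are selected into the top k at the same rate.
    return rates[a] - rates[b]


def disparate_impact_ratio(rates, a, b):
    # Ratio of selection rates; 1 means parity.
    # Returns infinity when group b has a selection rate of zero.
    return rates[a] / rates[b] if rates[b] else float("inf")


# Toy ranking (best first) with a hypothetical group label per item.
ranking = ["a1", "b1", "a2", "a3", "b2", "b3"]
groups = {item: item[0] for item in ranking}

rates = selection_rates(ranking, groups, k=3)
print(rates)
print(statistical_parity_difference(rates, "a", "b"))
print(disparate_impact_ratio(rates, "b", "a"))
```

In this toy case two of the three group-a items and one of the three group-b items reach the top 3, so the two groups are not selected at the same rate under either measure.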

Report on the 30th the web conference 2021 (TheWebConf2021)

ACM SIGIR Forum ◽ 10.1145/3476415.3476427 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-9

Author(s): Aljaž Košmerlj ◽ Marko Grobelnik ◽ Jure Leskovec

Keyword(s): Social Sciences ◽ Artificial Intelligence ◽ Knowledge Representation ◽ Mobile Computing ◽ Wide Spectrum ◽ The State ◽ Science Conference ◽ The Web

The Web Conference (TheWebConf2021) is the premier computer science conference on the state of the Web and its related areas. It covers the Web from a wide spectrum of aspects, including technical areas of artificial intelligence, security, privacy, knowledge representation, and mobile computing, as well as social sciences, economics, policy, accessibility and others. The year 2021 marked the 30th edition, which was organised in Ljubljana, Slovenia, but took place fully online due to the COVID-19 pandemic.

Exploring strategies to prevent harm from web search

ACM SIGIR Forum ◽ 10.1145/3476415.3476431 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-2

Author(s): Steven Zimmerman

Keyword(s): Web Search ◽ Cognitive Biases ◽ Intervention Strategy ◽ Cognitive Strategies ◽ Future Research ◽ Core Component ◽ Daily Lives ◽ Pure System ◽ Reduce Risk ◽ Harm Prevention

Web search, the process of seeking and finding information online, is a ubiquitous activity ingrained in the lives of many individuals and much of broader society. This activity, which has brought many benefits to individuals and society, has also opened the door to many harms, such as echo chambers, loss of privacy and exposure to misinformation. Members of the information retrieval (IR) community now recognize the dangers of the search technologies commonplace in our daily lives. The upshot of this recognition is a growing effort by the IR community to address these dangers. These efforts focus heavily on system-oriented solutions, but give limited focus to behavioural and cognitive biases and behaviours of the search and even less attention to interventions designed to address these biases and behaviours. As such, a theoretical framework is proposed, with behavioural and cognitive strategies as a core component of interactive Web search environments designed to minimize harm. Using the framework as the foundation, this thesis presents a number of offline and online studies to evaluate nudging, a popular intervention strategy rooted in the field of behavioural economics, and boosting, a successful intervention strategy from the cognitive sciences, as strategies to reduce risk of harm in Web search. The key takeaway from these studies is that both boosting and nudging should be considered viable approaches for harm prevention in Web search environments, in addition to pure system and algorithmic solutions. Additional contributions of this thesis include methods of study design for the comparison of multiple paradigms that promote improved decision making, along with a set of evaluation metrics to measure the success of the IR system and user performance as they relate to the harms being prevented. Future research is needed to confirm the effectiveness of these strategies for other types of harms.

Report on the CHIIR 2021 third workshop on evaluation of personalisation in information retrieval (WEPIR 2021)

ACM SIGIR Forum ◽ 10.1145/3476415.3476422 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-11

Author(s): Gareth J. F. Jones ◽ Nicholas J. Belkin ◽ Noriko Kando ◽ Gabriella Pasi

Keyword(s): Information Retrieval ◽ The Third ◽ Information Interaction ◽ Further Development ◽ Common Understanding

The Third Workshop on Evaluation of Personalisation in Information Retrieval (WEPIR 2021) was held in conjunction with the ACM SIGIR Conference on Human Information Interaction & Retrieval (CHIIR 2021) in Canberra, Australia, as a virtual event. WEPIR 2021 followed on from the first and second WEPIRs held at CHIIR 2018 and 2019. The purpose of the workshop was again to bring together researchers from different backgrounds, interested in advancing the evaluation of personalisation in information retrieval. The workshop focused on further development of a common understanding of the challenges, requirements and practical limitations of personalisation in information retrieval and its evaluation.

Report on supporting and understanding of conversational dialogues workshop (SUD 2021) at WSDM 2021

ACM SIGIR Forum ◽ 10.1145/3476415.3476420 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-7

Author(s): Debasis Ganguly ◽ Gareth J. F. Jones ◽ Procheta Sen ◽ Manisha Verma ◽ Dipasree Pal

Keyword(s): Data Mining ◽ Information Retrieval ◽ Web Search ◽ Automated Methods ◽ The Web

This report describes the workshop on Supporting and Understanding of (multi-party) conversational Dialogues (SUD), organized as a part of the Web Search and Data Mining conference (WSDM) 2021. The aim of the SUD workshop was to encourage researchers to investigate automated methods to analyze and understand conversations. We also discuss the release of a dataset that would be useful in IR research on conversations. The dataset was constructed to support the data challenge in the SUD workshop and its precursor event, the Retrieval from Conversational Dialogues (RCD) track at the Forum of Information Retrieval and Evaluation (FIRE) 2020.

The information retrieval anthology 2021

ACM SIGIR Forum ◽ 10.1145/3476415.3476417 ◽ 2021 ◽ Vol 55 (1) ◽ pp. 1-18

Author(s): Martin Potthast ◽ Benno Stein ◽ Matthias Hagen

Keyword(s): Information Retrieval ◽ Search Engine ◽ Full Text ◽ Use Cases ◽ Text Search ◽ Full Text Search ◽ Comprehensive Collection ◽ To Come

The Information Retrieval Anthology, IR Anthology for short, is an endeavor to create a comprehensive collection of metadata and full texts of IR-related publications. We report on its first release, the use cases it can serve, as well as the challenges lying ahead to develop it towards a resource that serves the IR community for years to come. The IR Anthology's metadata browser and full text search engine are available at IR.webis.de.

Published By Association For Computing Machinery
