Report on the 2nd workshop on bridging the gap between information science, information retrieval and data science (BIRDS 2021)

2021 ◽  
Vol 55 (1) ◽  
pp. 1-6
Author(s):  
Ingo Frommholz ◽  
Haiming Liu ◽  
Massimo Melucci

The aim of the BIRDS workshop (Bridging the Gap between Information Science, Information Retrieval and Data Science) is to bring together the Data Science, Information Retrieval, Information Science and HCI communities. BIRDS 2021 is the second in a series of workshops and was held in conjunction with CHIIR 2021, following a successful event at SIGIR 2020. It consisted of a selection of accepted papers and invited talks. This article reports on BIRDS 2021, discusses topics, tasks, approaches and data sets, and outlines future directions for interdisciplinary work.

Author(s):  
José Augusto Salim ◽  
Antonio Saraiva

For those biologists and biodiversity data managers who are unfamiliar with information science practices of data standardization, the use of complex software to assist in the creation of standardized datasets can be a barrier to sharing data. Since the ratification of the Darwin Core Standard (DwC) (Darwin Core Task Group 2009) by Biodiversity Information Standards (TDWG) in 2009, many datasets have been published and shared through a variety of data portals. In the early stages of biodiversity data sharing, the protocol Distributed Generic Information Retrieval (DiGIR), progenitor of DwC, and later the protocols BioCASe and the TDWG Access Protocol for Information Retrieval (TAPIR) (De Giovanni et al. 2010) were introduced for the discovery, search and retrieval of distributed data, simplifying data exchange between information systems. Although these protocols are still in use, they are known to be inefficient for transferring large amounts of data (GBIF 2017). For this reason, in 2011 the Global Biodiversity Information Facility (GBIF) introduced the Darwin Core Archive (DwC-A), which allows more efficient data transfer and has become the preferred format for publishing data in the GBIF network. DwC-A is a structured collection of text files that uses DwC terms to produce a single, self-contained dataset. Many tools for assisting data sharing using DwC-A have been introduced, such as the Integrated Publishing Toolkit (IPT) (Robertson et al. 2014), the Darwin Core Archive Assistant (GBIF 2010) and the Darwin Core Archive Validator. Although these tools promote and facilitate data sharing, many users have difficulties using them, mainly because of the lack of information science training in the biodiversity curriculum (Convention on Biological Diversity 2012, Enke et al. 2012). Most users are, however, very familiar with spreadsheets for storing and organizing their data, yet the adoption of the available solutions requires data transformation as well as training in information science and, more specifically, biodiversity informatics. For an example of how spreadsheets can simplify data sharing, see Stoev et al. (2016). In order to provide a more "familiar" approach to data sharing using DwC-A, we introduce a new tool as a Google Sheets Add-on. The Add-on, called the Darwin Core Archive Assistant Add-on, can be installed in the user's Google Account from the G Suite Marketplace and used in conjunction with the Google Sheets application. The Add-on assists the mapping of spreadsheet columns/fields to DwC terms (Fig. 1), similar to IPT, but with the advantage that it does not require the user to export the spreadsheet and import it into another software application. Additionally, the Add-on facilitates the creation of a star schema in accordance with DwC-A by defining a "CORE_ID" (e.g. occurrenceID, eventID, taxonID) field between the sheets of a document (Fig. 2). The Add-on also provides an Ecological Metadata Language (EML) (Jones et al. 2019) editor (Fig. 3) with a minimal set of fields to be filled in (i.e., the mandatory fields required by IPT), and it helps users to generate and share DwC-Archives stored in the user's Google Drive, which can be downloaded as a DwC-A or automatically uploaded to another public storage resource such as a user's Zenodo account (Fig. 4).
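
To make the archive format concrete, the following is a minimal, illustrative sketch (not part of the Add-on) of how a spreadsheet-style table can be packaged as a Darwin Core Archive using only the Python standard library. The file names, example record and term selection are hypothetical, and the meta.xml shown is deliberately minimal rather than a complete descriptor.

```python
# Illustrative sketch: package a spreadsheet-style table as a minimal
# Darwin Core Archive (occurrence core + meta.xml) with the standard library.
# File names and the example row are hypothetical.
import csv, io, zipfile

DWC = "http://rs.tdwg.org/dwc/terms/"
columns = ["occurrenceID", "scientificName", "eventDate"]  # columns mapped to DwC terms
rows = [["occ-001", "Apis mellifera", "2021-03-14"]]       # one example record

# Core data file (tab-delimited, with a header row).
core = io.StringIO()
writer = csv.writer(core, delimiter="\t", lineterminator="\n")
writer.writerow(columns)
writer.writerows(rows)

# meta.xml describing the star schema: the <id> column is what would link the
# core table to extension tables; here there is only a core table.
fields = "\n".join(
    f'    <field index="{i}" term="{DWC}{c}"/>' for i, c in enumerate(columns)
)
meta = f"""<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="{DWC}Occurrence" fieldsTerminatedBy="\\t"
        linesTerminatedBy="\\n" ignoreHeaderLines="1" encoding="UTF-8">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
{fields}
  </core>
</archive>"""

with zipfile.ZipFile("dwca.zip", "w") as zf:
    zf.writestr("occurrence.txt", core.getvalue())
    zf.writestr("meta.xml", meta)
    # A real archive would also carry an eml.xml file with the dataset metadata.
```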
We expect that the Google Sheets Add-on introduced here, in conjunction with IPT, will promote biodiversity data sharing in a standardized format, as it requires minimal training and simplifies the process of data sharing from the user's perspective, particularly for those users who are not familiar with IPT but have historically worked with spreadsheets. Although the DwC-A generated by the Add-on still needs to be published using IPT, the Add-on provides a simpler interface (i.e., a spreadsheet) than IPT for mapping data sets to DwC. Even though IPT includes many more features than the Darwin Core Archive Assistant Add-on, we expect that the Add-on can be a "starting point" for users unfamiliar with biodiversity informatics before they move on to more advanced data publishing tools. The Zenodo integration, on the other hand, allows users to share and cite their standardized data sets without publishing them via IPT, which can be useful for users without access to an IPT installation. We are also working on new features; future releases will include the automatic generation of globally unique identifiers for shared records, support for additional data standards and DwC extensions, and integration with the GBIF and IPT REST APIs.


1995 ◽  
Vol 31 (2) ◽  
pp. 193-204 ◽  
Author(s):  
Koen Grijspeerdt ◽  
Peter Vanrolleghem ◽  
Willy Verstraete

A comparative study of several recently proposed one-dimensional sedimentation models has been carried out by fitting these models to steady-state and dynamic concentration profiles obtained in a down-scaled secondary decanter. The models were evaluated with several a posteriori model selection criteria. Since the purpose of the modelling task is to perform on-line simulations, the calculation time was used as one of the selection criteria. Finally, the practical identifiability of the models for the available data sets was also investigated. It could be concluded that the model of Takács et al. (1991) gave the most reliable results.
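
As an illustration of the parameter estimation underlying such a comparison, the sketch below fits the double-exponential settling velocity function of Takács et al. (1991) to a small set of hypothetical settling measurements with SciPy; the data points and starting values are invented for demonstration only.

```python
# Minimal sketch: fit the Takács et al. (1991) double-exponential settling
# velocity to hypothetical (concentration, velocity) measurements.
import numpy as np
from scipy.optimize import curve_fit

def takacs_velocity(X, v0, r_h, r_p, X_min):
    """Settling velocity as a function of solids concentration X.
    The full model additionally caps the velocity at a maximum practical value."""
    v = v0 * (np.exp(-r_h * (X - X_min)) - np.exp(-r_p * (X - X_min)))
    return np.maximum(v, 0.0)

# Hypothetical measurements (e.g., X in g/L, v in m/h).
X_obs = np.array([0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0])
v_obs = np.array([3.1, 4.0, 3.2, 2.3, 1.6, 0.8, 0.4])

popt, pcov = curve_fit(takacs_velocity, X_obs, v_obs,
                       p0=[5.0, 0.4, 3.0, 0.1], maxfev=10000)
print("fitted v0, r_h, r_p, X_min:", popt)
# The residuals and the parameter covariance matrix (pcov) are the raw material
# for a posteriori selection criteria and practical identifiability analysis.
```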


Libri ◽  
2020 ◽  
Vol 70 (3) ◽  
pp. 227-237
Author(s):  
Mahdi Zeynali-Tazehkandi ◽  
Mohsen Nowkarizi

Evaluation of information retrieval systems is a fundamental topic in Library and Information Science. The aim of this paper is to connect the system-oriented and the user-oriented approaches to the relevant philosophical schools. A review of the related literature shows that the evaluation of information retrieval systems is successful when it benefits from both the system-oriented and the user-oriented approach (a composite approach). The system-oriented approach is rooted in Parmenides' philosophy of stability (the immovable), which Plato accepts and attributes to the world of forms; the user-oriented approach is rooted in Heraclitus' philosophy of flux (motion), which Plato relegates to the tangible world. Using Plato's theory thus offers a comprehensive approach to understanding the concept of relevance. The theoretical and philosophical foundations determine the type of research methods and techniques; therefore, Plato's dialectical method is an appropriate composite method for evaluating information retrieval systems.


Author(s):  
Christian Luksch ◽  
Lukas Prost ◽  
Michael Wimmer

We present a real-time rendering technique for photometric polygonal lights. Our method uses a numerical integration technique based on a triangulation to calculate noise-free diffuse shading. We include a dynamic point in the triangulation that provides continuous near-field illumination resembling the shape of the light emitter and its characteristics. We evaluate the accuracy of our approach with a diverse selection of photometric measurement data sets in a comprehensive benchmark framework. Furthermore, we provide an extension for specular reflection on surfaces of arbitrary roughness that facilitates the use of existing real-time shading techniques. Our technique is easy to integrate into real-time rendering systems and extends the range of possible applications with photometric area lights.
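
The paper's noise-free integration scheme is not reproduced here, but a naive reference for the quantity being computed, namely diffuse irradiance from a polygonal emitter, can be sketched by fan-triangulating the polygon and Monte Carlo sampling each triangle, as below; the geometry and radiance values are hypothetical.

```python
# Illustrative sketch (not the authors' method): estimate diffuse irradiance at a
# shading point from a planar polygonal light of uniform radiance by
# fan-triangulating the polygon and area-sampling each triangle.
import numpy as np

rng = np.random.default_rng(0)

def irradiance(p, n, polygon, light_normal, radiance, samples_per_tri=256):
    """Irradiance at point p with unit normal n; polygon vertices in CCW order."""
    E = 0.0
    v0 = polygon[0]
    for a, b in zip(polygon[1:-1], polygon[2:]):        # fan triangulation
        area = 0.5 * np.linalg.norm(np.cross(a - v0, b - v0))
        # Uniform barycentric samples on the triangle.
        u = rng.random((samples_per_tri, 2))
        flip = u.sum(axis=1) > 1.0
        u[flip] = 1.0 - u[flip]
        pts = v0 + u[:, :1] * (a - v0) + u[:, 1:] * (b - v0)
        d = pts - p
        r2 = np.einsum("ij,ij->i", d, d)
        w = d / np.sqrt(r2)[:, None]                    # unit directions to the light
        cos_r = np.clip(w @ n, 0.0, None)               # receiver cosine
        cos_e = np.clip(-(w @ light_normal), 0.0, None) # emitter cosine
        E += radiance * area * np.mean(cos_r * cos_e / r2)
    return E

# Example: unit square light 1 m above the origin, facing down.
quad = np.array([[-.5, -.5, 1], [.5, -.5, 1], [.5, .5, 1], [-.5, .5, 1]], float)
print(irradiance(np.zeros(3), np.array([0., 0., 1.]), quad,
                 np.array([0., 0., -1.]), radiance=100.0))
```

Unlike this stochastic reference, a triangulation-based quadrature of the kind the abstract describes evaluates the contribution of each triangle deterministically, which is what makes the diffuse shading noise-free.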


2020 ◽  
pp. 102986492097216
Author(s):  
Gaelen Thomas Dickson ◽  
Emery Schubert

Background: Music is thought to be beneficial as a sleep aid. However, little research has explicitly investigated the specific characteristics of music that aid sleep and some researchers assume that music described as generically sedative (slow, with low rhythmic activity) is necessarily conducive to sleep, without directly interrogating this assumption. This study aimed to ascertain the features of music that aid sleep. Method: As part of an online survey, 161 students reported the pieces of music they had used to aid sleep, successfully or unsuccessfully. The participants reported 167 pieces, some more often than others. Nine features of the pieces were analyzed using a combination of music information retrieval methods and aural analysis. Results: Of the pieces reported by participants, 78% were successful in aiding sleep. The features they had in common were that (a) their main frequency register was middle range frequencies; (b) their tempo was medium; (c) their articulation was legato; (d) they were in the major mode, and (e) lyrics were present. They differed from pieces that were unsuccessful in aiding sleep in that (a) their main frequency register was lower; (b) their articulation was legato, and (c) they excluded high rhythmic activity. Conclusion: Music that aids sleep is not necessarily sedative music, as defined in the literature, but some features of sedative music are associated with aiding sleep. In the present study, we identified the specific features of music that were reported to have been successful and unsuccessful in aiding sleep. The identification of these features has important implications for the selection of pieces of music used in research on sleep.
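
As an illustration of the kind of music information retrieval analysis such a study relies on, the sketch below estimates tempo and a rough frequency-register proxy (the spectral centroid) with the librosa library; the audio file name is hypothetical, and features such as mode, articulation and the presence of lyrics would still require aural or symbolic analysis.

```python
# Hedged sketch: extract tempo and a frequency-register proxy from an audio file.
# The file name is hypothetical; librosa is assumed to be installed.
import numpy as np
import librosa

y, sr = librosa.load("piece.mp3", mono=True)

# Tempo estimate in beats per minute.
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
tempo = float(np.atleast_1d(tempo)[0])

# Spectral centroid as a rough proxy for the main frequency register.
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)

print(f"tempo approx. {tempo:.0f} BPM, "
      f"median spectral centroid approx. {np.median(centroid):.0f} Hz")
```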


2021 ◽  
pp. 026553222110361
Author(s):  
Chao Han

Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research, because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the selection of prospective students, the certification of interpreters, and the confirmation or refutation of research hypotheses. However, few reviews exist that provide a comprehensive mapping of relevant practice and research. The present article therefore aims to offer a state-of-the-art review, summarizing the existing literature and identifying potential lacunae. In particular, the article first provides an overview of interpreting ability/competence and relevant research, followed by the main testing and assessment practices (e.g., assessment tasks, assessment criteria, scoring methods, specificities of scoring operationalization), with a focus on operational diversity and psychometric properties. Second, the review describes a limited yet steadily growing body of empirical research that examines rater-mediated interpreting assessment, and casts light on automatic assessment as an emerging research topic. Third, the review discusses epistemological, psychometric, and practical challenges facing interpreting testers. Finally, it identifies future directions that could address the challenges arising from fast-changing pedagogical, educational, and professional landscapes.
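
One small, hypothetical example of the psychometric checks used in rater-mediated assessment is inter-rater agreement; the sketch below computes quadratic-weighted Cohen's kappa between two raters' band scores with scikit-learn (the scores are invented for illustration).

```python
# Hedged sketch: quadratic-weighted Cohen's kappa for two raters scoring the
# same set of interpreting performances on a 1-5 band scale (hypothetical data).
from sklearn.metrics import cohen_kappa_score

rater_a = [5, 4, 3, 5, 2, 4, 3, 5, 1, 4]
rater_b = [5, 3, 3, 4, 2, 4, 2, 5, 2, 4]

kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"quadratic-weighted kappa = {kappa:.2f}")
```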


2017 ◽  
Vol 21 (9) ◽  
pp. 4747-4765 ◽  
Author(s):  
Clara Linés ◽  
Micha Werner ◽  
Wim Bastiaanssen

The implementation of drought management plans contributes to reducing the wide range of adverse impacts caused by water shortage. A crucial element of the development of drought management plans is the selection of appropriate indicators and their associated thresholds to detect drought events and monitor their evolution. Drought indicators should be able to detect emerging drought processes that will lead to impacts with sufficient anticipation to allow measures to be undertaken effectively. However, in the selection of appropriate drought indicators, the connection to the final impacts is often disregarded. This paper explores the utility of remotely sensed data sets to detect early stages of drought at the river basin scale and to determine how much time can be gained to inform operational land and water management practices. Six different remote sensing data sets with different spectral origins and measurement frequencies are considered, complemented by a group of classical in situ hydrologic indicators. Their predictive power to detect past drought events is tested in the Ebro Basin. Qualitative (binary information based on media records) and quantitative (crop yields) data on drought events and impacts spanning a period of 12 years are used as a benchmark in the analysis. Results show that early signs of drought impacts can be detected up to 6 months before impacts are reported in newspapers, with the best correlation–anticipation relationships obtained for the standardised precipitation index (SPI), the normalised difference vegetation index (NDVI) and evapotranspiration (ET). Soil moisture (SM) and land surface temperature (LST) also offer good anticipation but with weaker correlations, while gross primary production (GPP) presents moderate positive correlations only for some of the rain-fed areas. Although classical hydrological information from water levels and water flows provided better anticipation than the remote sensing indicators in most of the areas, its correlations were weaker. The indicators show a consistent behaviour with respect to the different levels of crop yield in rain-fed areas among the analysed years, with SPI, NDVI and ET again providing the strongest correlations. Overall, the results confirm the ability of remote sensing products to anticipate reported drought impacts; they therefore appear to be a useful source of information to support drought management decisions.
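
A simple, hypothetical sketch of the correlation–anticipation idea: shift a monthly indicator series forward by increasing lead times and correlate it with a binary record of reported impacts. The column names and input file below are assumptions, not the study's data.

```python
# Hedged sketch: lagged correlation between a monthly drought indicator and a
# binary series of reported impacts, to gauge how far ahead the indicator signals.
# The CSV file and its column names ("month", "spi", "impact") are hypothetical.
import pandas as pd

df = pd.read_csv("basin_monthly.csv", parse_dates=["month"], index_col="month")
# "spi": monthly indicator value; "impact": 1 if impacts were reported that month.

for lead in range(0, 7):  # 0 to 6 months of anticipation
    # shift(lead) aligns the indicator from `lead` months earlier with this month's impacts
    r = df["spi"].shift(lead).corr(df["impact"])
    print(f"lead {lead} months: Pearson r = {r:+.2f}")
# For SPI, a more negative (drier) value precedes impacts, so strong negative
# correlations at long leads indicate useful anticipation.
```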

