Transparency, provenance and collections as data

2021 ◽  
Vol 31 (1) ◽  
pp. 1-13
Author(s):  
Sarah Ames

‘Collections as data’ has become a core activity for libraries in recent years: it is important that we make collections available in machine-readable formats to enable and encourage computational research. However, while this is a necessary output, discussion around the processes and workflows required to turn collections into data, and to make collections data available openly, is just as valuable. With libraries increasingly becoming producers of their own collections – presenting data from digitisation and digital production tools as part of datasets, for example – and making collections available at scale through mass-digitisation programmes, the trustworthiness of our processes comes into question. In a world of big data, often of unclear origins, how can libraries be transparent about the ways in which collections are turned into data, how do we ensure that biases in our collections are recognised and not amplified, and how do we make these datasets available openly for reuse? This paper presents a case study of work underway at the National Library of Scotland to present collections as data in an open and transparent way – from establishing a new Digital Scholarship Service, to workflows and online presentation of datasets. It considers the changes to existing processes needed to produce the Data Foundry, the National Library of Scotland's open data delivery platform, and explores the practical challenges of presenting collections as data online in an open, transparent and coherent manner.

IFLA Journal ◽  
2021 ◽  
pp. 034003522110654
Author(s):  
Sarah Ames ◽  
Lucy Havens

The National Library of Scotland’s Digital Scholarship Service has been releasing collections as data on its data-delivery platform, the Data Foundry, since September 2019. Following the COVID-19 lockdown, this service experienced significantly higher traffic, as library users increasingly made use of online resources. To ensure that as many users as possible were able to explore the datasets on the Data Foundry, the Library invested in a Digital Research Intern post, with a remit to provide introductory analysis of the Data Foundry collections using Jupyter Notebooks. This article provides a case study of this project, explaining the Library’s work to date around its new Digital Scholarship Service and releasing datasets on the Data Foundry; the reasoning behind the decision to begin to provide Jupyter Notebooks; the Notebooks themselves and what types of analysis they contain, as well as the challenges faced in creating them; and the publication and impact of the Notebooks.


2020 ◽  
Vol 7 (2) ◽  
pp. 205395172097057
Author(s):  
Sarah Ames ◽  
Stuart Lewis

With a mass digitisation programme underway and the addition of non-print legal deposit and web archive collections, the National Library of Scotland is now both producing and collecting data at an unprecedented rate, with over 5PB of storage in the Library’s data centres. As well as the opportunities to support large-scale analysis of the collections, this also presents new challenges around data management, storage, rights, formats, skills and access. Furthermore, by assuming the role of both creators and collectors, libraries face broader questions about the concepts of ‘collections’ and ‘heritage’, and the ethical implications of collecting practices. While the ‘collections as data’ movement has encouraged cultural heritage organisations to present collections in machine-readable formats, new services, processes and tools also need to be established to enable these emerging forms of research, and new modes of working need to be established to take into account an increasing need for transparency around the creation and presentation of digital collections. This commentary explores the National Library of Scotland’s new digital scholarship service, the implications of this new activity and the obstacles that libraries encounter when navigating a world of Big Data.


Author(s):  
Edgar Meij ◽  
Marc Bron ◽  
Laura Hollink ◽  
Bouke Huurnink ◽  
Maarten de Rijke

2021 ◽  
Vol 46 (2) ◽  
pp. 57-63
Author(s):  
Lotte Wilms ◽  
Caleb Derven ◽  
Merisa Martinez

How can European library staff working in digital humanities connect with peers in the library sector, determine where to find relevant information about digital scholarship, provide their collections as data and be an equal partner in digital humanities research? The LIBER Digital Humanities Working Group was created as a participatory knowledge network in 2017 to address these questions. Through a series of workshops, knowledge sharing activities, and a Europe-wide survey and resulting report, the Working Group engaged with the international LIBER DH community. Useful reflections are provided on organising an open, voluntary DH community and planning for inclusive activities that benefit digital scholarship in European research libraries.


2021 ◽  
Author(s):  
Oliver Benning ◽  
Jonathan Calles ◽  
Burak Kantarci ◽  
Shahzad Khan

This article presents a practical method for assessing the risk profiles of communities by tracking or acquiring, fusing and analyzing data from public transportation, district population distribution, passenger interactions and cross-locality travel data. The proposed framework fuses these data sources into a realistic simulation of a transit network for a given time span. By offering credible insights into the impact of public transit on pandemic spread, the research findings will help to set the groundwork for tools that could provide pandemic response teams and municipalities with a robust framework for evaluating which city districts are most at risk and how to adjust municipal services accordingly.


2021 ◽  
Vol 13 (5) ◽  
pp. 1905-1923
Author(s):  
Annalisa Minelli ◽  
Carmen Ferrà ◽  
Alessandra Spagnolo ◽  
Martina Scanu ◽  
Anna Nora Tassetti ◽  
...  

Abstract. The paper presents a database of information on wrecks, natural and artificial reefs located in the Adriatic Sea, collected within the framework of the Interreg Italy–Croatia project ADRIREEF – Innovative exploitation of Adriatic Reefs in order to strengthen Blue Economy. The data collection lasted more than 1 year and included three surveys and a wide literature review. After being collected, data were harmonized and, where possible, made machine-readable. Moreover, data were extensively documented with metadata, published in a WebGIS (https://adrireef.github.io/sandbox3/, last access: 3 May 2021), and shared as open data in EMODnet (European Marine Observation and Data Network) Data Ingestion Portal through the SEANOE repository (Ferrà et al., 2020; https://doi.org/10.17882/74880). The database is composed of 285 three-dimensional records, each one described by 51 attributes. Parameters are clustered in four main groups: identification, reef description, site description, and management/exploitation information. Available literature (scientific and/or grey) was also included in the database and linked to the corresponding site.

