Data sharing, small science and institutional repositories

Results are presented from the Data Curation Profiles project research, on who is willing to share what data with whom and when. Emerging from scientists’ discussions on sharing are several dimensions suggestive of the variation in both what it means ‘to share’ and how these processes are carried out. This research indicates that data curation services will need to accommodate a wide range of subdisciplinary data characteristics and sharing practices. As part of a larger set of strategies emerging across academic institutions, institutional repositories (IRs) will contribute to the stewardship and mobilization of scientific research data for e-Research and learning. There will be particular types of data that can be managed well in an IR context when characteristics and practices are well understood. Findings from this study elucidate scientists’ views on ‘sharable’ forms of data—the particular representation that they view as most valued for reuse by others within their own research areas—and the anticipated duration for such reuse. Reported sharing incidents that provide insights into barriers to sharing and related concerns on data misuse are included.

Práticas e percepções dos pesquisadores brasileiros sobre serviços de acesso aberto a dados de pesquisa | Practices and perceptions of Brazilian researchers on open access to research data

Liinc em Revista ◽

10.18617/liinc.v15i2.4771 ◽

2019 ◽

Vol 15 (2) ◽

Author(s):

Sonia Elisa Caregnato ◽

Samile Andrea de Souza Vanz ◽

Caterina Groposo Pavão ◽

Paula Caroline Jardim Schifino Passos ◽

Eduardo Borges ◽

...

Keyword(s):

Open Access ◽

Data Sharing ◽

Research Data ◽

Exploratory Analysis ◽

Data Reuse ◽

Institutional Repositories ◽

Open Research

This article presents an exploratory analysis of practices and perceptions regarding open access to research data, based on information collected through a survey of Brazilian researchers. The 4,676 responses show that, despite great interest in the topic, evidenced by the prevalence of variables related to data sharing and use and to institutional repositories, the respondents are not clear about the main related topics. We conclude that, although the majority of the researchers state that they share research data, making these data available in an open and unrestricted way is not yet widely accepted. Keywords: Open Research Data; Data Sharing; Data Reuse.

PsychData – Experiences from 12 Years of Research Data Archiving

Septentrio Conference Series ◽

10.7557/5.3666 ◽

2015 ◽

Author(s):

Peter Weiland ◽

Ina Dehnhard

Keyword(s):

Data Sharing ◽

Large Scale ◽

Research Data ◽

Data Reuse ◽

German Research Foundation ◽

Cross Sectional ◽

Data Archiving ◽

Wide Range ◽

Domain Specific Knowledge ◽

Meta Analyses

The benefits of making research data permanently accessible through data archives are widely recognized: costs can be reduced by reusing existing data, research results can be compared and validated with results from archived studies, fraud can be more easily detected, and meta-analyses can be conducted. Apart from that, authors may gain recognition and reputation for producing the datasets. Since 2003, the accredited research data center PsychData (part of the Leibniz Institute for Psychology Information in Trier, Germany) has documented and archived research data from all areas of psychology and related fields. In the beginning, the main focus was on datasets with a high potential for reuse, e.g. longitudinal studies, large-scale cross-sectional studies, or studies that were conducted under historically unique conditions. Presently, more and more journal publishers and project funding agencies require researchers to archive their data and make them accessible to the scientific community. Therefore, PsychData also has to serve this need. In this presentation we report on our experiences in operating a discipline-specific research data archive in a domain where data sharing is met with considerable resistance. We will focus on the challenges for data sharing and data reuse in psychology, e.g. the large amount of domain-specific knowledge necessary for data curation; the high costs of documenting the data because of a wide range of non-standardized measures; small teams and little established infrastructure compared with the "big data" disciplines; studies in psychology not being designed for reuse (in contrast to the social sciences); data protection; and resistance to sharing data. At the end of the presentation, we will provide a brief outlook on DataWiz, a new project funded by the German Research Foundation (DFG). In this project, tools will be developed to support researchers in documenting their data during the research phase.

DataStaR: Using the Semantic Web approach for Data Curation

International Journal of Digital Curation ◽

10.2218/ijdc.v6i2.197 ◽

2011 ◽

Vol 6 (2) ◽

pp. 209-221 ◽

Cited By ~ 6

Author(s):

Huda Khan ◽

Brian Caruso ◽

Jon Corson-Rikert ◽

Dianne Dietrich ◽

Brian Lowe ◽

...

Keyword(s):

Social Sciences ◽

Semantic Web ◽

Data Sharing ◽

Research Data ◽

Data Curation ◽

Data Staging ◽

Domain Specific ◽

Digital Infrastructure

In disciplines as varied as medicine, social sciences, and economics, data and their analyses are essential parts of researchers’ contributions to their respective fields. While sharing research data for review and analysis presents new opportunities for furthering research, capturing these data in digital forms and providing the digital infrastructure for sharing data and metadata pose several challenges. This paper reviews the motivations behind and design of the Data Staging Repository (DataStaR) platform that targets specific portions of the research data curation lifecycle: data and metadata capture and sharing prior to publication, and publication to permanent archival repositories. The goal of DataStaR is to support both the sharing and publishing of data while at the same time enabling metadata creation without imposing additional overheads for researchers and librarians. Furthermore, DataStaR is intended to provide cross-disciplinary support by being able to integrate different domain-specific metadata schemas according to researchers’ needs. DataStaR’s strategy of a usable interface coupled with metadata flexibility allows for a more scaleable solution for data sharing, publication, and metadata reuse.

Open-access policy and data-sharing practice in UK academia

Journal of Information Science ◽

10.1177/0165551518823174 ◽

2019 ◽

Vol 46 (1) ◽

pp. 41-52 ◽

Cited By ~ 3

Author(s):

Yimei Zhu

Keyword(s):

Open Access ◽

Data Sharing ◽

Secondary Data ◽

Open Science ◽

Research Data ◽

Academic Disciplines ◽

Free Access ◽

Online Publishing ◽

Institutional Repositories ◽

Access To Data

Data sharing can be defined as the release of research data that can be used by others. With the recent open-science movement, there has been a call for free access to data, tools and methods in academia. In recent years, subject-based and institutional repositories and data centres have emerged along with online publishing. Many scientific records, including published articles and data, have been made available via new platforms. In the United Kingdom, most major research funders have a data policy and require researchers to include a ‘data-sharing plan’ when applying for funding. However, there are a number of barriers to the full-scale adoption of data sharing. Those barriers are not only technical, but also psychological and social. A survey was conducted with over 1800 UK-based academics to explore the extent of support for data sharing and the characteristics and factors associated with data-sharing practice. It found that while most academics recognised the importance of sharing research data, most of them had never shared or reused research data. The extent of data sharing differed by gender, academic discipline, age and seniority. It also found that awareness of Research Councils UK’s (RCUK) Open-Access (OA) policy, experience of Gold and Green OA publishing, attitudes towards the importance of data sharing and experience of using secondary data were associated with the practice of data sharing. A small group of researchers used social media such as Twitter, blogs and Facebook to promote the research data they had shared online. Our findings contribute to the knowledge and understanding of open science and offer recommendations to academic institutions, journals and funding agencies.

Introduction to the Special JeSLIB Issue on Data Curation in Practice

Journal of eScience Librarianship ◽

10.7191/jeslib.2021.1222 ◽

2021 ◽

Vol 10 (3) ◽

Author(s):

Cynthia Hudson Vitale ◽

Jake R. Carlson ◽

Hannah Hadley ◽

Lisa Johnston

Keyword(s):

Research Integrity ◽

Scientific Communication ◽

Research Data ◽

Data Curation ◽

Research Projects ◽

Academic Institutions ◽

Data Repositories ◽

Communication Processes ◽

Potential Impact

Research data curation is a set of scientific communication processes and activities that support the ethical reuse of research data and uphold research integrity. Data curators act as key collaborators with researchers to enrich the scholarly value and potential impact of their data by preparing it to be shared with others and preserved for the long term. This special issue focuses on practical data curation workflows and tools that have been developed and implemented within data repositories, scholarly societies, research projects, and academic institutions.

Institutional Repositories and Research Data Curation in a Distributed Environment

Library Trends ◽

10.1353/lib.0.0029 ◽

2008 ◽

Vol 57 (2) ◽

pp. 191-201 ◽

Cited By ~ 46

Author(s):

Michael Witt

Keyword(s):

Research Data ◽

Data Curation ◽

Distributed Environment ◽

Institutional Repositories

Why is getting credit for your data so hard?

ITM Web of Conferences ◽

10.1051/itmconf/20203301003 ◽

2020 ◽

Vol 33 ◽

pp. 01003

Author(s):

Wouter Haak ◽

Alberto Zigoni ◽

Helen Kardinaal-de Mooij ◽

Elena Zudilova-Seinstra

Keyword(s):

Data Sharing ◽

Research Data ◽

Data Repository ◽

Institutional Research ◽

Public Research ◽

Data Repositories ◽

Institutional Repositories ◽

Institutional Data ◽

Research Organizations ◽

Existing Data

Institutions, funding bodies, and national research organizations are pushing for more data sharing and FAIR data. Institutions typically implement data policies, frequently supported by an institutional data repository. Funders typically mandate data sharing. So where does this leave the researcher? How can researchers benefit from doing the additional work to share their data? To make sure that researchers and institutions get credit for sharing their data, the data first need to be tracked and attributed. In this paper we investigated where the research data of 11 research institutions ended up, and how these data are currently tracked and attributed. We also analysed the gap between the research data currently held in institutional repositories and the places where researchers actually share their data. We found that 10 out of 11 institutions have most of their public research data hosted outside their own institution. Combined, they have 12% of their institutional research data published in the institutional data repositories. According to our data, the typical institution had 5% of its research data (median) published in the institutional repository, but there were 4 universities for which it was 10% or higher. By combining existing data-to-article graphs with existing article-to-researcher and article-to-institution graphs, it becomes possible to increase tracking of public research data, and therefore the visibility of researchers sharing their data, typically by 17x. The tracking algorithm that was used to perform the analysis and report on potential improvements has subsequently been implemented as a standard method in the Mendeley Data Monitor product. The improvement is most likely an underestimate because, while the recall for datasets in institutional repositories is 100%, that is not the case for datasets published outside the institutions, so there are even more datasets still to be discovered.
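The graph-combination idea in the abstract above can be illustrated with a minimal sketch. It assumes the two graphs are available as simple edge lists; the function name `attribute_datasets` and all identifiers (`ds1`, `a1`, `inst_A`, and so on) are hypothetical and invented for illustration. This is not the authors' tracking algorithm, only one plausible way to join a data-to-article graph with an article-to-institution graph.

```python
from collections import defaultdict

# Hypothetical edge lists; every identifier here is made up for illustration.
dataset_to_article = [("ds1", "a1"), ("ds2", "a1"), ("ds3", "a2"), ("ds4", "a3")]
article_to_institution = [("a1", "inst_A"), ("a2", "inst_A"), ("a2", "inst_B")]


def attribute_datasets(dataset_edges, institution_edges):
    """Attribute each dataset to the institutions of the articles it is linked to."""
    # Index institutions by article so each dataset edge needs one lookup.
    institutions_by_article = defaultdict(set)
    for article, institution in institution_edges:
        institutions_by_article[article].add(institution)

    # Follow dataset -> article -> institution and collect datasets per institution.
    datasets_by_institution = defaultdict(set)
    for dataset, article in dataset_edges:
        for institution in institutions_by_article.get(article, ()):
            datasets_by_institution[institution].add(dataset)
    return dict(datasets_by_institution)


attributed = attribute_datasets(dataset_to_article, article_to_institution)
for institution in sorted(attributed):
    print(institution, sorted(attributed[institution]))
```

In this toy input, `ds4` is linked only to an article with no known institution, so it stays unattributed. That mirrors the abstract's caveat that recall for datasets published outside institutional repositories is below 100%.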

Biodiversidata: A Collaborative Initiative Towards Open Data Availability in Uruguay

Biodiversity Information Science and Standards ◽

10.3897/biss.3.37715 ◽

2019 ◽

Vol 3 ◽

Author(s):

Florencia Grattarola ◽

Daniel Pincheira-Donoso

Keyword(s):

Open Access ◽

Data Management ◽

Data Sharing ◽

Scientific Research ◽

Research Data ◽

Biodiversity Data ◽

Personal Property ◽

Incentive Structures ◽

Management Plans ◽

The Subject

Data-sharing has become a key component in the modern scientific era of large-scale research, with numerous advantages for both data collectors and users. However, data-sharing in Uruguay remains neglected, given that the major public sources of biodiversity information (government and academia) are not open-access. As a consequence, the patterns and drivers of biodiversity in this country remain poorly understood, and so does our ability to manage and conserve its biodiversity. To overcome this critical gap, collaborative strategies are needed to communicate the importance and benefits of data openness, to exchange and provide technical tools and training on all aspects of data management and sharing practices, and to focus on incentives and motivation structures for data-holders. Here, we introduce the Biodiversidata initiative (www.biodiversidata.org), a novel Uruguayan Consortium of Biodiversity Data. Biodiversidata is a collaboration among experts with the aim of improving the country’s biodiversity knowledge and open access to the vast resources they generate. Biodiversidata aims to collate the first comprehensive open-access database on Uruguay's whole biodiversity, to support advancements in scientific research and conservation actions. Currently, Biodiversidata consists of over 30 experts from across national and international institutions, studying diverse biodiversity groups. After less than two years, we have collected, curated and standardised a dataset of ~70,000 records of primary biodiversity data of tetrapod species, the first and most comprehensive open biodiversity database gathered for Uruguay to date.
However, the process is hampered by multiple challenges: the lack of support for sampling of specimens and maintenance of collections has contributed to a situation where data are often perceived as personal property rather than collective resources; institutions have no plans or strategies directed to digitisation of their collections, which places biodiversity data in Uruguay ‘at risk’ of being lost; the scarce governmental and academic incentive structures for open scientific research relegate data-sharing to a personal decision; although scientists individually are willing to share their research data, the lack of data management plans within their research groups hampers the capacity to digitise the data and thus to make them available; and former initiatives aimed at creating comprehensive biodiversity databases did not consider the balance between openness and gain for researchers, making data-sharing more of an obligation than a path to promotion, which negatively affected scientists' willingness to open their data.
To overcome some of these challenges, we decided to direct Biodiversidata to individual researchers/experts and not institutions. We approached them with the plan of collecting as much data as possible on vertebrate, invertebrate and plant species and using it to collaboratively generate impactful scientific research. An important aspect was that we requested only data that qualified as primary biodiversity data (i.e., data records that document the occurrence of a species in space and time). This meant cleaning and standardising very heterogeneous information from a variety of source types and formats, including updating scientific names and georeferencing sampling locations. However, centralising the cleaning process allowed researchers to send their raw records without spending time cleaning them themselves and, as a consequence, enlarged the amount of data being collated.
Collectively, Biodiversidata’s approach towards changing the culture of data-sharing practices has relied on reinforcing a culture of scientific collaboration that benefits not only researchers at the individual level, but also progress on larger-scale issues as a whole. There is still a long way to go on open research data in Uruguay, but aiming strategies at people, capitalising on data management and progressing with step-by-step rewards are already showing some encouraging preliminary results.

Guest editors' notes

IASSIST Quarterly ◽

10.29173/iq1026 ◽

2021 ◽

Vol 45 (3-4) ◽

Author(s):

Winny Nekesa Akullo ◽

Robert Stalone Buwule

Keyword(s):

Data Management ◽

Learning Outcomes ◽

Data Sharing ◽

Sustainable Development Goals ◽

Research Data ◽

Sub Saharan Africa ◽

Institutional Repositories ◽

Attitudes And Behaviors ◽

Development Goals ◽

Sub Saharan

This special issue has nine papers selected from the Africa Regional Workshop held at Makerere University (Kampala, Uganda) from January 11th to 13th 2021. The first two papers relate to Research Data Management (RDM). The first analyses the authorship, volume, visibility, and quality of publications on RDM in Sub-Saharan Africa. The analysis used bibliometrics, focusing on RDM publications from, and on, Sub-Saharan Africa that are currently indexed in Google Scholar. The second article presents the open RDM resources available to different data practitioners, particularly researchers and librarians at the University of Dodoma in Tanzania. The RDM resources discussed in this paper include a Data Management Plan (DMP) and a data repository where researchers can freely archive and share their research data with the local and international communities. The third paper highlights the data-sharing attitudes and behaviors of African data curators and data management experts. The paper draws on data from an earlier study and analyses the new findings on how data-sharing attitudes and behaviors differ between Africans and non-Africans. The fourth paper articulates the data literacy integration agenda and how it can catalyze the achievement of the Sustainable Development Goals (SDGs). The paper unpacks the role of data literacy in achieving the SDGs, describes the challenges faced, and recommends ways to address them. It is, however, sad to note here that the author of this paper passed away on 15th December 2021. May the good lord accord Gorreti an eternal rest. The fifth paper discusses the establishment of a data center at Mzuzu University Library in Malawi after the unfortunate fire outbreak of 2015 that destroyed the whole library.
The paper draws out interesting models, such as the six-month process of restoring an interim library and the design and construction of the new library in collaboration with the Virginia Technological School of Architecture & Design in the United States. The sixth paper goes further to examine the growth and development of institutional repositories in the East African countries of Kenya, Tanzania and Uganda. The paper contextualizes and discusses in detail the drivers of and barriers to the development of institutional repositories in East Africa, such as policy formulation, financial support, training, infrastructure, and open access awareness, among others. The seventh paper focuses on learning outcomes in literacy and numeracy in Uganda in the light of maternal education. In this paper, a deeper analysis of the Uwezo assessment data shows the effect of mothers’ education on numeracy and literacy learning outcomes among children in Uganda. The eighth paper illuminates the opportunities and risks of sharing agricultural research data in Tanzania. It develops and discusses stimulating themes on the sharing of research data, such as research collaboration, transparency, accuracy, funding, policy, and institutional and government support, among others. Finally, the ninth paper narrates the data dissemination process at the Uganda Bureau of Statistics (UBOS). The paper presents in detail the methods and channels of data sharing, such as workshops, websites, libraries, resource centres, social media, and the physical delivery of print resources to UBOS partners and clients. Winny Nekesa Akullo and Robert Stalone Buwule

Preparing a Data Archive or Repository for Changing Research Data and Materials Retention Policies

Journal of eScience Librarianship ◽

10.7191/jeslib.2021.1216 ◽

2021 ◽

Vol 10 (4) ◽

Author(s):

Jonathan Bohan ◽

Lynda Kellam

Keyword(s):

Social Sciences ◽

Research Data ◽

Data Archive ◽

Academic Institutions ◽

Traditional Role ◽

Institutional Repositories ◽

Retention Policies ◽

And Training

Archival expectations and requirements for researchers’ data and code are changing rapidly, both among publishers and institutions, in response to what has been referred to as a “reproducibility crisis.” In an effort to address this crisis, a number of publishers have added requirements or recommendations to increase the availability of supporting information behind the research, and academic institutions have followed. Librarians should focus on ways to make it easier for researchers to effectively share their data and code with reproducibility in mind. At the Cornell Center for Social Sciences, we have instituted a Results Reproduction Service (R-Squared) for Cornell researchers. Part of this service includes archiving the R-Squared package in our CoreTrustSeal certified Data and Reproduction Archive, which has been rebuilt to accommodate both the unique requirements of those packages and the traditional role of our data archive. Librarians need to consider roles that archives and institutional repositories can play in supporting researchers with reproducibility initiatives. Our commentary closes with some suggestions for more information and training.
