Social science data repositories in data deluge

Purpose: Owing to the recent surge of interest in the age of the data deluge, the importance of researching data infrastructures is increasing. The open archival information system (OAIS) model has been widely adopted as a framework for creating and maintaining digital repositories. Considering that OAIS is a reference model that requires customization for actual practice, this paper aims to examine how the current practices in a data repository map to the OAIS environment and functional components. Design/methodology/approach: The authors conducted two focus-group sessions and one individual interview with a total of eight employees at the world’s largest social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). By examining their current actions (activities regarding their work responsibilities) and IT practices, they studied the barriers and challenges of archiving and curating qualitative data at ICPSR. Findings: The authors observed that the OAIS model is robust and reliable in actual service processes for data curation and data archives. In addition, a data repository’s workflow resembles that of digital archives or even digital libraries. On the other hand, they found that the cost of preventing disclosure risk and a lack of agreement on the standards of text data files are the most apparent obstacles for data curation professionals handling qualitative data; the maturation of data metrics seems to be a promising solution to several challenges in social science data sharing. Originality/value: The authors evaluated the gap between a research data repository’s current practices and the adoption of the OAIS model. They also identified answers to questions such as how current technological infrastructure in a leading data repository such as ICPSR supports its daily operations, what the ideal technologies in those data repositories would be and the associated challenges that accompany these ideal technologies. Most importantly, they helped to prioritize challenges and barriers from the data curator’s perspective and contributed implications for data sharing and reuse in social sciences.

Sharing qualitative research data, improving data literacy and establishing national data services

IASSIST Quarterly ◽ 10.29173/iq972 ◽ 2020 ◽ Vol 43 (4) ◽ pp. 1-2

Author(s): Karsten Boye Rasmussen

Keyword(s): Social Science ◽ Data Sharing ◽ Quantitative Data ◽ Qualitative Data ◽ Research Data ◽ National Data ◽ Science Data ◽ Data Infrastructure ◽ Data Literacy ◽ Social Science Data

Welcome to the fourth issue of volume 43 of the IASSIST Quarterly (IQ 43:4, 2019).

The first article is authored by Jessica Mozersky, Heidi Walsh, Meredith Parsons, Tristan McIntosh, Kari Baldwin, and James M. DuBois, all located at the Bioethics Research Center, Washington University School of Medicine, St. Louis, Missouri, USA. They ask the question “Are we ready to share qualitative research data?”, with the subtitle “Knowledge and preparedness among qualitative researchers, IRB Members, and data repository curators.” The subtitle indicates that their research includes a survey of key personnel related to scientific data sharing. The findings were obtained through semi-structured in-depth interviews with 30 data repository curators, 30 qualitative researchers, and 30 IRB staff members in the USA. IRB stands for Institutional Review Board, which in other countries might be called a research ethics committee or similar. There is generally an increasing trend towards data sharing and open science, but qualitative data are rarely shared. The dilemma behind this reluctance to share is exemplified by health data, where qualitative methods explore sensitive topics. The sensitivity leads to protection of confidentiality, which hinders keeping sufficient contextual detail for secondary analyses. You could add that protection of confidentiality is a much bigger task in qualitative data, where sensitive information can be hidden in every corner of the data, which consequently must be combed through in detail, while with quantitative data most decisions concerning confidentiality can be made at the level of variables. The reporting in the article gives insights into the differences between the three stakeholder groups. An often-found answer among researchers is that data sharing is associated with quantitative data, while IRB members have little practice with qualitative data. Among curators, about half had curated qualitative data, but many only worked with quantitative data. In general, qualitative data sharing lacks guidance and standards.

The second article also raises a question: “How many ways can we teach data literacy?” We are now in Asia with a connection to the USA. The author Yun Dai is working at the Library of New York University Shanghai, where they have explored many ways to teach data literacy to undergraduate students. These initiatives, described in the article, ranged from workshops and in-class instruction, which tempted students by offering up-to-date technology, through online casebooks of topics in the data lifecycle, to event series with appealing names like “Lying with Data.” The event series had a marketing mascot - a “Lying with Data” Pinocchio - and sessions on being fooled by advertisements and getting the truth out of opinion surveys. Data literacy has a resemblance to information literacy and in that perspective, data literacy is defined as “critical thinking applied to evaluating data sources and formats, and interpreting and communicating findings,” while statistical literacy is “the ability to evaluate statistical information as evidence.” The article presents the approaches and does not conclude on the question, “How many?” No readers will be surprised by the missing answer, and I am certain readers will enjoy the ideas of the article and the marketing focus.

With the last article “Examining barriers for establishing a national data service,” the author Janez Štebe takes us to Europe. Janez Štebe is head of the social science data archives (Arhiv Družboslovnih Podatkov) at the University of Ljubljana, Slovenia. The Consortium of European Social Science Data Archives (CESSDA) is a distributed European social science data infrastructure for access to research data. CESSDA has many - but not all - European countries as members. The focus is on the situation in 20 non-CESSDA member European countries, with emerging and immature data archive services being developed through such projects as the CESSDA Strengthening and Widening (SaW 2016 and 2017) and CESSDA Widening Activities (WA 2018). By identifying and comparing gaps and differences, a group of countries at a similar level may consider following similar best practice examples to achieve a more mature and supportive open scientific data ecosystem. Like the earlier articles, this article provides good references to earlier literature and descriptions of previous studies in the area. In this project 22 countries were selected, all CESSDA non-members, and interviewees among social science researchers and data librarians were contacted with an e-mail template between October 2018 and January 2019. The article presents results and a discussion of the national data sharing culture and data infrastructure. Yes, there is a lack of money! However, it is the process of gradually establishing a robust data infrastructure that is believed to impact the growth of a data sharing culture and improve the excellence and the efficiency of research in general.

Submissions of papers for the IASSIST Quarterly are always very welcome. We welcome input from IASSIST conferences or other conferences and workshops, from local presentations or papers especially written for the IQ. When you are preparing such a presentation, give a thought to turning your one-time presentation into a lasting contribution. Doing that after the event also gives you the opportunity of improving your work after feedback. We encourage you to log in or create an author login at https://www.iassistquarterly.com (our Open Journal System application). We permit authors to “deep link” into the IQ as well as to deposit the paper in your local repository. Chairing a conference session with the purpose of aggregating and integrating papers for a special issue of the IQ is also much appreciated, as the information reaches many more people than the limited number of session participants and will be readily available on the IASSIST Quarterly website at https://www.iassistquarterly.com. Authors are very welcome to take a look at the instructions and layout: https://www.iassistquarterly.com/index.php/iassist/about/submissions. Authors can also contact me directly via e-mail: [email protected]. Should you be interested in compiling a special issue for the IQ as guest editor(s), I will also be delighted to hear from you.

Karsten Boye Rasmussen - December 2019

Internet researchers’ data sharing behaviors

Online Information Review ◽ 10.1108/oir-10-2016-0313 ◽ 2018 ◽ Vol 42 (1) ◽ pp. 124-142 ◽ Cited By ~ 8

Author(s): Youngseek Kim ◽ Seungahn Nah

Keyword(s): Social Norms ◽ Data Sharing ◽ Positive Impact ◽ Scientific Data ◽ Data Reuse ◽ Equation Modeling ◽ Data Repository ◽ Data Repositories ◽ Content Type ◽ And Behaviors

Purpose: The purpose of this paper is to examine how data reuse experience, attitudinal beliefs, social norms, and resource factors influence internet researchers to share data with other researchers outside their teams. Design/methodology/approach: An online survey was conducted to examine the extent to which data reuse experience, attitudinal beliefs, social norms, and resource factors predicted internet researchers’ data sharing intentions and behaviors. The theorized model was tested using a structural equation modeling technique to analyze a total of 201 survey responses from the Association of Internet Researchers mailing list. Findings: Results show that data reuse experience significantly influenced participants’ perception of benefit from data sharing and participants’ norm of data sharing. Belief structures regarding data sharing, including perceived career benefit and risk, and perceived effort, had significant associations with attitude toward data sharing, leading internet researchers to have greater data sharing intentions and behavior. The results also reveal that researchers’ norms for data sharing had a direct effect on data sharing intention. Furthermore, the results indicate that, while the perceived availability of data repositories did not yield a positive impact on data sharing intention, it had a significant, direct, positive impact on researchers’ data sharing behaviors. Research limitations/implications: This study validated its novel theorized model based on the theory of planned behavior (TPB). The study showed a holistic picture of how different data sharing factors, including data reuse experience, attitudinal beliefs, social norms, and data repositories, influence internet researchers’ data sharing intentions and behaviors. Practical implications: Data reuse experience, attitude toward and norm of data sharing, and the availability of data repositories had either direct or indirect influence on internet researchers’ data sharing behaviors. Thus, professional associations, funding agencies, and academic institutions alike should promote academic cultures that value data sharing in order to create a virtuous cycle of reciprocity and encourage researchers to have positive attitudes toward, and norms of, data sharing; these cultures should be strengthened by the strong support of data repositories. Originality/value: In line with prior scholarship concerning scientific data sharing, this study of internet researchers offers a map of scientific data sharing intentions and behaviors by examining the impacts of data reuse experience, attitudinal beliefs, social norms, and data repositories together.

Are data repositories fettered? A survey of current practices, challenges and future technologies

Online Information Review ◽ 10.1108/oir-04-2021-0204 ◽ 2021 ◽ Vol ahead-of-print (ahead-of-print)

Author(s): Nushrat Khan ◽ Mike Thelwall ◽ Kayvan Kousha

Keyword(s): Peer Review ◽ Online Survey ◽ Secondary Data ◽ Perceived Usefulness ◽ Data Reuse ◽ Data Repository ◽ Data Repositories ◽ Content Type ◽ Quality Checks ◽ Current Practices

Purpose: The purpose of this study is to explore current practices, challenges and technological needs of different data repositories. Design/methodology/approach: An online survey was designed for data repository managers, and contact information from re3data, a data repository registry, was collected to disseminate the survey. Findings: In total, 189 responses were received, including 47% discipline-specific and 34% institutional data repositories. A total of 71% of the repositories reporting their software used bespoke technical frameworks, with DSpace, EPrints and Dataverse being commonly used by institutional repositories. Of repository managers, 32% reported tracking secondary data reuse while 50% would like to. Among data reuse metrics, citation counts were considered extremely important by the majority, followed by links to the data from other websites and download counts. Despite the perceived usefulness of dataset citations, repository managers struggle to track them. Most repository managers support dataset and metadata quality checks via librarians, subject specialists or information professionals. A lack of engagement from users and a lack of human resources are the top two challenges, and outreach is the most common motivator mentioned by repositories across all groups. Ensuring findable, accessible, interoperable and reusable (FAIR) data (49%), providing user support for research (36%) and developing best practices (29%) are the top three priorities for repository managers. The main recommendations for future repository systems are as follows: integration and interoperability between data and systems (30%), better research data management (RDM) tools (19%), tools that allow computation without downloading datasets (16%) and automated systems (16%). Originality/value: This study identifies the current challenges and needs for improving data repository functionalities and user experiences. Peer review: The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-04-2021-0204

For Want of a Nail: Three Tropes in Data Curation

Preservation Digital Technology & Culture ◽ 10.1515/pdtc-2015-0019 ◽ 2015 ◽ Vol 44 (4) ◽ pp. 161-170

Author(s): Kalpana Shankar

Keyword(s): Public Sector ◽ Social Science ◽ Digital Preservation ◽ Data Curation ◽ Professional Values ◽ Shared Meaning ◽ Science Data ◽ Professional Literature ◽ Social Science Data

Abstract: This article explores the role of three key tropes in the data curation profession. Using interviews with digital preservation experts, researchers, public sector statisticians, and social science data archivists as well as popular and professional literature and media, this article discusses how tropes and narratives are used to create shared meaning among data curation stakeholders. The article explores how tropes of abundance/overload, openness, and trust are created and used and concludes with reflections on how such stories articulate professional values and concerns. The article advocates for further attention to the use of narratives and stories as the data curation profession develops.

Researcher attitudes toward data sharing in public data repositories: a meta-evaluation of studies on researcher data sharing

Journal of Documentation ◽ 10.1108/jd-01-2021-0015 ◽ 2021 ◽ Vol ahead-of-print (ahead-of-print)

Author(s): Jennifer L. Thoegersen ◽ Pia Borlund

Keyword(s): Data Sharing ◽ Information Science ◽ Research Literature ◽ Data Repository ◽ Library And Information Science ◽ Data Repositories ◽ Content Type ◽ Address Data ◽ Public Data ◽ Term Data

Purpose: The purpose of this paper is to report a study of how research literature addresses researchers' attitudes toward data repository use. In particular, the authors are interested in how the term data sharing is defined, how data repository use is reported and whether there is a need for greater clarity and specificity of terminology. Design/methodology/approach: To study how the literature addresses researcher data repository use, relevant studies were identified by searching Library Information Science and Technology Abstracts, Library and Information Science Source, Thomson Reuters' Web of Science Core Collection and Scopus. A total of 62 studies were identified for inclusion in this meta-evaluation. Findings: The study shows a need for greater clarity and consistency in the use of the term data sharing in future studies to better understand the phenomenon and allow for cross-study comparisons. Furthermore, most studies did not address data repository use specifically. In most analyzed studies, it was not possible to separate results relating to sharing via public data repositories from other types of sharing. When sharing in public repositories was mentioned, the prevalence of repository use varied significantly. Originality/value: Researchers' data sharing is of great interest to library and information science research and practice to inform academic libraries that are implementing data services to support these researchers. This study explores how the literature approaches this issue, especially data repository use, which is strongly encouraged. This paper identifies the potential for additional study focused on this area.

A guide for selecting statistical techniques for analyzing social science data

Social Science Information Studies ◽ 10.1016/0143-6236(83)90010-8 ◽ 1983 ◽ Vol 3 (1) ◽ pp. 64-65

Author(s): T.D. Wilson

Keyword(s): Social Science ◽ Statistical Techniques ◽ Science Data ◽ Social Science Data

Social Science Data and the Courts

Educational Researcher ◽ 10.3102/0013189x005005011 ◽ 1976 ◽ Vol 5 (5) ◽ pp. 11-13

Author(s): Patricia E. Stivers

Keyword(s): Social Science ◽ Science Data ◽ Social Science Data

Uncomfortable Decisions

10.31234/osf.io/tmsw9 ◽ 2022

Author(s): Paul Bloom ◽ Laurie Paul

Keyword(s): Decision Making ◽ Social Science ◽ Career Change ◽ Social Science Research ◽ Science Research ◽ Science Data ◽ Decision Making Processes ◽ Social Science Data ◽ Personal Decision ◽ Algorithmic Process

Some decision-making processes are uncomfortable. Many of us do not like to make significant decisions, such as whether to have a child, solely based on social science research. We do not like to choose randomly, even in cases where flipping a coin is plainly the wisest choice. We are often reluctant to defer to another person, even if we believe that the other person is wiser, and have similar reservations about appealing to powerful algorithms. And, while we are comfortable with considering and weighing different options, there is something strange about deciding solely through a purely algorithmic process, even one that takes place in our own heads.

What is the source of our discomfort? We do not present a decisive theory here (indeed, the authors have clashing views over some of these issues), but we lay out the arguments for two (consistent) explanations. The first is that such impersonal decision-making processes are felt to be a threat to our autonomy. In all of the examples above, it is not you who is making the decision; it is someone or something else. This is to be contrasted with personal decision-making, where, to put it colloquially, you “own” your decision, though of course you may be informed by social science data, recommendations of others, and so on. A second possibility is that such impersonal decision-making processes are not seen as authentic, where authentic decision making is one in which you intentionally and knowledgeably choose an option in a way that is “true to yourself.” Such decision making can be particularly important in contexts where one is making a life-changing decision of great import, such as the choice to emigrate, start a family, or embark on a major career change.
