Large scale research data archiving: Training for an inconvenient technology

2019 ◽  
Vol 36 ◽  
pp. 100523
Author(s):  
S. Patrick Calhoun ◽  
David Akin ◽  
Brett Zimmerman ◽  
Henry Neeman
2015 ◽  
Author(s):  
Peter Weiland ◽  
Ina Dehnhard

The benefits of making research data permanently accessible through data archives are widely recognized: costs can be reduced by reusing existing data, research results can be compared and validated against results from archived studies, fraud can be detected more easily, and meta-analyses can be conducted. In addition, authors may gain recognition and reputation for producing the datasets. Since 2003, the accredited research data center PsychData (part of the Leibniz Institute for Psychology Information in Trier, Germany) has documented and archived research data from all areas of psychology and related fields. In the beginning, the main focus was on datasets with high potential for reuse, e.g. longitudinal studies, large-scale cross-sectional studies, or studies conducted under historically unique conditions. Presently, more and more journal publishers and project funding agencies require researchers to archive their data and make them accessible to the scientific community, and PsychData has to serve this need as well. In this presentation we report on our experiences in operating a discipline-specific research data archive in a domain where data sharing is met with considerable resistance. We will focus on the challenges for data sharing and data reuse in psychology, e.g. the large amount of domain-specific knowledge necessary for data curation; high costs for documenting the data because of a wide range of non-standardized measures; small teams and little established infrastructure compared with the "big data" disciplines; studies in psychology not being designed for reuse (in contrast to the social sciences); data protection; and resistance to sharing data. At the end of the presentation, we will provide a brief outlook on DataWiz, a new project funded by the German Research Foundation (DFG). In this project, tools will be developed to support researchers in documenting their data during the research phase.


2016 ◽  
Vol 112 (7/8) ◽  
Author(s):  
Margaret M. Koopman ◽  
Karin de Jager ◽  

Digital data archiving and research data management have become increasingly important for institutions in South Africa, particularly after the announcement by the National Research Foundation, one of the principal South African academic research funders, recommending these actions for the research that it funds. A case study undertaken during the latter half of 2014 among the biological sciences researchers at a South African university explored the state of data management and archiving at this institution and the readiness of researchers to engage with sharing their digital research data through repositories. It was found that while some researchers were already engaged with digital data archiving in repositories, neither researchers nor the university had implemented systematic research data management.


2021 ◽  
Author(s):  
Anita Bandrowski ◽  
Jeffrey S. Grethe ◽  
Anna Pilko ◽  
Tom Gillespie ◽  
Gabi Pine ◽  
...  

The NIH Common Fund’s Stimulating Peripheral Activity to Relieve Conditions (SPARC) initiative is a large-scale program that seeks to accelerate the development of therapeutic devices that modulate electrical activity in nerves to improve organ function. Integral to the SPARC program are the rich anatomical and functional datasets produced by investigators across the SPARC consortium that provide key details about organ-specific circuitry, including structural and functional connectivity, mapping of cell types and molecular profiling. These datasets are provided to the research community through an open data platform, the SPARC Portal. To ensure SPARC datasets are Findable, Accessible, Interoperable and Reusable (FAIR), they are all submitted to the SPARC Portal following a standard scheme established by the SPARC Curation Team, called the SPARC Data Structure (SDS). Inspired by the Brain Imaging Data Structure (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators, who come from all fields of biomedical research. Here we present the rationale and design of the SDS, including a description of the SPARC curation process and the automated tools for complying with the SDS, including the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. The objective is to provide detailed guidelines for anyone desiring to comply with the SDS. Since the SDS is suitable for any type of biomedical research data, it can be adopted by any group desiring to follow the FAIR data principles for managing their data, even outside of the SPARC consortium. Finally, this manuscript provides a foundational framework that can be used by any organization desiring either to adapt the SDS to suit the specific needs of their data or simply to design their own FAIR data sharing scheme from scratch.


2003 ◽  
Vol 1836 (1) ◽  
pp. 111-117
Author(s):  
Taek M. Kwon ◽  
Nirish Dhruv ◽  
Siddharth A. Patwardhan ◽  
Eil Kwon

Intelligent transportation system (ITS) sensor networks, such as road weather information and traffic sensor networks, typically generate enormous amounts of data. As a result, archiving, retrieval, and exchange of ITS sensor data for planning and performance analysis are becoming increasingly difficult. An efficient ITS archiving system that is compact and exchangeable and allows efficient and fast retrieval of large amounts of data is essential. A proposal is made for a system that can meet the present and future archiving needs of large-scale ITS data. This system is referred to as Common Data Format (CDF) and was developed by the National Space Science Data Center for archiving, exchange, and management of large-scale scientific array data. CDF is an open system that is free and portable and includes self-describing data abstraction. Archiving traffic data by using CDF is demonstrated, and its archival and retrieval performance is presented for the Minnesota Department of Transportation's 30-s traffic data collected from about 4,000 loop detectors around Twin Cities freeways. For comparison of the archiving performance, the same data were archived by using a commercially available relational database, which was evaluated for its archival and retrieval performance. This result is presented, along with reasons that CDF is a good fit for large-scale ITS data archiving, retrieval, and exchange.


2011 ◽  
Vol 7 (S285) ◽  
pp. 340-341
Author(s):  
Dayton L. Jones ◽  
Kiri Wagstaff ◽  
David Thompson ◽  
Larry D'Addario ◽  
Robert Navarro ◽  
...  

The detection of fast (< 1 second) transient signals requires a challenging balance between the need to examine vast quantities of high time-resolution data and the impracticality of storing all the data for later analysis. This is the epitome of a “big data” issue: far more data will be produced by next-generation astronomy facilities than can be analyzed, distributed, or archived using traditional methods. JPL is developing technologies to deal with “big data” problems from initial data generation through real-time data triage algorithms to large-scale data archiving and mining. Although most current work is focused on the needs of large radio arrays, the technologies involved are widely applicable in other areas.


Publications ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 25
Author(s):  
Brian Jackson

Journal publishers play an important role in the open research data ecosystem. Through open data policies that include public data archiving mandates and data availability statements, journal publishers help promote transparency in research and wider access to a growing scholarly record. The library and information science (LIS) discipline has a unique relationship with both open data initiatives and academic publishing and may be well-positioned to adopt rigorous open data policies. This study examines the information provided on public-facing websites of LIS journals in order to describe the extent, and nature, of open data guidance provided to prospective authors. Open access journals in the discipline have disproportionately adopted detailed, strict open data policies. Commercial publishers, which account for the largest share of publishing in the discipline, have largely adopted weaker policies. Rigorous policies, adopted by a minority of journals, describe the rationale, application, and expectations for open research data, while most journals that provide guidance on the matter use hesitant and vague language. Recommendations are provided for strengthening journal open data policies.


2013 ◽  
Vol 27 (4) ◽  
pp. 1304-1308 ◽  
Author(s):  
Timothy H. Vines ◽  
Rose L. Andrew ◽  
Dan G. Bock ◽  
Michelle T. Franklin ◽  
Kimberly J. Gilbert ◽  
...  
