Emerging needs and existing links in distributed ocean science research data sets

Author(s):  
F. Rack ◽  
P. Milne ◽  
P. Fippinger ◽  
R. Jahnke

2017 ◽
Author(s):  
Federica Rosetta

Within the Open Science discussions, the current call for “reproducibility” stems from the rising awareness that results presented in research papers are not as easily reproducible as expected, and that some reproduction efforts have even contradicted the original results. In this context, transparency and openness are seen as key components of good scientific practice and of scientific discovery itself. As a result, many funding agencies now require the deposit of research data sets, institutions are improving training in the application of statistical methods, and journals are beginning to mandate a high level of detail on the methods and materials used. How can researchers be supported and encouraged to provide that level of transparency? An important component is the underlying research data, which is currently often only partly available within the article. At Elsevier we have therefore been working on journal data guidelines which clearly explain to researchers when and how they are expected to make their research data available. Simultaneously, we have developed the corresponding infrastructure to make it as easy as possible for researchers to share their data in a way that is appropriate to their field. To ensure researchers get credit for the work they do in managing and sharing data, all our journals support data citation in line with the FORCE11 data citation principles – a key step toward addressing the lack of credit and incentives identified in the Open Data analysis (Open Data – the Researcher Perspective, https://www.elsevier.com/about/open-science/research-data/open-data-report) recently carried out by Elsevier together with CWTS. Finally, the presentation will also touch upon a number of initiatives to ensure the reproducibility of software, protocols and methods.
With STAR methods, for instance, methods are submitted in a Structured, Transparent, Accessible Reporting format; this approach promotes rigor and robustness, and makes reporting easier for the author and replication easier for the reader.


2021 ◽  
pp. 016555152199863
Author(s):  
Ismael Vázquez ◽  
María Novo-Lourés ◽  
Reyes Pavón ◽  
Rosalía Laza ◽  
José Ramón Méndez ◽  
...  

Current research has evolved in such a way that scientists must not only adequately describe the algorithms they introduce and the results of their application, but also ensure that those results can be reproduced and compared with results obtained through other approaches. In this context, public data sets (sometimes shared through repositories) are among the most important elements for the development of experimental protocols and test benches. This study analysed a significant number of CS/ML (Computer Science/Machine Learning) research data repositories and data sets and detected some limitations that hamper their utility. In particular, we identify and discuss the following in-demand functionalities for repositories: (1) building customised data sets for specific research tasks, (2) facilitating the comparison of different techniques using dissimilar pre-processing methods, (3) ensuring the availability of software applications to reproduce the pre-processing steps without using the repository functionalities and (4) providing protection mechanisms for licensing issues and user rights. To demonstrate the proposed functionality, we created the STRep (Spam Text Repository) web application, which implements our recommendations adapted to the field of spam text repositories. In addition, we launched an instance of STRep at https://rdata.4spam.group to facilitate understanding of this study.
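Functionality (3) above can be pictured as exporting, alongside a data set, an ordered record of its pre-processing steps so they can be replayed without the repository. A minimal sketch of the idea, with step names and the sample text being purely illustrative (not taken from STRep):

```python
# Illustrative pre-processing steps of the kind a spam-text corpus might record.
def lowercase(text):
    return text.lower()

def strip_punctuation(text):
    # Keep only alphanumeric characters and whitespace.
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())

def collapse_whitespace(text):
    return " ".join(text.split())

# An ordered pipeline description that could be exported with a data set,
# so any user can replay the exact pre-processing outside the repository.
PIPELINE = [lowercase, strip_punctuation, collapse_whitespace]

def preprocess(text, steps=PIPELINE):
    """Replay the recorded pre-processing steps in order."""
    for step in steps:
        text = step(text)
    return text

print(preprocess("  FREE!!! Click   HERE now. "))  # → free click here now
```

Publishing the pipeline as code (rather than prose) is what makes comparisons between techniques that use dissimilar pre-processing methods, point (2), tractable.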


2013 ◽  
Vol 11 (3) ◽  
pp. 157-157
Author(s):  
L. McFarland ◽  
J. Richter ◽  
C. Bredfeldt

1994 ◽  
Vol 14 (3) ◽  
pp. 144-156
Author(s):  
Marcel P. J. M. Dijkers ◽  
Cynthia L. Creighton

Errors in processing data prior to analysis can cause significant distortion of research findings. General principles and specific techniques for cleaning data sets are presented. Strategies are suggested for preventing errors in transcribing, coding, and keying research data.
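Checks of the kind the paper recommends (range validation to catch keying errors, duplicate detection to catch double entry) can be sketched as follows; the field names and valid ranges here are hypothetical, not taken from the paper:

```python
def find_errors(records, valid_ranges):
    """Return out-of-range entries as (row_index, field, value),
    plus the indices of exact duplicate rows."""
    out_of_range = []
    for i, row in enumerate(records):
        for field, (lo, hi) in valid_ranges.items():
            value = row.get(field)
            if value is None or not (lo <= value <= hi):
                out_of_range.append((i, field, value))
    seen, duplicates = {}, []
    for i, row in enumerate(records):
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates.append(i)  # exact repeat of an earlier row
        else:
            seen[key] = i
    return out_of_range, duplicates

records = [
    {"age": 34, "score": 88},
    {"age": 340, "score": 88},   # keying error: extra digit
    {"age": 34, "score": 88},    # duplicate of row 0
]
errors, dups = find_errors(records, {"age": (0, 120), "score": (0, 100)})
print(errors)  # → [(1, 'age', 340)]
print(dups)    # → [2]
```

Running such checks before analysis, rather than after, is the cheapest way to prevent the distortions the paper describes.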


2021 ◽  

Abstract The correct design, analysis and interpretation of plant science experiments is imperative for continued improvements in agricultural production worldwide. The enormous number of design and analysis options available for correctly implementing, analyzing and interpreting research can be overwhelming. Statistical Analysis System (SAS®) is the most widely used statistical software in the world, and SAS® OnDemand for Academics is now freely available to academic institutions. This is a user-friendly guide to statistics using SAS® OnDemand for Academics, ideal for facilitating the design and analysis of plant science experiments. It presents the most frequently used statistical methods in an easy-to-follow and non-intimidating fashion, and teaches the appropriate use of SAS® within the context of plant science research. This book contains 21 chapters that cover experimental designs and data analysis protocols; it is presented as a how-to guide with many examples, includes freely downloadable data sets, and examines key topics such as ANOVA, mean separation, non-parametric analysis and linear regression.


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 193 ◽  
Author(s):  
Muriel Swijghuisen Reigersberg

This paper explores emerging practices in research data management in the arts, humanities and social sciences (AHSS). It will do so vis-à-vis current citation conventions and impact measurement for research in AHSS. Case study findings on research data inventoried at Goldsmiths, University of London will be presented. Goldsmiths is a UK research-intensive higher education institution which specialises in arts, humanities and social science research. The paper’s aim is to raise awareness of the subject-specific needs of AHSS scholars to help inform the design of future digital tools for impact analysis in AHSS. Firstly, I shall explore the definition of research data and how it is currently understood by AHSS researchers. I will show why many researchers choose not to engage with digital dissemination techniques and ORCID. This discussion must necessarily include the idea that practice-based and applied AHSS research are processes which are not easily captured in numerical ‘sets’ and cannot be labelled electronically without careful consideration of what a group or data item ‘represents’ as part of the academic enquiry, and therefore of how it should be cited and analysed in any impact assessment. The paper will then explore the role of the monograph and arts catalogue in AHSS scholarship; how citation practices and digital impact measurement in AHSS currently operate in relation to authorship; and how digital identifiers may hypothetically affect metrics, intellectual property (IP), copyright and research integrity issues in AHSS. I will also show that, if we are to be truly interdisciplinary, as research funders and strategic thinkers say we should, it is necessary to revise the way we think about digital research dissemination. This will involve breaking down the boundaries between AHSS and other types of research.


1986 ◽  
Vol 59 (2) ◽  
pp. 751-760
Author(s):  
Todd McLin Davis

A problem often not detected in the interpretation of survey research is the potential interaction between subgroups within the sample and aspects of the survey. Potentially interesting interactions are commonly obscured when data are analyzed using descriptive and univariate statistical procedures. This paper suggests the use of cluster analysis as a tool for interpretation of data, particularly when such data take the form of coded categories. An example of the analysis of two data sets with known properties, one random and the other contrived, is presented to illustrate the application of cluster procedures to survey research data.
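The core idea, grouping respondents by the similarity of their coded answers so that subgroup structure becomes visible, can be sketched in a few lines. The answer codes and the greedy single-linkage rule below are illustrative only, not the paper's procedure:

```python
def matching_distance(a, b):
    """Fraction of survey items on which two respondents' codes disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def cluster(rows, threshold):
    """Greedy single-linkage grouping: attach a respondent to the first
    cluster containing a member within `threshold`, else start a new one."""
    clusters = []
    for row in rows:
        for members in clusters:
            if any(matching_distance(row, m) <= threshold for m in members):
                members.append(row)
                break
        else:
            clusters.append([row])
    return clusters

# Coded answers to four items; two latent subgroups by construction.
responses = [
    (1, 2, 1, 3), (1, 2, 1, 2), (1, 2, 2, 3),   # subgroup A
    (3, 1, 3, 1), (3, 1, 3, 2),                 # subgroup B
]
groups = cluster(responses, threshold=0.5)
print(len(groups))  # → 2
```

A univariate tabulation of the same five rows would show only marginal code frequencies; the clustering recovers the two subgroups whose interaction with the survey items would otherwise stay hidden.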

