Padrões de metadados para representação e organização da informação em repositórios de dados de pesquisa (Metadata standards for the representation and organization of information in research data repositories)

2019 ◽  
Vol 5 (1) ◽  
pp. 37-51
Author(s):  
Fernanda Alves Sanchez ◽  
Nathália Britto Pinheiro Da Silva ◽  
Fernando Luiz Vechiato

Metadata standards make it possible to describe research data and to capture information about their provenance. This study aimed to identify the metadata standards most widely used worldwide to represent research data. The documentary, exploratory research, with a qualitative approach, used the Registry of Research Data Repositories (Re3data) directory as its methodological instrument, selecting the three metadata standards most used by research data repositories: Dublin Core (DC), Data Documentation Initiative (DDI), and ISO 19115 - Geographic information - Metadata. The directory also supported the selection of three repositories that use these metadata standards. The study found that metadata standards represent data and information in a way that helps establish the veracity of information about a given research dataset and enables its description, so that the data and information stored in research data repositories take a form that enhances use, reuse, and sharing.

2017 ◽  
Vol 12 (1) ◽  
pp. 88-105 ◽  
Author(s):  
Sünje Dallmeier-Tiessen ◽  
Varsha Khodiyar ◽  
Fiona Murphy ◽  
Amy Nurnberger ◽  
Lisa Raymond ◽  
...  

The data curation community has long encouraged researchers to document collected research data during active stages of the research workflow, to provide robust metadata earlier, and support research data publication and preservation. Data documentation with robust metadata is one of a number of steps in effective data publication. Data publication is the process of making digital research objects ‘FAIR’, i.e. findable, accessible, interoperable, and reusable; attributes increasingly expected by research communities, funders and society. Research data publishing workflows are the means to that end. Currently, however, much published research data remains inconsistently and inadequately documented by researchers. Documentation of data closer in time to data collection would help mitigate the high cost that repositories associate with the ingest process. More effective data publication and sharing should in principle result from early interactions between researchers and their selected data repository. This paper describes a short study undertaken by members of the Research Data Alliance (RDA) and World Data System (WDS) working group on Publishing Data Workflows. We present a collection of recent examples of data publication workflows that connect data repositories and publishing platforms with research activity ‘upstream’ of the ingest process. We re-articulate previous recommendations of the working group, to account for the varied upstream service components and platforms that support the flow of contextual and provenance information downstream. These workflows should be open and loosely coupled to support interoperability, including with preservation and publication environments. Our recommendations aim to stimulate further work on researchers’ views of data publishing and the extent to which available services and infrastructure facilitate the publication of FAIR data. 
We also aim to stimulate further dialogue about, and definition of, the roles and responsibilities of research data services and platform providers for the ‘FAIRness’ of research data publication workflows themselves.


2021 ◽  
Vol 16 (3) ◽  
pp. 2-17
Author(s):  
Shawn W. Nicholson ◽  
Terrence B. Bennett

Objective – This study uses quantitative methods to determine if the metadata requirements of institutional repositories (IRs) promote data discovery. This question is addressed through an exploration of an international sample of university IRs, including an analysis of the required metadata elements for data deposit, with a particular focus on how these metadata support discovery of research data objects. Methods – The researchers worked with an international universe of 243 IRs. A codebook of 10 variables was developed to enable analysis of the eventual randomly derived sample of 40 institutions. Results – The analysis of our sample IRs revealed that most had metadata standards that offered weak support for data discovery—an unsurprising revelation in view of the fact that university IRs are meant to accommodate deposit and storage of all types of scholarly outputs, only a small percentage of which are research data objects. Most IRs seem to have adopted metadata standards based on the Dublin Core schema, while none of the IRs in our sample used the Data Documentation Initiative metadata that is better suited for deposit and discovery of research datasets. Conclusion – The study demonstrates that while data deposit can be accommodated by the existing metadata requirements of multi-purpose IRs, their metadata practices do little to prioritize data deposit or to promote data discovery. Evidence indicates that data discovery will benefit from additional metadata elements.


2021 ◽  
pp. 016555152199863
Author(s):  
Ismael Vázquez ◽  
María Novo-Lourés ◽  
Reyes Pavón ◽  
Rosalía Laza ◽  
José Ramón Méndez ◽  
...  

Current research has evolved in such a way that scientists must not only adequately describe the algorithms they introduce and the results of their application, but also ensure the possibility of reproducing the results and comparing them with those obtained through other approaches. In this context, public data sets (sometimes shared through repositories) are one of the most important elements for the development of experimental protocols and test benches. This study has analysed a significant number of CS/ML (Computer Science/Machine Learning) research data repositories and data sets and detected some limitations that hamper their utility. In particular, we identify and discuss the following functionalities that are in demand for repositories: (1) building customised data sets for specific research tasks, (2) facilitating the comparison of different techniques using dissimilar pre-processing methods, (3) ensuring the availability of software applications to reproduce the pre-processing steps without using the repository functionalities and (4) providing protection mechanisms for licensing issues and user rights. To demonstrate these functionalities, we created the STRep (Spam Text Repository) web application, which implements our recommendations adapted to the field of spam text repositories. In addition, we launched an instance of STRep at the URL https://rdata.4spam.group to facilitate understanding of this study.


Author(s):  
Johannes Hubert Stigler ◽  
Elisabeth Steiner

Research data repositories and data centres are becoming more and more important as infrastructures in academic research. The article introduces the humanities research data repository GAMS, covering its system architecture, preservation policy and content policy. Challenges of data centres and repositories and the general and domain-specific approaches and solutions are outlined. Special emphasis lies on the sustainability and long-term perspective of such infrastructures, not only on the technical but above all on the organisational and financial level.


2019 ◽  
Author(s):  
Elizabete Cristina de Souza de Aguiar Monteiro ◽  
Priscila Machado Borges Sena ◽  
Ricardo César Gonçalves Sant’Ana ◽  
Ursula Blattmann

University scientific data repositories have the infrastructure to support researchers in managing data and making them available, enhancing their reuse by other researchers. Data stored in repositories can help recover an institution's memory, since they are organized and represented in a way that reveals the methods and instruments used by researchers in given periods, as well as the topics researched, the types of data collected or generated, and the historical context of which they were part. Knowledge organization can be understood as a knowledge-modelling procedure that aims to produce knowledge representations. From this perspective, organization and representation can be related to the constitution of a given memory. When addressing memory, it is relevant to stress that it can be individual or collective. Thus, recovering the data reveals the record of a researcher's individual memory of their research and, together with the memories of other researchers and of the institution, can come to constitute a collective memory. Accordingly, this study sought to show how the organization and representation of data in a data repository can contribute to the constitution and retrieval of institutional memory. The methodology was exploratory and descriptive. The universe studied comprised 36 repositories retrieved from the world's top one hundred universities as ranked on webometrics.info. Data on the metadata standard used by the repositories were collected from the Registry of Research Data Repositories (re3data.org), a global registry of research data repositories.
The results show that, to represent datasets, the repositories analyzed use the Dublin Core (DC) metadata schema, and some repositories created their own requirements based on DC to meet their particular representation needs, sharing as common attributes the title, author, keywords, subject, versions, and description of the datasets. The repositories organize their datasets into collections that they label as: a) communities and collections or disciplines: these represent the communities, departments, or institutes that make up the university, and are elements representing the memory of the data each area collected or generated and the research carried out; b) temporal coverage: covers the historical period to which the data relate and represents the community's annual memories; c) geographic coverage: includes data from particular cities, countries, or regions, and represents the memory of the places that were part of the research; d) funder: the agencies that funded the research, representing the memory linked to the agencies that funded research in given periods. The study concludes that data repositories are services organically linked to institutional environments; that they add value to universities' institutional repositories, being committed to building academic and institutional memory and to the long-term preservation of assets of continuing value; and that gathering and organizing the memory of the researcher and of the institution favors the traceability and recovery of the elements that make up these repositories.


2019 ◽  
Vol 5 ◽  
Author(s):  
Matthew Murray ◽  
Megan O'Donnell ◽  
Mark Laufersweiler ◽  
John Novak ◽  
Betty Rozum ◽  
...  

This report shares the results of a Spring 2018 survey of 35 academic libraries in the United States regarding the research data services (RDS) they offer. An executive summary presents key findings, while the results section provides detailed information on the answers to specific survey questions related to data repositories, metadata, workshops, and policies.


Ravnetrykk ◽  
2020 ◽  
Author(s):  
Philipp Conzett

Research data repositories play a crucial role in the FAIR (Findable, Accessible, Interoperable, Reusable) ecosystem of digital objects. DataverseNO is a national, generic repository for open research data, primarily from researchers affiliated with Norwegian research organizations. The repository runs on the open-source software Dataverse. This article presents the organization and operation of DataverseNO, and investigates how the repository contributes to the increased FAIRness of small and medium-sized research data. Sections 1 to 3 present background information about the FAIR Data Principles (section 1), how FAIR may be turned into reality (section 2), and what these principles and recommendations imply for data from the so-called long tail of research, i.e. small and medium-sized datasets that are often heterogeneous in nature and hard to standardize (section 3). Section 4 gives an overview of the key organizational features of DataverseNO, followed by an evaluation of how well DataverseNO and the repository application Dataverse as such support the FAIR Data Principles (section 5). Section 6 discusses how sustainable and trustworthy the repository is. The article is rounded off in section 7 by a brief summary including a look into the future of the repository.


2019 ◽  
Vol 39 (06) ◽  
pp. 280-289 ◽  
Author(s):  
Raj Kumar Bhardwaj

The study aims to trace the development of Indian research data repositories (RDRs) and explore their content with a view to identifying prospects and possibilities. Further, it analyses the distribution of data repositories on the basis of content coverage, types of content, author identification system followed, software and application programming interface used, subject-wise number of repositories, etc. The study is based on data repositories listed on the registry of data repositories accessible at http://www.re3data.org. The dataset was exported in Microsoft Excel format for analysis. A simple percentage method was followed in the data analysis, and results are presented through tables and figures. The study found a total of 2829 data repositories in existence worldwide, covering 72 countries. Further, it was seen that 1526 (53.9 %) are open and 924 (32.4 %) are restricted data repositories. Also, there are embargoed data repositories numbering 225 (8.0 %) and closed ones numbering 154 (5.4 %). The study found that out of a total of 45 Indian RDRs, only 30 (67 %) are open, followed by 12 (27 %) that are restricted and 3 (6 %) that are closed. The majority of Indian RDRs (20) were developed in the year 2014. The study found that the majority of Indian RDRs (17) are ‘disciplinary’. Further, the study also revealed that statistical data formats are available in a maximum of 31 (68.9 %) Indian RDRs. It was also seen that the majority of Indian RDRs (28) have datasets relating to ‘Life Sciences’. It was identified that only 20% of data repositories use metadata standards; the remaining 80% do not use any standards in metadata entry. This study covered only the research data repositories in India registered on the registry of data repositories. RDRs not listed in the registry of data repositories are left out.

