A Content Analysis of Indian Research Data Repositories: Prospects and Possibilities

2019 ◽  
Vol 39 (06) ◽  
pp. 280-289 ◽  
Author(s):  
Raj Kumar Bhardwaj

The study traces the development of Indian research data repositories (RDRs) and explores their content with a view to identifying prospects and possibilities. It analyses the distribution of data repositories by content coverage, type of content, author identification system, software and application programming interface used, number of repositories per subject, etc. The study is based on the data repositories listed in the registry of data repositories accessible at http://www.re3data.org. The dataset was exported in Microsoft Excel format for analysis. A simple percentage method was used in the data analysis, and results are presented in tables and figures. The study found a total of 2829 data repositories worldwide, covering 72 countries. Of these, 1526 (53.9 %) are open, 924 (32.4 %) are restricted, 225 (8.0 %) are embargoed and 154 (5.4 %) are closed. Of the 45 Indian RDRs, only 30 (67 %) are open, followed by 12 (27 %) that are restricted and 3 (6 %) that are closed. The largest number of Indian RDRs (20) were developed in 2014, and the largest group by type (17) is ‘disciplinary’. Statistical data formats are the most widely available, found in 31 (68.9 %) Indian RDRs. The majority of Indian RDRs (28) have datasets relating to ‘Life Sciences’. Only 20% of data repositories use metadata standards; the remaining 80% do not use any standard in metadata entry. This study covered only the research data repositories in India registered in the registry of data repositories; RDRs not listed in the registry are left out.
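
The simple percentage method mentioned in this abstract can be sketched as follows. This is only an illustration, not the author's spreadsheet workflow: the counts are the worldwide access-type figures quoted above, and the function name `percentage_distribution` is a hypothetical helper. Shares recomputed this way may differ slightly from the rounded figures printed in the abstract.

```python
# Counts by access type, as quoted in the abstract (worldwide repositories).
counts = {"open": 1526, "restricted": 924, "embargoed": 225, "closed": 154}


def percentage_distribution(counts):
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


shares = percentage_distribution(counts)

# Print a small table: category, count, share of total.
for name, pct in shares.items():
    print(f"{name:<11}{counts[name]:>6}{pct:>7.1f} %")
```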

2021 ◽  
Vol 2069 (1) ◽  
pp. 012135
Author(s):  
N D Svane ◽  
A Pranskunas ◽  
L B Lindgren ◽  
R L Jensen

The architecture, engineering, and construction (AEC) industry experiences a growing need for building performance simulations (BPS) as facilitators in the design process. However, inconsistent modelling practice and varying quality of export/import functions entail error-prone interoperability with IFC and gbXML data formats. Consequently, repeated manual modelling is still necessary. This paper presents a coupling module enabling a semi-automated extract of geometry data from the BIM software Revit and a further translation to a BPS input file using the Revit Application Programming Interface (API) and visual programming in Dynamo. The module is tested with three test cases, which show promising results for fast and structured semi-automatic geometry modelling designed to fit today’s practice.


2019 ◽  
Vol 52 (3) ◽  
pp. 633-646 ◽  
Author(s):  
Soohyung Joo ◽  
Christie Peters

This study assesses the needs of researchers for data-related assistance and investigates their research data management behavior. A survey was conducted, and 186 valid responses were collected from faculty, researchers, and graduate students across different disciplines at a research university. The services for which researchers perceive the greatest need include assistance with quantitative analysis and data visualization. Overall, the need for data-related assistance is relatively higher among health scientists, while humanities researchers demonstrate the lowest need. This study also investigated the data formats used, data documentation and storage practices, and data-sharing behavior of researchers. We found that researchers rarely use metadata standards, but rely more on a standard file-naming scheme. As to data sharing, respondents are likely to share their data personally upon request or as supplementary materials to journal publications. The findings of this study will be useful for planning user-centered research data services in academic libraries.


2022 ◽  
Vol 29 (1) ◽  
pp. 91-101
Author(s):  
Gustavo Caetano Borges ◽  
Julio Cesar Dos Reis ◽  
Claudia Bauzer Medeiros

Scientific research in all fields has advanced in complexity and in the amount of data generated. The heterogeneity of data repositories, data meaning and their metadata standards makes this problem even more significant. In spite of several proposals to find and retrieve research data from public repositories, there is still a need for more comprehensive retrieval solutions. In this article, we specify and develop a mechanism to search for scientific data that takes advantage of metadata records and semantic methods. We present the conception of our architecture and how we have implemented it in a use case in the agriculture domain.


2021 ◽  
Author(s):  
Gustavo Caetano Borges ◽  
Julio César dos Reis ◽  
Claudia Bauzer Medeiros

Scientific research in all fields has advanced in complexity and in the amount of data generated. The heterogeneity of data repositories, data meaning and their metadata standards makes this problem even more significant. In spite of several proposals to find and retrieve research data from public repositories, there is still a need for more comprehensive retrieval solutions. In this article, we specify and develop a mechanism to search for scientific data that takes advantage of metadata records and semantic methods. We present the conception of our architecture and how we have implemented it in a use case in agriculture.


2019 ◽  
Author(s):  
Weize Xu ◽  
Da Lin ◽  
Ping Hong ◽  
Liang Yi ◽  
Rohit Tyagi ◽  
...  

Summary: CoolBox is a Python package for interactive genomic data exploration based on Jupyter notebook. It provides a ggplot2-like Application Programming Interface (API) for genomic data visualization, and a Jupyter/ipywidgets based Graphical User Interface (GUI) for interactive data exploration. CoolBox is a versatile multi-omics explorer supporting most types of data formats generated by various sequencing technologies like RNA-Seq, ChIP-Seq, ChIA-PET and Hi-C. Availability and implementation: CoolBox is purely implemented with Python, and the GUI widget in Jupyter notebook is based on the ipywidgets package. It is open-source and available under GPLv3 license at https://github.com/GangCaoLab/CoolBox.


2020 ◽  
Vol 15 (1) ◽  
pp. 16
Author(s):  
Joakim Philipson

One of the grand curation challenges is to secure metadata quality in the ever-changing environment of metadata standards and file formats. As the Red Queen tells Alice in Through the Looking-Glass: “Now, here, you see, it takes all the running you can do, to keep in the same place.” That is, there is some “running” needed to keep metadata records in a research data repository fit for long-term use and put in place. One of the main tools for adapting and keeping pace with the evolution of standards, formats, and versions of standards in this ever-changing environment is the validation schema. Validation schemas are mainly seen as methods of checking data quality and fitness for use, but are also important for long-term preservation. We might like to think that our present (meta)data standards and formats are made for eternity, but in reality we know that standards evolve, formats change (some even become obsolete with time), and so do our needs for storage, searching and future dissemination for re-use. Eventually, we come to a point where transformation of our archival records and migration to other formats will be necessary. This could also mean that even if the AIPs, the Archival Information Packages, stay the same in storage, the DIPs, the Dissemination Information Packages that we want to extract from the archive, are subject to change of format. Further, in order for archival information packages to be self-sustainable, as required in the OAIS model, it is important to take interdependencies between individual files in the information packages into account. This should be done already at the time of ingest and validation of the SIPs, the Submission Information Packages, and along the line at different points of necessary transformation/migration (from SIP to AIP, from AIP to DIP etc.), in order to counter obsolescence.
This paper investigates possible validation errors and missing elements in metadata records from three general purpose, multidisciplinary research data repositories – Figshare, Harvard’s Dataverse and Zenodo, and explores the potential effects of these errors on future transformation to AIPs and migration to other formats within a digital archive.  
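
The kind of check a validation schema performs on a metadata record can be sketched as follows. This is a minimal illustration, not the schemas or tooling used in the paper: the field names in `REQUIRED_FIELDS` and the helper `validate_record` are hypothetical and are not taken from Figshare, Dataverse or Zenodo.

```python
# Hypothetical minimal schema: field name -> expected Python type.
# These fields are illustrative only, not any repository's actual schema.
REQUIRED_FIELDS = {
    "title": str,
    "creators": list,
    "publication_year": int,
    "identifier": str,
}


def validate_record(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        # Treat absent, empty-string, empty-list and None values as missing.
        if field not in record or record[field] in ("", [], None):
            problems.append(f"missing element: {field}")
        elif not isinstance(record[field], expected):
            problems.append(
                f"wrong type for {field}: expected {expected.__name__}"
            )
    return problems


# A record with an empty creator list, a year stored as text, and no identifier.
record = {"title": "Soil samples 2019", "creators": [], "publication_year": "2019"}
print(validate_record(record))
```

Running the same check at ingest and again before each transformation or migration step is one way to notice missing elements before they propagate into later packages.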


Author(s):  
Natalia S. Redkina

Library specialists who are competent in modern information technologies, know information resources, and are able to analyse and synthesize heterogeneous information, process data and solve non-standard tasks can develop innovative trends and increase the importance and competitiveness of libraries in the information space. The purpose of this study is to determine the skills and knowledge librarians most need to develop new forms and trends in the activities of research libraries: assistance services for scientists, work with research data, creation of intellectual centres and centres of intellectual leisure, organization of communication platforms, etc. The author highlights the key knowledge a librarian needs: knowledge of modern and advanced information technologies (social networks, cloud and mobile technologies, new-generation analytics, etc.), knowledge of the world market of information resources, and knowledge of technologies for collecting and processing information and data. The article presents the competences of research data management librarians, who provide consulting and assistance services to scientists throughout the research life cycle. It determines that the research data management librarian should know methods of data management plan preparation, management methods, categories, metadata standards and schemes, data classifications and identifiers, data citation requirements, copyright, data repositories, long-term data preservation technologies, etc. The author concludes that non-specialized, cross-professional (“soft”) skills (communication skills, emotional intelligence, thinking in terms of “results” and “processes”, etc.), combined with professional knowledge, are key to improving the efficiency of, and demand for, libraries in an intensively developing environment.


2014 ◽  
Vol 7 (6) ◽  
pp. 3135-3151 ◽  
Author(s):  
M. Bavay ◽  
T. Egger

Abstract. Using numerical models which require large meteorological data sets is sometimes difficult and problems can often be traced back to the Input/Output functionality. Complex models are usually developed by the environmental sciences community with a focus on the core modelling issues. As a consequence, the I/O routines that are costly to properly implement are often error-prone, lacking flexibility and robustness. With the increasing use of such models in operational applications, this situation ceases to be simply uncomfortable and becomes a major issue. The MeteoIO library has been designed for the specific needs of numerical models that require meteorological data. The whole task of data preprocessing has been delegated to this library, namely retrieving, filtering and resampling the data if necessary as well as providing spatial interpolations and parameterizations. The focus has been to design an Application Programming Interface (API) that (i) provides a uniform interface to meteorological data in the models, (ii) hides the complexity of the processing taking place, and (iii) guarantees a robust behaviour in the case of format errors, erroneous or missing data. Moreover, in an operational context, this error handling should avoid unnecessary interruptions in the simulation process. A strong emphasis has been put on simplicity and modularity in order to make it extremely easy to support new data formats or protocols and to allow contributors with diverse backgrounds to participate. This library is also regularly evaluated for computing performance and further optimized where necessary. Finally, it is released under an Open Source license and is available at http://models.slf.ch/p/meteoio. This paper gives an overview of the MeteoIO library from the point of view of conceptual design, architecture, features and computational performance. 
A scientific evaluation of the produced results is not given here since the scientific algorithms that are used have already been published elsewhere.
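
The idea of resampling data while behaving robustly when samples are missing can be sketched as follows. This is not MeteoIO's API (MeteoIO is a separate library with its own interface); the function `resample_linear`, the use of `None` as the missing-data marker, and the sample values are assumptions made purely for illustration. Timestamps are assumed to be sorted in ascending order.

```python
from bisect import bisect_left


def resample_linear(times, values, t):
    """Linearly interpolate at time t, ignoring missing samples (None).

    Returns None instead of raising when no valid bracket exists, so a
    calling simulation can decide how to proceed without being interrupted.
    """
    valid = [(ti, vi) for ti, vi in zip(times, values) if vi is not None]
    if not valid:
        return None
    ts = [ti for ti, _ in valid]
    i = bisect_left(ts, t)
    if i < len(ts) and ts[i] == t:
        return valid[i][1]  # exact match on a valid sample
    if i == 0 or i == len(ts):
        return None  # t lies outside the valid data: no extrapolation
    (t0, v0), (t1, v1) = valid[i - 1], valid[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical air temperatures (K) at 60 s intervals, with one gap.
times = [0, 60, 120, 180]
temps = [270.0, None, 274.0, 275.0]
print(resample_linear(times, temps, 60))   # interpolated across the gap
print(resample_linear(times, temps, 240))  # outside the data range
```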


2021 ◽  
Vol 45 (3-4) ◽  
Author(s):  
Gilbert Mushi

The emergence of data-driven research and demands for the establishment of Research Data Management (RDM) have created interest in academic institutions and research organizations globally. Some libraries, especially in developed countries, have started offering RDM services to their communities. Although lagging behind, some academic libraries in developing countries are at the stage of planning or implementing the service. However, the level of RDM awareness is very low among researchers, librarians and other data practitioners. The objective of this paper is to present available open resources for different data practitioners, particularly researchers and librarians. These include training resources for both researchers and librarians, a Data Management Plan (DMP) tool for researchers, and data repositories where researchers can freely archive and share their research data with local and international communities. A case study with a survey was conducted at the University of Dodoma to identify relevant RDM services so that librarians could assist researchers in making their data accessible to the local and international community. The study findings revealed a low level of RDM awareness among researchers and librarians. Over 50% of the respondents rated their knowledge as poor in the following RDM areas: DMP, data repositories, long-term digital preservation, funders' RDM mandates, metadata standards for describing data, and general awareness of RDM. Therefore, this paper presents available open resources for different data practitioners to improve RDM knowledge and boost the confidence of academic and research libraries in establishing the service.

