metadata quality Latest Research Papers

This paper offers an overview of the highlights of the 2021 NISO Plus Annual Conference that was held virtually from February 22 – February 25, 2021. This was the second NISO Plus annual conference. The first one was held in 2020 and replaced what would have been the 62nd Annual NFAIS conference, but with the merger of NISO and NFAIS in June 2019 the conference was renamed NISO Plus and took on a new format. Little did they know that the second conference would have to be held virtually while the world was battling a global pandemic. The 2021 audience represented a 400% increase over the 2020 in-person attendance. There was no general theme, but there was a topic for everyone working in the information ecosystem - from the practical subjects of standards and metadata quality to preprints to information privacy and ultimately to the impact of Artificial Intelligence/Machine Learning on scholarly communication. With speakers from around the world and across time zones and continents, it was truly a global conversation!

Download Full-text

Institutional Repositories in New Zealand: An Analysis of Coordinators’ Perspectives

10.26686/wgtn.17004760.v1 ◽

2021 ◽

Author(s):

◽

Gregory James Benseman

Keyword(s):

New Zealand ◽

Critical Mass ◽

Research Management ◽

Open Access Publishing ◽

Small Scale ◽

Narrative Development ◽

Institutional Repositories ◽

Qualitative Survey ◽

Metadata Quality ◽

Collection Strategies

<p>This study is a small scale qualitative survey of coordinators working in institutional repository development in New Zealand since critical mass was reached in 2009. It aims to summarise their opinions on the current and future roles of their repository as both a preservation archive, and a discovery resource representing their institution’s research community. The research uses narrative development techniques within the interpretivist paradigm to provide a contextual analysis of the repository’s relationship with other repositories and the National Library. It is supported by quantitative analysis of the sampled repositories’ holdings and the metadata quality with which the holdings are endowed. The analysis finds that since the establishment of New Zealand repositories, coordinators have adapted their collection strategies to encourage depositors towards Open Access publishing. These findings are placed in the context of the growth of non-mandated repository holdings and the technical infrastructure for harvesting resources, and integrating workflows with university research management systems. The results are used to discuss the goals coordinators have for improving the efficiency and visibility of their repository.</p>

Download Full-text

Institutional Repositories in New Zealand: An Analysis of Coordinators’ Perspectives

10.26686/wgtn.17004760 ◽

2021 ◽

Author(s):

◽

Gregory James Benseman

Keyword(s):

New Zealand ◽

Critical Mass ◽

Research Management ◽

Open Access Publishing ◽

Small Scale ◽

Narrative Development ◽

Institutional Repositories ◽

Qualitative Survey ◽

Metadata Quality ◽

Collection Strategies

<p>This study is a small scale qualitative survey of coordinators working in institutional repository development in New Zealand since critical mass was reached in 2009. It aims to summarise their opinions on the current and future roles of their repository as both a preservation archive, and a discovery resource representing their institution’s research community. The research uses narrative development techniques within the interpretivist paradigm to provide a contextual analysis of the repository’s relationship with other repositories and the National Library. It is supported by quantitative analysis of the sampled repositories’ holdings and the metadata quality with which the holdings are endowed. The analysis finds that since the establishment of New Zealand repositories, coordinators have adapted their collection strategies to encourage depositors towards Open Access publishing. These findings are placed in the context of the growth of non-mandated repository holdings and the technical infrastructure for harvesting resources, and integrating workflows with university research management systems. The results are used to discuss the goals coordinators have for improving the efficiency and visibility of their repository.</p>

Download Full-text

Analysis-Ready Data from Hyperspectral Sensors—The Design of the EnMAP CARD4L-SR Data Product

Remote Sensing ◽

10.3390/rs13224536 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4536

Author(s):

Martin Bachmann ◽

Kevin Alonso ◽

Emiliano Carmona ◽

Birgit Gerasch ◽

Martin Habermeyer ◽

...

Keyword(s):

Atmospheric Correction ◽

Hyperspectral Data ◽

Quality Information ◽

Atmospheric Effects ◽

End User ◽

Operational Approach ◽

Hyperspectral Sensors ◽

Metadata Quality ◽

Global Mapping ◽

Processing Steps

Today, the ground segments of the Landsat and Sentinel missions provide a wealth of well-calibrated, characterized datasets which are already orthorectified and corrected for atmospheric effects. Initiatives such as the CEOS Analysis Ready Data (ARD) propose and ensure guidelines and requirements so that such datasets can readily be used, and interoperability within and between missions is a given. With the increasing availability of data from operational and research-oriented spaceborne hyperspectral sensors such as EnMAP, DESIS and PRISMA, and in preparation for the upcoming global mapping missions CHIME and SBG, the provision of analysis ready hyperspectral data will also be of increasing interest. Within this article, the design of the EnMAP Level 2A Land product is illustrated, highlighting the necessary processing steps for CEOS Analysis Ready Data for Land (CARD4L) compliant data products. This includes an overview of the design of the metadata, quality layers and archiving workflows, the necessary processing chain (system correction, orthorectification and atmospheric correction), as well as the resulting challenges of this procedure. Thanks to this operational approach, the end user will be provided with ARD products including rich metadata and quality information, which can readily be integrated in analysis workflows, and combined with data from other sensors.

Download Full-text

Open Government Data: Usage trends and metadata quality

Journal of Information Science ◽

10.1177/01655515211027775 ◽

2021 ◽

pp. 016555152110277

Author(s):

Alfonso Quarati

Keyword(s):

Open Government ◽

Data Sets ◽

Factors Affecting ◽

Data Usage ◽

Economic Progress ◽

Open Government Data ◽

Metadata Quality ◽

Government Data ◽

The Relationship

Open Government Data (OGD) have the potential to support social and economic progress. However, this potential can be frustrated if these data remain unused. Although the literature suggests that OGD data sets’ metadata quality is one of the main factors affecting their use, to the best of our knowledge, no quantitative study provided evidence of this relationship. Considering about 400,000 data sets of 28 national, municipal and international OGD portals, we have programmatically analysed their usage, their metadata quality and the relationship between the two. Our analysis has highlighted three main findings. First, regardless of their size, the software platform adopted, and their administrative and territorial coverage, most OGD data sets are underutilised. Second, OGD portals pay varying attention to the quality of their data sets’ metadata. Third, we did not find clear evidence that data sets’ usage is positively correlated to better metadata publishing practices. Finally, we have considered other factors, such as data sets’ category, and some demographic characteristics of the OGD portals, and analysed their relationship with data sets’ usage, obtaining partially affirmative answers.

Download Full-text

DIANA-miTED: a microRNA tissue expression database

Nucleic Acids Research ◽

10.1093/nar/gkab733 ◽

2021 ◽

Author(s):

Ioannis Kavakiotis ◽

Athanasios Alexiou ◽

Spyros Tastsoglou ◽

Ioannis S Vlachos ◽

Artemis G Hatzigeorgiou

Keyword(s):

Mirna Expression ◽

Cell Types ◽

Tissue Expression ◽

The Cancer Genome Atlas ◽

Rna Seq ◽

Starting Point ◽

Metadata Quality ◽

Cancer Genome Atlas ◽

Downstream Analysis ◽

Systematic Collection

Abstract microRNAs (miRNAs) are short (∼23nt) single-stranded non-coding RNAs that act as potent post-transcriptional gene expression regulators. Information about miRNA expression and distribution across cell types and tissues is crucial to the understanding of their function and for their translational use as biomarkers or therapeutic targets. DIANA-miTED is the most comprehensive and systematic collection of miRNA expression values derived from the analysis of 15 183 raw human small RNA-Seq (sRNA-Seq) datasets from the Sequence Read Archive (SRA) and The Cancer Genome Atlas (TCGA). Metadata quality maximizes the utility of expression atlases, therefore we manually curated SRA and TCGA-derived information to deliver a comprehensive and standardized set, incorporating in total 199 tissues, 82 anatomical sublocations, 267 cell lines and 261 diseases. miTED offers rich instant visualizations of the expression and sample distributions of requested data across variables, as well as study-wide diagrams and graphs enabling efficient content exploration. Queries also generate links towards state-of-the-art miRNA functional resources, deeming miTED an ideal starting point for expression retrieval, exploration, comparison, and downstream analysis, without requiring bioinformatics support or expertise. DIANA-miTED is freely available at http://www.microrna.gr/mited.

Download Full-text

Investigating the use of metadata record graphs to analyze subject headings in the digital public library of America

The Electronic Library ◽

10.1108/el-11-2020-0317 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Mark Edward Phillips ◽

Hannah Tarver

Keyword(s):

Network Analysis ◽

Public Library ◽

Quality Analysis ◽

Common Denominator ◽

Digital Collections ◽

Content Type ◽

Metadata Record ◽

Metadata Quality ◽

Subject Headings ◽

The Individual

Purpose This study furthers metadata quality research by providing complementary network-based metrics and insights to analyze metadata records and identify areas for improvement. Design/methodology/approach Metadata record graphs apply network analysis to metadata field values; this study evaluates the interconnectedness of subjects within each Hub aggregated into the Digital Public Library of America. It also reviews the effects of NACO normalization – simulating revision of values for consistency – and breaking up pre-coordinated subject headings – to simulate applying the Faceted Application of Subject Terminology to Library of Congress Subject Headings. Findings Network statistics complement count- or value-based metrics by providing context related to the number of records a user might actually find starting from one item and moving to others via shared subject values. Additionally, connectivity increases through the normalization of values to correct or adjust for formatting differences or by breaking pre-coordinated subject strings into separate topics. Research limitations/implications This analysis focuses on exact-string matches, which is the lowest-common denominator for searching, although many search engines and digital library indexes may use less stringent matching methods. In terms of practical implications for evaluating or improving subjects in metadata, the normalization components demonstrate where resources may be most effectively allocated for these activities (depending on a collection). Originality/value Although the individual components of this research are not particularly novel, network analysis has not generally been applied to metadata analysis. This research furthers previous studies related to metadata quality analysis of aggregations and digital collections in general.

Download Full-text

A RCT study of DataCite metadata completeness to quantify the benefits of metadata quality on dataset uptake and sharing

10.31222/osf.io/xv7tk ◽

2021 ◽

Author(s):

Christopher I Hunter ◽

Scott C Edmunds ◽

SiZhe Xiao

Keyword(s):

Randomised Controlled Trials ◽

Lessons Learned ◽

Controlled Study ◽

Proof Of Concept ◽

Trial Period ◽

Randomised Controlled Study ◽

Significant Difference ◽

Metadata Quality ◽

Negative Findings ◽

Randomised Controlled

Properly describing and documenting data via rich metadata allows users to understand and track important details of the work. It is thought that richer metadata fuels discovery and innovation, increasing the discoverability and reusability of research, and eliminating duplication of effort. Researchers are told to expend effort to improve their metadata, but efforts to quantify the benefits of putting in this effort have not been carried out. With that in mind we have carried out a randomised controlled study of DataCite metadata completeness on dataset uptake and sharing. 1093 datasets were randomised and given rich vs minimal metadata, and the effect of re-use and sharing of this data was followed over a 1-year period. Analysing this data after the trial period, based on the very low rate of page-views there was no significant difference detected between the two different RCT groups. Despite the negative findings we would like to share this proof of concept and provide lessons learned for how randomised controlled trials should be carried out to quantify the benefits of FAIR data sharing.

Download Full-text

metadata quality
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Paper: Metadata Quality in Time of Diverse Research Outputs

Metadata Quality in the Era of Big Data and Unstructured Content

An Overview of the 2021 NISO Plus Conference: Global connections and global conversations

Institutional Repositories in New Zealand: An Analysis of Coordinators’ Perspectives

Institutional Repositories in New Zealand: An Analysis of Coordinators’ Perspectives

Analysis-Ready Data from Hyperspectral Sensors—The Design of the EnMAP CARD4L-SR Data Product

Open Government Data: Usage trends and metadata quality

DIANA-miTED: a microRNA tissue expression database

Investigating the use of metadata record graphs to analyze subject headings in the digital public library of America

A RCT study of DataCite metadata completeness to quantify the benefits of metadata quality on dataset uptake and sharing

Export Citation Format

metadata qualityRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Paper: Metadata Quality in Time of Diverse Research Outputs

Metadata Quality in the Era of Big Data and Unstructured Content

An Overview of the 2021 NISO Plus Conference: Global connections and global conversations

Institutional Repositories in New Zealand: An Analysis of Coordinators’ Perspectives

Institutional Repositories in New Zealand: An Analysis of Coordinators’ Perspectives

Analysis-Ready Data from Hyperspectral Sensors—The Design of the EnMAP CARD4L-SR Data Product

Open Government Data: Usage trends and metadata quality

DIANA-miTED: a microRNA tissue expression database

Investigating the use of metadata record graphs to analyze subject headings in the digital public library of America

A RCT study of DataCite metadata completeness to quantify the benefits of metadata quality on dataset uptake and sharing

metadata quality
Recently Published Documents