metadata generation
Recently Published Documents

TOTAL DOCUMENTS: 94 (FIVE YEARS: 12)
H-INDEX: 10 (FIVE YEARS: 1)

2021 ◽  
Author(s):  
Joel Pepper ◽  
Jane Greenberg ◽  
Yasin Bakis ◽  
Xiaojun Wang ◽  
Henry L Bart ◽  
...  

Metadata are key descriptors of research data, particularly for researchers seeking to apply machine learning (ML) to the vast collections of digitized specimens. Unfortunately, the available metadata are often sparse and, at times, erroneous. Additionally, it is prohibitively expensive to address these limitations through traditional, manual means. This paper reports on research that applies machine-driven approaches to analyzing digitized fish images and extracting various important features from them. The digitized fish specimens are being analyzed as part of the Biology Guided Neural Networks (BGNN) initiative, which is developing a novel class of artificial neural networks using phylogenies and anatomy ontologies. Automatically generated metadata are crucial for identifying the high-quality images needed for the neural network's predictive analytics. Methods that combine ML and image informatics techniques allow us to rapidly enrich the existing metadata associated with the 7,244 images from the Illinois Natural History Survey (INHS) used in our study. Results show we can accurately generate many key metadata properties relevant to the BGNN project, as well as general image quality metrics (e.g. brightness and contrast). Results also show that we can accurately generate bounding boxes and segmentation masks for fish, which are needed for subsequent machine learning analyses. The automatic process outperforms humans in terms of time and accuracy, and provides a novel solution for leveraging digitized specimens in ML. This research demonstrates the ability of computational methods to enhance the digital library services associated with the tens of thousands of digitized specimens stored in open-access repositories worldwide.
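The abstract mentions generating general image quality metrics such as brightness and contrast. The paper does not spell out its exact formulas, but a minimal sketch, assuming mean luminance for brightness and RMS (standard-deviation) contrast on a grayscale array, could look like this:

```python
import numpy as np

def image_quality_metrics(gray):
    """Simple quality metrics for a grayscale image array with values 0-255.

    brightness = mean luminance; contrast = RMS contrast (std of luminance).
    These are illustrative choices, not necessarily the paper's definitions.
    """
    gray = np.asarray(gray, dtype=np.float64)
    return {"brightness": gray.mean(), "contrast": gray.std()}

# Hypothetical example: a tiny 2x2 checkerboard image
img = np.array([[0, 255], [255, 0]], dtype=np.uint8)
metrics = image_quality_metrics(img)
```

Such per-image scores can then be stored alongside the specimen record and used to filter out low-quality images before training.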


2021 ◽  
Author(s):  
Felipe A. Ferreira ◽  
Bruno P. Oliveira ◽  
Rodrigo V. Kassick ◽  
Vinícius Furlan ◽  
Hélio Lopes

The last decade has seen a significant increase in the production and consumption of video content. Many entertainment companies, such as Globo, face challenges regarding video metadata generation. The objective of this paper is to present a suitable architecture for the Globo Group to automatically identify the actors that appear in each scene of a video stream, generating new metadata annotations that can be used by recommender systems, search engines, and other applications in this industry sector.
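The abstract describes an architecture for identifying actors per scene but does not detail the matching step. A common design is to compare a detected face's embedding against a labelled gallery of cast embeddings; the sketch below assumes cosine-similarity matching with an illustrative rejection threshold (the names, threshold, and embedding source are assumptions, not the paper's implementation):

```python
import numpy as np

def identify_actor(face_embedding, gallery, threshold=0.8):
    """Return the gallery actor whose reference embedding is most similar
    (cosine similarity) to the detected face, or None if no match is close
    enough. The embeddings would come from an upstream face-recognition
    model; here they are just small illustrative vectors."""
    best_name, best_score = None, -1.0
    for name, ref in gallery.items():
        score = float(np.dot(face_embedding, ref)
                      / (np.linalg.norm(face_embedding) * np.linalg.norm(ref)))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None

# Hypothetical two-actor gallery
gallery = {"actor_a": np.array([1.0, 0.0]), "actor_b": np.array([0.0, 1.0])}
match = identify_actor(np.array([0.9, 0.1]), gallery)
```

Each per-scene match would then be written back as a metadata annotation for downstream recommender systems and search engines.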


2021 ◽  
Vol 11 (14) ◽  
pp. 6461
Author(s):  
Andy Pearce ◽  
Tim Brookes ◽  
Russell Mason

Brightness is one of the most common timbral descriptors used for searching audio databases, and is also the timbral attribute of recorded sound that is most affected by microphone choice, making a brightness prediction model desirable for automatic metadata generation. A model, sensitive to microphone-related as well as source-related brightness, was developed based on a novel combination of the spectral centroid and the ratio of the total magnitude of the signal above 500 Hz to that of the full signal. This model performed well on training data (r = 0.922). Validating it on new data showed a slight gradient error but good linear correlation across source types and overall (r = 0.955). On both training and validation data, the new model outperformed metrics previously used for brightness prediction.
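The two features named in the abstract, the spectral centroid and the ratio of magnitude above 500 Hz to total magnitude, are straightforward to compute from a magnitude spectrum. A minimal sketch (the paper's combination weights and any pre-processing are not reproduced here):

```python
import numpy as np

def brightness_features(signal, sr):
    """Compute the two spectral features behind the brightness model:
    the spectral centroid (Hz) and the ratio of summed magnitude at or
    above 500 Hz to the total summed magnitude."""
    mag = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    centroid = float((freqs * mag).sum() / mag.sum())
    ratio = float(mag[freqs >= 500.0].sum() / mag.sum())
    return centroid, ratio

# Sanity check: a pure 1 kHz tone at 8 kHz sampling should give a
# centroid near 1000 Hz and a high-frequency ratio near 1.
sr = 8000
t = np.arange(sr) / sr
centroid, ratio = brightness_features(np.sin(2 * np.pi * 1000 * t), sr)
```

How the model combines these two features into a single brightness score is described in the paper itself.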


Author(s):  
Han Yu ◽  
Hongming Cai ◽  
Zhiyuan Liu ◽  
Boyi Xu ◽  
Lihong Jiang
Keyword(s):  

2020 ◽  
Vol 39 (3) ◽  
Author(s):  
Sam Grabus

This research compares automatic subject metadata generation when the pre-1800s Long-S character is corrected to a standard ⟨s⟩. The test environment includes entries from the third edition of the Encyclopedia Britannica, and the HIVE automatic subject indexing tool. A comparative study of metadata generated before and after correction of the Long-S demonstrated that, on average, 26.51 percent of potentially relevant terms per entry were omitted from the results when the Long-S was not corrected. Results confirm that correcting the Long-S increases the availability of terms that can be used for creating quality metadata records. A relationship is also demonstrated between shorter entries and an increase in omitted terms when the Long-S is not corrected.
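The correction step described above amounts to normalizing the long-s glyph (U+017F, ſ) to a standard ⟨s⟩ before the text is passed to the indexing tool. A minimal sketch of that normalization (the sample sentence is illustrative, not from the Britannica corpus):

```python
def correct_long_s(text):
    """Replace the historical long-s character (U+017F) with a
    standard 's' so that downstream tokenizers and indexers match
    modern vocabulary terms."""
    return text.replace("\u017f", "s")

fixed = correct_long_s("Congreſs paſſed the firſt law")
```

Without this step, tokens such as "Congreſs" fail to match controlled-vocabulary entries, which is the mechanism behind the omitted terms the study measures.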


2019 ◽  
Vol 26 (7) ◽  
pp. 55-66
Author(s):  
O. E. Bashina ◽  
N. A. Komkova ◽  
L. V. Matraeva ◽  
V. E. Kosolapova

The article deals with challenges and prospects of implementing the Statistical Data and Metadata eXchange (SDMX) standard and using it in the international sharing of statistical data and metadata. The authors identified potential areas where this standard can be used and described a mechanism for data and metadata sharing according to the SDMX standard. Major issues, classified into three groups (general, statistical, and information technology), were outlined by applying both domestic and foreign experience of implementing the standard. These issues may arise at the national level (if the standard is implemented domestically), at the international level (when the standard is applied by international organizations), and at the national-international level (if information is exchanged between national statistical data providers and international organizations). General issues arise at the regulatory level and are associated with establishing boundaries of responsibility of counterpart organizations at all three levels of interaction, as well as with increasing the capacity to apply the SDMX standard. Issues of a statistical nature are most often encountered because the sharing of large amounts of data and metadata spans various thematic areas of statistics; there should therefore be a unified structure for data and metadata generation and transmission. As information sharing develops, challenges arise that are associated with continuously monitoring and expanding the SDMX code lists. At the same time, there is no universal data structure at the international level and, as a result, the existing data structures developed by international organizations are difficult to understand and apply at the national level. Information technology challenges relate to creating an IT infrastructure for data and metadata sharing using the SDMX standard.
The IT infrastructure (depending on the participant's status) includes the following elements: tools for the receiving organizations, tools for the sending organization, and infrastructure for IT professionals. For each of the outlined issues, the authors formulated practical recommendations based on the complexity principle as applied to the implementation of the international SDMX standard for the exchange of data and metadata.
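To make the data-sharing mechanism concrete, the sketch below parses a simplified fragment loosely modelled on SDMX data messages, in which observations are grouped into series keyed by dimension values. The element and attribute names are illustrative and do not reproduce the full SDMX-ML schema or its namespaces:

```python
import xml.etree.ElementTree as ET

# Illustrative fragment in the spirit of an SDMX data message:
# a series identified by dimension attributes, containing observations.
MESSAGE = """
<DataSet>
  <Series FREQ="A" REF_AREA="RU">
    <Obs TIME_PERIOD="2019" OBS_VALUE="104.5"/>
    <Obs TIME_PERIOD="2020" OBS_VALUE="101.2"/>
  </Series>
</DataSet>
"""

def read_observations(xml_text):
    """Extract (period, value) pairs from every series in the message."""
    root = ET.fromstring(xml_text)
    out = []
    for series in root.iter("Series"):
        for obs in series.iter("Obs"):
            out.append((obs.get("TIME_PERIOD"), float(obs.get("OBS_VALUE"))))
    return out

observations = read_observations(MESSAGE)
```

A shared, machine-readable structure of this kind is what allows sending and receiving organizations to exchange data and metadata without bilateral format negotiations, which is the core benefit the article attributes to SDMX.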

