The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy

Author(s): Kin Wai Cheuk, Yin-Jyun Luo, Emmanouil Benetos, Dorien Herremans

Automatic Music Transcription: An Overview
2019, Vol. 36 (1), pp. 20-30
Author(s): Emmanouil Benetos, Simon Dixon, Zhiyao Duan, Sebastian Ewert

Author(s): Yuta Ojima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii

This paper describes automatic music transcription with chord estimation for music audio signals. We focus on the fact that concurrent structures of musical notes, such as chords, form the basis of harmony and are central to music composition. Since chords and musical notes are deeply linked, we propose joint pitch and chord estimation based on a hierarchical Bayesian model that consists of an acoustic model representing the generative process of a spectrogram and a language model representing the generative process of a piano roll. The acoustic model is formulated as a variant of non-negative matrix factorization with binary variables indicating a piano roll. The language model is formulated as a hidden Markov model that has chord labels as latent variables and emits a piano roll, so the sequential dependency of the piano roll is captured by the language model. The two models are integrated through the piano roll in a hierarchical Bayesian manner, and all latent variables and parameters are estimated using Gibbs sampling. The experimental results demonstrated the potential of the proposed method for unified music transcription and grammar induction.
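
As a rough illustration of the generative structure this abstract describes, the sketch below simulates a chord-level HMM emitting a binary piano roll, which then gates an NMF-style spectrogram model. All shapes, distributions, and parameter values here are hypothetical stand-ins; the paper's actual model places priors on these quantities and infers everything jointly with Gibbs sampling.

```python
# Minimal generative sketch of the hierarchical model described above.
# Shapes, priors, and hyperparameters are hypothetical; the paper's model
# and its Gibbs updates are more involved.
import numpy as np

rng = np.random.default_rng(0)

K, T, F, P = 24, 200, 513, 88   # chord labels, frames, frequency bins, pitches

# Language model: HMM over chord labels that emits a binary piano roll.
trans = rng.dirichlet(np.ones(K), size=K)       # chord transition matrix
emit_prob = rng.beta(1.0, 8.0, size=(K, P))     # P(note p is on | chord k)

chords = np.empty(T, dtype=int)
chords[0] = rng.integers(K)
for t in range(1, T):
    chords[t] = rng.choice(K, p=trans[chords[t - 1]])

piano_roll = rng.random((T, P)) < emit_prob[chords]   # binary variables

# Acoustic model: NMF-style spectrogram generation gated by the piano roll.
W = rng.gamma(2.0, 1.0, size=(F, P))            # one spectral template per pitch
H = rng.gamma(2.0, 1.0, size=(P, T))            # time-varying gains

V_mean = W @ (H * piano_roll.T)                 # the binary roll gates activations
spectrogram = rng.poisson(V_mean)               # Poisson observation noise

print(spectrogram.shape)                        # (513, 200)
```

Inference would run this process in reverse, e.g. Gibbs-sampling the chord labels and the binary piano roll given an observed spectrogram.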


A Shift-Invariant Latent Variable Model for Automatic Music Transcription
2012, Vol. 36 (4), pp. 81-94
Author(s): Emmanouil Benetos, Simon Dixon

In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. The proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source, so the method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note ranges. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature across several error metrics.
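
To make the note tracking step concrete, here is a minimal sketch of pitch-wise HMM smoothing: each pitch is decoded independently with a two-state (off/on) chain over its frame-wise activation probabilities. The transition probability and emission model below are illustrative assumptions, not the paper's exact settings.

```python
# Hedged sketch of pitch-wise HMM note smoothing: a two-state (off/on)
# Viterbi decoder applied to one pitch's activation track.
import numpy as np

def viterbi_smooth(activations, p_stay=0.9):
    """Decode an off/on state sequence for one pitch.

    activations: (T,) array of frame-wise P(note on) from the factorization.
    p_stay: illustrative self-transition probability (assumption).
    Returns a binary (T,) array of smoothed note states.
    """
    T = len(activations)
    log_A = np.log(np.array([[p_stay, 1 - p_stay],      # states: 0 = off, 1 = on
                             [1 - p_stay, p_stay]]))
    eps = 1e-12
    log_B = np.log(np.stack([1 - activations + eps,     # emission log-likelihoods
                             activations + eps], axis=1))

    delta = np.zeros((T, 2))                            # best path scores
    psi = np.zeros((T, 2), dtype=int)                   # backpointers
    delta[0] = np.log(0.5) + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A          # scores[prev, cur]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[t]

    states = np.zeros(T, dtype=int)                     # backtrace
    states[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        states[t] = psi[t + 1, states[t + 1]]
    return states

# Toy usage: a noisy activation track for a single pitch.
rng = np.random.default_rng(1)
acts = np.clip(np.r_[np.full(10, 0.1), np.full(20, 0.8), np.full(10, 0.2)]
               + rng.normal(0, 0.1, 40), 0, 1)
print(viterbi_smooth(acts))   # spurious single-frame flips get smoothed out
```

In the full system one such chain would run per pitch, turning frame-level posteriors into note events.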


Author(s): Carlos de la Fuente, Jose J. Valero-Mas, Francisco J. Castellanos, Jorge Calvo-Zaragoza

Optical Music Recognition (OMR) and Automatic Music Transcription (AMT) are the research fields that aim at obtaining a structured digital representation from sheet music images and acoustic recordings, respectively. While these fields have traditionally evolved independently, the fact that both tasks may share the same output representation raises the question of whether they could be combined synergistically to exploit the individual transcription advantages exhibited by each modality. To evaluate this hypothesis, this paper presents a multimodal framework that combines the predictions of two neural end-to-end OMR and AMT systems using a local alignment approach. We assess several experimental scenarios with monophonic music pieces to evaluate the approach under different conditions of the individual transcription systems. In general, the multimodal framework clearly outperforms the single recognition modalities, attaining a relative improvement close to 40% in the best case. Our initial premise is therefore validated, opening avenues for further research in multimodal OMR-AMT transcription.
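
As a toy illustration of the fusion idea, the sketch below aligns the token sequences predicted by hypothetical OMR and AMT systems and merges them. Python's difflib matcher is a simplified stand-in for the paper's local alignment procedure, and the policy of keeping the OMR token on disagreements is an assumption made purely for illustration.

```python
# Hedged sketch: merge two monophonic transcriptions by aligning their
# token sequences. difflib is a simplified stand-in for the local
# alignment used in the paper; the tie-break policy is hypothetical.
from difflib import SequenceMatcher

def fuse_transcriptions(omr_tokens, amt_tokens):
    """Return a single token sequence fused from both modalities."""
    matcher = SequenceMatcher(a=omr_tokens, b=amt_tokens, autojunk=False)
    fused = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":            # both modalities agree: keep the tokens
            fused.extend(omr_tokens[i1:i2])
        elif op == "replace":        # disagreement: arbitrary OMR preference
            fused.extend(omr_tokens[i1:i2])
        elif op == "delete":         # only the OMR system saw these symbols
            fused.extend(omr_tokens[i1:i2])
        elif op == "insert":         # only the AMT system saw these symbols
            fused.extend(amt_tokens[j1:j2])
    return fused

# Toy example with note-name tokens (hypothetical output vocabulary).
omr = ["C4", "E4", "G4", "C5"]
amt = ["C4", "E4", "A4", "C5"]
print(fuse_transcriptions(omr, amt))   # ['C4', 'E4', 'G4', 'C5']
```

A fuller fusion scheme could weight the two hypotheses by per-symbol confidence instead of a fixed preference for one modality.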

