scholarly journals Isolated instrument transcription using a deep belief network

Author(s):  
Gregory Burlet ◽  
Abram Hindle

Automatic music transcription is a difficult task that has provoked extensive research on transcription systems that are predominantly general purpose, processing any number or type of instruments sounding simultaneously. This paper presents a polyphonic transcription system that is constrained to processing the output of a single instrument with an upper bound on polyphony. For example, a guitar has six strings and is limited to producing six notes simultaneously. The transcription system consists of a novel pitch estimation algorithm that uses a deep belief network and multi-label learning techniques to generate multiple pitch estimates for each audio analysis frame, such that the polyphony does not exceed that of the instrument. The implemented transcription system is evaluated on a compiled dataset of synthesized guitar recordings. Comparing these results to a prior single-instrument polyphonic transcription system that received exceptional results, this paper demonstrates the effectiveness of deep, multi-label learning for the task of polyphonic transcription.

2015 ◽  
Author(s):  
Gregory Burlet ◽  
Abram Hindle

Automatic music transcription is a difficult task that has provoked extensive research on transcription systems that are predominantly general purpose, processing any number or type of instruments sounding simultaneously. This paper presents a polyphonic transcription system that is constrained to processing the output of a single instrument with an upper bound on polyphony. For example, a guitar has six strings and is limited to producing six notes simultaneously. The transcription system consists of a novel pitch estimation algorithm that uses a deep belief network and multi-label learning techniques to generate multiple pitch estimates for each audio analysis frame, such that the polyphony does not exceed that of the instrument. The implemented transcription system is evaluated on a compiled dataset of synthesized guitar recordings. Comparing these results to a prior single-instrument polyphonic transcription system that received exceptional results, this paper demonstrates the effectiveness of deep, multi-label learning for the task of polyphonic transcription.


2017 ◽  
Vol 3 ◽  
pp. e109 ◽  
Author(s):  
Gregory Burlet ◽  
Abram Hindle

Music transcription involves the transformation of an audio recording to common music notation, colloquially referred to as sheet music. Manually transcribing audio recordings is a difficult and time-consuming process, even for experienced musicians. In response, several algorithms have been proposed to automatically analyze and transcribe the notes sounding in an audio recording; however, these algorithms are often general-purpose, attempting to process any number of instruments producing any number of notes sounding simultaneously. This paper presents a polyphonic transcription algorithm that is constrained to processing the audio output of a single instrument, specifically an acoustic guitar. The transcription system consists of a novel note pitch estimation algorithm that uses a deep belief network and multi-label learning techniques to generate multiple pitch estimates for each analysis frame of the input audio signal. Using a compiled dataset of synthesized guitar recordings for evaluation, the algorithm described in this work results in an 11% increase in the f-measure of note transcriptions relative to Zhou et al.’s (2009) transcription algorithm in the literature. This paper demonstrates the effectiveness of deep, multi-label learning for the task of polyphonic transcription.


2012 ◽  
Vol 36 (4) ◽  
pp. 81-94 ◽  
Author(s):  
Emmanouil Benetos ◽  
Simon Dixon

In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source. Thus, this method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note range. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature, using several error metrics.


Author(s):  
Sanjiban Sekhar Roy ◽  
Pulkit Kulshrestha ◽  
Pijush Samui

Drought is a condition of land in which the ground water faces a severe shortage. This condition affects the survival of plants and animals. Drought can impact ecosystem and agricultural productivity, severely. Hence, the economy also gets affected by this situation. This paper proposes Deep Belief Network (DBN) learning technique, which is one of the state of the art machine learning algorithms. This proposed work uses DBN, for classification of drought and non-drought images. Also, k nearest neighbour (kNN) and random forest learning methods have been proposed for the classification of the same drought images. The performance of the Deep Belief Network(DBN) has been compared with k nearest neighbour (kNN) and random forest. The data set has been split into 80:20, 70:30 and 60:40 as train and test. Finally, the effectiveness of the three proposed models have been measured by various performance metrics.


2012 ◽  
Vol 2012 ◽  
pp. 1-13
Author(s):  
Yi Guo ◽  
Jiyong Tang

This paper presents a combined mathematical treatment for a special automatic music transcription system. This system is specially made for computer-synthesized music. The combined mathematical treatment includes harmonic selection, matrix analysis, and probability analysis method. The algorithm reduces dimension by PCA and selects candidates first by human auditory model and harmonic structures of notes. It changes the multiple-F0 estimation question into a mathematical problem and solves it in a mathematical way. It can be shown in this paper that the experimental results indicate that this method has very good recognition results.


Author(s):  
Dorian Cazau ◽  
Marc Chemillier ◽  
Olivier Adam

This chapter presents an original approach for the development of an automatic music transcription system of a Malagasy traditional plucked string instrument, called marovany zither. Our approach is based on a technology of multichannel capturing sensory system, which allows breaking down a complex polyphonic audio signal into a sum of monophonic sensor signals. A very high precision in transcription is obtained, i.e. & gt; 95% on the average note-based F-measure metric. A second part of this chapter consists in using these transcripts in the human-machine improvisation system ImproteK. Details of an exploratory working session with a local Malagasy musician are reported and discussed.


2020 ◽  
Vol 8 ◽  
Author(s):  
Carlos Hernández Oliván ◽  
Ignacio Zay Pinilla ◽  
José Ramón Beltrán Blázquez

Note Tracking (NT) is a subtask of Automatic Music Transcription (AMT) which is a critical problem in the field of Music Information Retrieval (MIR). The aim of this work is to compare the performance of two models, one for onsets and frames prediction and another one with pitch detection and a note tracking algorithm in order to study the behaviour of different timbres and families of instruments in note tracking subtasks.


Sign in / Sign up

Export Citation Format

Share Document