A Shift-Invariant Latent Variable Model for Automatic Music Transcription

2012 ◽  
Vol 36 (4) ◽  
pp. 81-94 ◽  
Author(s):  
Emmanouil Benetos ◽  
Simon Dixon

In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source. Thus, this method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note range. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature, using several error metrics.
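
The note tracking step mentioned in the abstract above (pitch-wise hidden Markov models) can be illustrated with a minimal sketch. It assumes a two-state (off/on) model for a single pitch, treats each frame's activation as the probability of the "on" state, and uses a hypothetical self-transition probability `p_stay`. None of these specifics come from the paper, and the function name `viterbi_two_state` is illustrative only.

```python
import math


def viterbi_two_state(activations, p_stay=0.9):
    """Smooth one pitch's activation track with a 2-state (0=off, 1=on) HMM.

    Assumptions (not from the paper): activations lie in [0, 1] and are used
    directly as observation likelihoods for the "on" state; both states share
    the same self-transition probability p_stay; the initial prior is uniform.
    """
    if not activations:
        return []
    eps = 1e-9
    log_stay = math.log(p_stay)
    log_switch = math.log(1.0 - p_stay)

    def log_obs(a):
        # Clamp to avoid log(0), then return (log P(obs|off), log P(obs|on)).
        a = min(max(a, eps), 1.0 - eps)
        return (math.log(1.0 - a), math.log(a))

    score = list(log_obs(activations[0]))  # uniform prior adds the same constant
    backpointers = []
    for a in activations[1:]:
        obs = log_obs(a)
        new_score = [0.0, 0.0]
        pointers = [0, 0]
        for state in (0, 1):
            candidates = [
                score[prev] + (log_stay if prev == state else log_switch)
                for prev in (0, 1)
            ]
            best_prev = 0 if candidates[0] >= candidates[1] else 1
            pointers[state] = best_prev
            new_score[state] = candidates[best_prev] + obs[state]
        backpointers.append(pointers)
        score = new_score

    # Backtrack from the best final state to recover the most likely path.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


track = [0.1, 0.1, 0.9, 0.4, 0.9, 0.9, 0.1, 0.1]
smoothed = viterbi_two_state(track)
thresholded = [1 if a >= 0.5 else 0 for a in track]
```

With these example values, a plain 0.5 threshold splits the note at the fourth frame, whereas the Viterbi path keeps it active across the brief dip.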

Author(s):  
Alexander Baturo ◽  
Johan A. Elkink

How can one assess which countries select more experienced leaders for the highest office? There is wide variation in prior career paths of national leaders within, and even more so between, regime types. It is therefore challenging to obtain a truly comparative measure of political experience; empirical studies have to rely on proxies instead. This article proposes PolEx, a measure of political experience that abstracts away from the details of career paths and generalizes based on the duration, quality and breadth of an individual's experience in politics. The analysis draws on a novel data set of around 2,000 leaders from 1950 to 2017 and uses a Bayesian latent variable model to estimate PolEx. The article illustrates how the new measure can be used comparatively to assess whether democracies select more experienced leaders. The authors find that while on average they do, the difference with non-democracies has declined dramatically since the early 2000s. Future research may leverage PolEx to investigate the role of prior political experience in, for example, policy making and crisis management.


2013 ◽  
Vol 300-301 ◽  
pp. 848-852
Author(s):  
Zong Hai Sun ◽  
Osman Osman

High-dimensional data sets are problematic for classification, compression, and visualization. The main challenge is to find a reduced-dimensionality representation that corresponds to the intrinsic dimensionality of the original data. In this paper we investigate a practical Bayesian method for feature extraction; in particular, we apply the Gaussian Process Latent Variable Model (GPLVM) to a real-world data set. Feature extraction experiments were performed on a cancer treatments’ components data set using GPLVM, and PCA was then applied to the same data set to compare the results.


2012 ◽  
Vol 2012 ◽  
pp. 1-13
Author(s):  
Yi Guo ◽  
Jiyong Tang

This paper presents a combined mathematical treatment for a specialized automatic music transcription system designed for computer-synthesized music. The treatment combines harmonic selection, matrix analysis, and probability analysis. The algorithm reduces dimensionality with PCA and first selects candidates using a human auditory model and the harmonic structures of notes. It recasts the multiple-F0 estimation question as a mathematical problem and solves it mathematically. The experimental results indicate that this method achieves very good recognition performance.


2015 ◽  
Author(s):  
Gregory Burlet ◽  
Abram Hindle

Automatic music transcription is a difficult task that has provoked extensive research on transcription systems that are predominantly general purpose, processing any number or type of instruments sounding simultaneously. This paper presents a polyphonic transcription system that is constrained to processing the output of a single instrument with an upper bound on polyphony. For example, a guitar has six strings and is limited to producing six notes simultaneously. The transcription system consists of a novel pitch estimation algorithm that uses a deep belief network and multi-label learning techniques to generate multiple pitch estimates for each audio analysis frame, such that the polyphony does not exceed that of the instrument. The implemented transcription system is evaluated on a compiled dataset of synthesized guitar recordings. Comparing these results to a prior single-instrument polyphonic transcription system that received exceptional results, this paper demonstrates the effectiveness of deep, multi-label learning for the task of polyphonic transcription.
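
The polyphony constraint described above can be sketched as a per-frame selection step. This is a minimal illustration, not the authors' algorithm: it assumes a classifier has already produced one probability per candidate pitch for the frame, and the function name `select_pitches`, the 0.5 threshold, and the example values are hypothetical.

```python
def select_pitches(frame_probs, max_polyphony=6, threshold=0.5):
    """Return the pitch indices kept for one audio analysis frame.

    Assumptions (not from the paper): frame_probs[i] is the estimated
    probability that pitch i is sounding; a pitch is a candidate when its
    probability reaches the threshold; at most max_polyphony candidates are
    kept, preferring the most probable ones.
    """
    candidates = [(p, i) for i, p in enumerate(frame_probs) if p >= threshold]
    # Sort by descending probability; break ties by the lower pitch index.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return sorted(i for _, i in candidates[:max_polyphony])


# Seven pitches exceed the threshold, but a six-string guitar can sound only six.
probs = [0.9, 0.2, 0.8, 0.7, 0.6, 0.95, 0.55, 0.85, 0.1]
kept = select_pitches(probs, max_polyphony=6)
```

Here the weakest candidate (index 6, probability 0.55) is dropped so that the frame never reports more notes than the instrument can produce.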


Author(s):  
Dorian Cazau ◽  
Marc Chemillier ◽  
Olivier Adam

This chapter presents an original approach to developing an automatic music transcription system for a traditional Malagasy plucked string instrument, the marovany zither. Our approach is based on a multichannel sensor capture system, which breaks down a complex polyphonic audio signal into a sum of monophonic sensor signals. Very high transcription precision is obtained, i.e. > 95% on the average note-based F-measure metric. The second part of this chapter uses these transcripts in the human-machine improvisation system ImproteK. Details of an exploratory working session with a local Malagasy musician are reported and discussed.
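
For readers unfamiliar with the note-based F-measure cited above, the following is a minimal sketch of how such a score is commonly computed. The matching rule is an assumption, since the abstract does not specify it: an estimated note counts as correct when its pitch equals that of an unmatched reference note and its onset lies within a hypothetical tolerance of 50 ms. The function name `note_f_measure` and the example notes are illustrative.

```python
def note_f_measure(reference, estimated, onset_tolerance=0.05):
    """Note-based F-measure for lists of (onset_seconds, midi_pitch) tuples.

    Assumption: a greedy one-to-one matching on equal pitch and onset
    distance <= onset_tolerance; note offsets are ignored.
    """
    unmatched = list(reference)
    true_positives = 0
    for onset, pitch in sorted(estimated):
        for ref in unmatched:
            if ref[1] == pitch and abs(ref[0] - onset) <= onset_tolerance:
                unmatched.remove(ref)  # each reference note matches only once
                true_positives += 1
                break
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(estimated)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


reference_notes = [(0.0, 60), (0.5, 62), (1.0, 64), (1.5, 65)]
estimated_notes = [(0.01, 60), (0.52, 62), (1.2, 64)]  # third onset is too late
score = note_f_measure(reference_notes, estimated_notes)
```

In this example two of three estimated notes match two of four reference notes, giving precision 2/3, recall 1/2, and an F-measure of 4/7.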


2005 ◽  
Vol 2 (2) ◽  
Author(s):  
Silvia Cagnone ◽  
Roberto Ricci

The aim of this work is to analyze part of the data collected in the Computer Science Department during the Informatics exams in 2003. Two different Item Response Theory models for ordered polytomous variables are considered in order to evaluate student ability. Ordered polytomous variables are used for a problem-solving process that contains a finite number of steps, so that a student's ability can be evaluated on the basis of the step achieved: higher steps achieved are related to higher ability. The models considered are the Partial Credit Model and the Graded Response Model. These models were chosen because both can handle ordinal observed variables, even though they are defined within different theoretical frameworks and therefore have different properties: the former belongs to the Rasch family (Masters, 1982), whereas the latter can be viewed as a Generalized Linear Latent Variable Model (Bartholomew and Knott, 1999). Analyzing the real data set with the two approaches highlights their advantages and disadvantages.
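
The two models named above can be contrasted through their category probabilities. The sketch below uses the standard textbook forms of the Graded Response Model (differences of cumulative logistic curves) and the Partial Credit Model (adjacent-category logits); the parameter values and function names are hypothetical and are not taken from the article's data.

```python
import math


def graded_response_probs(theta, discrimination, thresholds):
    """Graded Response Model: P(Y = k | theta) for k = 0..len(thresholds).

    P(Y >= k) is a logistic curve in theta; category probabilities are
    differences of adjacent cumulative probabilities. thresholds must be
    increasing for the result to be a valid distribution.
    """
    cumulative = [1.0]
    for b in thresholds:
        cumulative.append(1.0 / (1.0 + math.exp(-discrimination * (theta - b))))
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


def partial_credit_probs(theta, step_difficulties):
    """Partial Credit Model: P(Y = k | theta) for k = 0..len(step_difficulties).

    The log-weight of category k is the sum of (theta - delta_j) over the
    first k steps; weights are normalized to sum to one.
    """
    log_weights = [0.0]
    for delta in step_difficulties:
        log_weights.append(log_weights[-1] + (theta - delta))
    weights = [math.exp(w) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probs):
    """Mean step reached under a category distribution."""
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical three-category item (steps 0, 1, 2).
grm = graded_response_probs(theta=0.0, discrimination=1.0, thresholds=[-1.0, 1.0])
pcm_low = partial_credit_probs(theta=-1.0, step_difficulties=[-1.0, 1.0])
pcm_high = partial_credit_probs(theta=1.0, step_difficulties=[-1.0, 1.0])
```

In both models a higher ability `theta` shifts probability toward higher steps, which is the property the abstract relies on when relating the step achieved to student ability.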

