Singing Transcription from Polyphonic Music Using Melody Contour Filtering

2021 · Vol 11 (13) · pp. 5913
Author(s): Zhuang He, Yin Feng

Automatic singing transcription and analysis from polyphonic music recordings are essential to a number of indexing techniques for computational auditory scene analysis. In this work, to obtain a note-level sequence, we divide the singing transcription task into two subtasks: melody extraction and note transcription. We construct a salience function in terms of harmonic and rhythmic similarity together with a measurement of spectral balance. Central to our proposed method is the measurement of melody contours, which are calculated using edge searching based on their continuity properties. We calculate the mean contour salience by separating the melody analysis from the adjacent-breakpoint connective-strength matrix, and we select the final melody contour to determine MIDI notes. This method, which combines audio signal analysis with image edge analysis, provides a more interpretable analysis platform for continuous singing signals. Experimental analysis on Music Information Retrieval Evaluation eXchange (MIREX) datasets shows that our technique achieves promising results for both audio melody extraction and polyphonic singing transcription.
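To illustrate the salience idea described above, the toy function below scores candidate fundamental frequencies on a magnitude spectrum by summing weighted harmonic magnitudes. The name `harmonic_salience` and the geometric decay weighting are hypothetical illustrations, not the authors' formulation, which additionally incorporates rhythmic similarity and spectral balance.

```python
def harmonic_salience(spectrum, f0_bin, n_harmonics=8, decay=0.8):
    """Score a candidate f0 (given as an FFT bin index) by summing the
    magnitudes at its integer harmonics, with a geometric decay per
    harmonic. A toy sketch, not the paper's salience function."""
    salience = 0.0
    for h in range(1, n_harmonics + 1):
        bin_h = f0_bin * h
        if bin_h >= len(spectrum):
            break
        salience += (decay ** (h - 1)) * spectrum[bin_h]
    return salience

# Toy magnitude spectrum with energy at bins 10, 20, 30 (a harmonic series).
spec = [0.0] * 64
for b, mag in [(10, 1.0), (20, 0.6), (30, 0.4)]:
    spec[b] = mag

print(harmonic_salience(spec, 10))  # the true f0 candidate scores highest
print(harmonic_salience(spec, 9))   # an off-pitch candidate scores zero
```

A real implementation would evaluate this over a grid of f0 candidates per frame and feed the per-frame peaks into the contour-tracking stage.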

2016 · Vol 40 (2) · pp. 70-83
Author(s): Valerio Velardo, Mauro Vallati, Steven Jan

Fostered by the introduction of the Music Information Retrieval Evaluation eXchange (MIREX) competition, the number of systems that compute symbolic melodic similarity has increased considerably in recent years. To characterize the state of the art, we provide a comparative analysis of existing algorithms, based on eight criteria that highlight their strengths and weaknesses. We also propose a taxonomy that classifies the algorithms by approach. Both the taxonomy and the criteria provide input for future research in the area.
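Many systems in such comparisons belong to the sequence-alignment family. As a generic, hypothetical example of that family (not any specific MIREX entrant), the sketch below computes a transposition-invariant similarity by running Levenshtein distance over pitch-interval sequences:

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, using one rolling row."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev = dp[0]
        dp[0] = i
        for j, y in enumerate(b, 1):
            cur = min(dp[j] + 1,          # deletion
                      dp[j - 1] + 1,      # insertion
                      prev + (x != y))    # substitution (free on a match)
            prev, dp[j] = dp[j], cur
    return dp[-1]

def intervals(melody):
    """Successive pitch differences, which make the comparison transposition-invariant."""
    return [b - a for a, b in zip(melody, melody[1:])]

def melodic_similarity(m1, m2):
    """Similarity in [0, 1] between two MIDI-pitch sequences."""
    i1, i2 = intervals(m1), intervals(m2)
    if not i1 and not i2:
        return 1.0
    return 1.0 - edit_distance(i1, i2) / max(len(i1), len(i2))

# The second melody is the first transposed up a whole tone: similarity 1.0.
print(melodic_similarity([60, 62, 64, 65], [62, 64, 66, 67]))
```

Actual MIREX systems vary widely in representation (intervals, contours, durations) and alignment cost model; this sketch fixes one simple choice of each.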


2019 · Vol 9 (23) · pp. 5121
Author(s): Olivier Lartillot, Didier Grandjean

We present a method for tempo estimation from audio recordings based on signal processing and peak tracking that does not depend on training with ground-truth data. First, an accentuation curve, emphasizing the temporal location and strength of notes, is derived from the detection of bursts of energy localized in time and frequency. This enables the detection of notes in dense polyphonic textures while ignoring the spectral fluctuation produced by vibrato and tremolo. Periodicities in the accentuation curve are detected using an improved version of the autocorrelation function. Hierarchical metrical structures, composed of a large set of periodicities in pairwise harmonic relationships, are tracked over time. In this way, the metrical structure can be tracked even if the rhythmic emphasis switches from one metrical level to another. Compared with all other participants in the Music Information Retrieval Evaluation eXchange (MIREX) Audio Tempo Extraction competition from 2006 to 2018, this approach ranks third among those that can track tempo variations. While the two best methods are based on machine learning, our method suggests a way to track tempo founded on signal processing and heuristics-based peak tracking. Moreover, the approach offers, for the first time, a detailed representation of the dynamic evolution of the metrical structure. The method is integrated into MIRtoolbox, a freely available Matlab toolbox.
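The core periodicity-detection step can be illustrated with a minimal sketch: autocorrelate a toy accentuation (onset-strength) curve and pick the lag with the highest score within a plausible tempo range. This is a deliberate simplification of the method, which uses an improved autocorrelation and hierarchical metrical tracking on top of this idea; `autocorr_tempo` and its parameter choices are hypothetical.

```python
def autocorr_tempo(curve, frame_rate, min_bpm=40, max_bpm=200):
    """Return the BPM whose lag maximizes the raw autocorrelation of the
    accentuation curve, searched over a plausible tempo range."""
    n = len(curve)
    min_lag = int(round(60.0 * frame_rate / max_bpm))  # fast tempo -> short lag
    max_lag = int(round(60.0 * frame_rate / min_bpm))  # slow tempo -> long lag
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        score = sum(curve[i] * curve[i - lag] for i in range(lag, n))
        if score > best_score:
            best_lag, best_score = lag, score
    return 60.0 * frame_rate / best_lag

# Toy curve: an impulse every 50 frames at 100 frames/s, i.e. a 0.5 s period.
frame_rate = 100
curve = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
print(autocorr_tempo(curve, frame_rate))  # → 120.0
```

Note that raw autocorrelation also peaks at integer multiples of the true period, which is exactly why the paper tracks a whole hierarchy of harmonically related periodicities rather than a single peak.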


Electronics · 2021 · Vol 10 (13) · pp. 1518
Author(s): António S. Pinto, Sebastian Böck, Jaime S. Cardoso, Matthew E. P. Davies

The extraction of the beat from musical audio signals represents a foundational task in the field of music information retrieval. While great advances in performance have been achieved due to the use of deep neural networks, significant shortcomings still remain. In particular, performance is generally much lower on musical content that differs from that contained in the existing annotated datasets used for neural network training, as well as in the presence of challenging musical conditions such as rubato. In this paper, we positioned our approach to beat tracking from a real-world perspective in which an end-user targets very high accuracy on specific music pieces for which the current state of the art is not effective. To this end, we explored the use of targeted fine-tuning of a state-of-the-art deep neural network based on a very limited temporal region of annotated beat locations. We demonstrated the success of our approach via improved performance across existing annotated datasets and a new annotation-correction approach for evaluation. Furthermore, we highlighted the ability of content-specific fine-tuning to learn both what is and what is not the beat in challenging musical conditions.
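Beat-tracking accuracy of the kind targeted here is commonly scored with an F-measure over a small tolerance window around each annotated beat. The sketch below is a generic illustration of that evaluation idea, not the paper's specific annotation-correction protocol; the ±70 ms window and the `beat_f_measure` name are assumptions drawn from common practice in the field.

```python
def beat_f_measure(estimated, reference, tolerance=0.07):
    """F-measure for beat tracking: an estimated beat is a hit if it lies
    within +/- `tolerance` seconds of a not-yet-matched reference beat.
    Times are in seconds."""
    if not estimated or not reference:
        return 0.0
    matched = set()
    hits = 0
    for est in estimated:
        for i, ref in enumerate(reference):
            if i not in matched and abs(est - ref) <= tolerance:
                matched.add(i)  # each reference beat may be matched once
                hits += 1
                break
    precision = hits / len(estimated)
    recall = hits / len(reference)
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: four reference beats; the third estimate is 100 ms late.
ref = [0.5, 1.0, 1.5, 2.0]
est = [0.52, 1.01, 1.60, 2.0]
print(beat_f_measure(est, ref))  # 3 of 4 matched: P = R = F = 0.75
```

Under such a metric, even a handful of corrected or fine-tuned beats on a difficult piece translates directly into a measurable score improvement, which is the end-user scenario the paper targets.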


2016 · Vol 72 (5) · pp. 858-877
Author(s): Xiao Hu, Jin Ha Lee

Purpose: The purpose of this paper is to compare the music mood perceptions of people with diverse cultural backgrounds when they interact with Chinese music. It also discusses how the results can inform the design of global music digital libraries (MDL).

Design/methodology/approach: An online survey was designed, based on the Music Information Retrieval Evaluation eXchange (MIREX) five-cluster mood model, to solicit the mood perceptions of listeners in Hong Kong and the USA on a diverse set of Chinese music. Statistical analysis was applied to compare responses from the two user groups, taking into account different music types and listener characteristics. Listeners' textual responses were also analyzed with content coding.

Findings: Listeners from the two cultural groups made different mood judgments on all but one type of Chinese music. Hong Kong listeners reached higher levels of agreement on mood judgments than their US counterparts. Gender, age, and familiarity with the songs were related to listeners' mood judgments to some extent.

Practical implications: The MIREX five-cluster model may not be sufficient for representing the mood of Chinese music; refinements are suggested. MDL are recommended to differentiate tags given by users from different cultural groups, and to differentiate music types when classifying or recommending Chinese music by mood.

Originality/value: This is the first study on cross-cultural access to Chinese music in MDL. The methods and the refined mood model can be applied to cross-cultural access to other music types and information objects.
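One simple way to quantify the within-group agreement reported in the findings is the share of listeners selecting the modal (most frequent) mood cluster for a song. The sketch below is a hypothetical illustration, not the paper's actual statistical analysis; the cluster labels and the `modal_agreement` function are invented for the example.

```python
from collections import Counter

def modal_agreement(labels):
    """Fraction of raters who picked the most frequent label (mood cluster)."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Invented responses for one song over the five MIREX clusters C1..C5.
hk = ["C2", "C2", "C2", "C5", "C2"]   # hypothetical Hong Kong listeners
us = ["C1", "C2", "C4", "C2", "C5"]   # hypothetical US listeners
print(modal_agreement(hk))  # 0.8
print(modal_agreement(us))  # 0.4
```

A study of this kind would typically pair such a descriptive measure with a chance-corrected statistic (e.g. a kappa coefficient) before comparing groups.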

