Enhancement of Conventional Beat Tracking System Using Teager–Kaiser Energy Operator

2020 ◽  
Vol 10 (1) ◽  
pp. 379 ◽  
Author(s):  
Matej Istvanek ◽  
Zdenek Smekal ◽  
Lubomir Spurny ◽  
Jiri Mekyska

Beat detection systems are widely used in the music information retrieval (MIR) research field to compute tempo and beat time positions in audio signals. One of the most important parts of these systems is usually onset detection. There is an understandable tendency to employ the most accurate onset detector. However, it is possible to increase the global tempo (GT) accuracy and the detection accuracy of beat positions at the expense of less accurate onset detection. The aim of this study is to introduce an enhancement of a conventional beat detector. The enhancement is based on the Teager–Kaiser energy operator (TKEO), which pre-processes the input audio signal before the spectral flux calculation. The proposed approach is first evaluated in terms of its ability to estimate the GT and beat positions of given audio tracks, compared to the same conventional system without the proposed enhancement. The accuracy of the GT and average beat difference (ABD) estimation is tested on a manually labelled reference database. Finally, this system is used to analyse a string quartet music database. Results suggest that the TKEO lowers onset detection accuracy but improves GT and ABD estimation. The average deviation from the reference GT in the reference database is 9.99 BPM (11.28%), which improves on the conventional methodology, where the average deviation is 18.19 BPM (17.74%). This study has a pilot character and provides suggestions for improving beat tracking systems for music analysis.
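The TKEO pre-processing step is simple to state: for a discrete signal x[n], the operator is ψ[n] = x[n]² − x[n−1]·x[n+1]. A minimal NumPy sketch follows; the edge handling and the placement before the spectral-flux stage are illustrative assumptions, not the authors' exact pipeline:

```python
import numpy as np

def tkeo(x):
    """Discrete Teager-Kaiser energy operator:
    psi[n] = x[n]**2 - x[n-1]*x[n+1]."""
    x = np.asarray(x, dtype=float)
    psi = np.empty_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    psi[0], psi[-1] = psi[1], psi[-2]  # replicate values at the edges
    return psi

# For a pure tone x[n] = A*cos(w*n), psi is the constant A**2 * sin(w)**2,
# i.e. the operator tracks amplitude and frequency jointly, which is why
# it can sharpen energy-based onset features.
n = np.arange(1000)
psi = tkeo(0.5 * np.cos(0.1 * n))
```

In a beat tracker this output would replace the raw signal as input to the short-time spectral analysis.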

Author(s):  
Mina Mounir ◽  
Peter Karsmakers ◽  
Toon van Waterschoot

Abstract: If music is the language of the universe, musical note onsets may be the syllables of this language. Not only do note onsets define the temporal pattern of a musical piece, but their time-frequency characteristics also contain rich information about the identity of the musical instrument producing the notes. Note onset detection (NOD) is a basic component of many music information retrieval tasks and has attracted significant interest in audio signal processing research. In this paper, we propose an NOD method based on a novel feature coined Normalized Identification of Note Onset based on Spectral Sparsity (NINOS²). The NINOS² feature can be thought of as a spectral sparsity measure, aiming to exploit the difference in spectral sparsity between the different parts of a musical note. This spectral structure is revealed when focusing on low-magnitude spectral components that are traditionally filtered out when computing note onset features. We present an extensive set of NOD simulation results covering a wide range of instruments, playing styles, and mixing options. The proposed algorithm consistently outperforms the baseline Logarithmic Spectral Flux (LSF) feature for the most difficult group of instruments, the sustained-strings instruments. It also shows better performance in challenging scenarios including polyphonic music and vibrato performances.
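The idea of measuring spectral sparsity on low-magnitude bins can be sketched with a generic normalized ℓ2/ℓ4 ratio; the fraction of bins kept and the exact normalization below are illustrative assumptions and may differ from the paper's NINOS² definition:

```python
import numpy as np

def spectral_non_sparsity(frame, keep=0.94):
    """Sort the magnitude spectrum, keep only the lowest `keep` fraction
    of bins, and compute a normalized l2/l4 ratio: near 0 for a sparse
    (tonal, steady-state) frame, near 1 for a flat (noisy, transient) one.
    The `keep` value and normalization are illustrative choices."""
    mag = np.sort(np.abs(np.asarray(frame)))
    j = max(1, int(keep * mag.size))
    y = mag[:j]                       # discard the strongest bins
    l4 = np.sum(y ** 4) ** 0.25
    if l4 == 0.0:
        return 0.0
    return float(np.linalg.norm(y) / (j ** 0.25 * l4))
```

Applied frame by frame, peaks in this feature would mark the noisier attack portions of notes.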


2021 ◽  
Vol 11 (2) ◽  
pp. 851
Author(s):  
Wei-Liang Ou ◽  
Tzu-Ling Kuo ◽  
Chin-Chieh Chang ◽  
Chih-Peng Fan

In this study, a pupil tracking methodology based on deep-learning technology is developed for visible-light wearable eye trackers. By applying deep-learning object detection based on the You Only Look Once (YOLO) model, the proposed pupil tracking method can effectively estimate and predict the center of the pupil in the visible-light mode. Using the developed YOLOv3-tiny-based model to test the pupil tracking performance, the detection accuracy is as high as 80%, and the recall rate is close to 83%. In addition, the average visible-light pupil tracking errors of the proposed YOLO-based deep-learning design are smaller than 2 pixels in the training mode and 5 pixels in the cross-person test, which are much smaller than those of a previous ellipse-fitting design without deep-learning technology under the same visible-light conditions. After combination with the calibration process, the average gaze tracking errors of the proposed YOLOv3-tiny-based pupil tracking models are smaller than 2.9 and 3.5 degrees in the training and testing modes, respectively, and the proposed visible-light wearable gaze tracking system runs at up to 20 frames per second (FPS) on the GPU-based embedded software platform.
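Deriving a pupil center from a YOLO detection reduces to taking the center of the predicted bounding box and comparing it to the labelled center. A hedged sketch of that evaluation step; the (x, y, w, h) top-left box format and the pixel-error metric are assumptions, not the authors' exact code:

```python
import numpy as np

def bbox_center(box):
    """Center of a detection box given as (x, y, w, h) with (x, y) the
    top-left corner in pixels (box format is an assumption)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def mean_pixel_error(pred_boxes, gt_centers):
    """Average Euclidean distance (pixels) between predicted box centers
    and manually labelled pupil centers."""
    pred = np.array([bbox_center(b) for b in pred_boxes], dtype=float)
    gt = np.array(gt_centers, dtype=float)
    return float(np.mean(np.linalg.norm(pred - gt, axis=1)))
```

The paper's sub-2-pixel training error would correspond to this metric averaged over all test frames.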


2020 ◽  
Vol 103 (4) ◽  
pp. 003685042098121
Author(s):  
Ying Zhang ◽  
Hongchang Ding ◽  
Changfu Zhao ◽  
Yigen Zhou ◽  
Guohua Cao

In aircraft manufacturing, the perpendicularity of connection holes is an important indicator of hole-making quality, and aircraft products have strict requirements on the normal-direction accuracy of hole positions. When drilling and riveting are performed by an automatic robotic system, assembly errors, bumps, offsets and other adverse conditions can affect the accuracy of manufacturing and detection, and in turn the fatigue performance of the entire structure. To solve this problem, we proposed a normal-direction detection technology based on an adaptive alignment method, built a mathematical model for posture alignment, and studied the calibration method and mechanism of the detection device. Additionally, we investigated techniques for error compensation using an electronic theodolite and other devices when the adaptive method is used for detection. In verification experiments, multiple sets of results demonstrated the key technical indicators: normal-direction accuracy <0.5°, with an average deviation after correction of 0.0667°. This method can effectively compensate for the errors affecting hole making in automated manufacturing and further improve the positioning and normal-direction detection accuracy of the robot.


2020 ◽  
Vol 10 (3) ◽  
pp. 26
Author(s):  
Mattia Tambaro ◽  
Elia Arturo Vallicelli ◽  
Gerardo Saggese ◽  
Antonio Strollo ◽  
Andrea Baschirotto ◽  
...  

This work presents a comparison between different neural spike detection algorithms to find the optimum for in vivo implanted EOSFET (electrolyte–oxide–semiconductor field effect transistor) sensors. EOSFET arrays are planar sensors capable of sensing the electrical activity of nearby neuron populations in both in vitro cultures and in vivo experiments. They are characterized by high cell-like resolution and low invasiveness compared to probes with passive electrodes, but exhibit a higher noise power that requires ad hoc spike detection algorithms to detect relevant biological activity. Algorithms for implanted devices require good detection accuracy and low power consumption due to the limited power budget of implanted devices. A figure of merit (FoM) based on accuracy and resource consumption is presented and used to compare different algorithms from the literature, such as the smoothed nonlinear energy operator and correlation-based algorithms. A multi-transistor array (MTA) sensor of 7 honeycomb pixels with a 30 μm² area is simulated, generating a signal with Neurocube. This signal is then used to validate the algorithms' performance. The results allow us to numerically determine which algorithm is the most efficient under the power constraints of implantable devices and to characterize its performance in terms of accuracy and resource usage.
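The smoothed nonlinear energy operator used in the comparison is the Teager–Kaiser output convolved with a short window, followed by thresholding. The sketch below uses a Bartlett window and a mean-based threshold, both common choices rather than the paper's exact settings:

```python
import numpy as np

def sneo(x, win_len=9):
    """Smoothed nonlinear energy operator: Teager-Kaiser output
    psi[n] = x[n]**2 - x[n-1]*x[n+1], convolved with a normalized
    Bartlett window (window choice varies between implementations)."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    w = np.bartlett(win_len)
    return np.convolve(psi, w / w.sum(), mode="same")

def detect_spikes(x, factor=8.0):
    """Flag samples whose SNEO exceeds factor * mean(SNEO), a common
    data-driven threshold (the factor is an illustrative choice)."""
    e = sneo(x)
    return np.flatnonzero(e > factor * e.mean())

# Toy check: low-amplitude noise with one spike injected at sample 1000.
rng = np.random.default_rng(0)
sig = 0.01 * rng.standard_normal(2000)
sig[1000] += 1.0
idx = detect_spikes(sig)
```

Operators like this are attractive for implants because they need only a few multiplications and additions per sample.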


2021 ◽  
Vol 4 (4) ◽  
Author(s):  
Roman O. Yaroshenko

Visualisation systems are widespread as personal computer software. This article presents a system that processes audio data. The system visualizes the ratio of spectrum amplitudes and has a fixed frequency-to-colour binding. The technology of audio signal processing by the device and the components of the device are considered. To increase the information processing speed, a 32-bit controller and a graphic equalizer with seven passbands were used. Music visualization is a function widespread in media player software on different operating systems. This function shows animated images that depend on the music signal. Images are usually rendered in real time and synchronized with the played audio track. Music and visualization merge in different kinds of art: opera, ballet, music drama, and movies. Dependencies between auditory and visual sensations are used to increase the emotional perception of ordinary listeners. Systems that are currently being actively promoted use several tools for personal computers, such as After Effects (the Audio Spectrum effect), VSDC Video Editor Free (Audio Spectrum Visualizer), and Magic Music Visuals. The software mentioned above has one disadvantage: it cannot use streaming video with simultaneous receipt of audio and requires processing and rendering of the resulting video series. The purpose of this work is to determine the features of spectral analysis of music information, taking real-time data processing into account, and to propose a variant of a music information visualization system that displays the spectral composition of the music and the amplitude of individual harmonics, filling an LED matrix with the appropriate colour depending on the amplitude of the audio signal, with the possibility of wireless signal transmission from the music source to the visual-effects device.
A technology of frequency analysis of the spectrum, with estimation of the amplitudes of the spectral components of the musical data arriving at the device, was chosen for this project. The method is based on analysing the spectrum in selected frequency bands, which in turn simplifies the task of finding maxima at different frequencies. The proposed variant of the music information visualization system displays on the LED matrix colours that correspond to the frequencies of the spectral components of the musical composition. Moreover, the number of lit LEDs is proportional to the ratio of the amplitudes of the signal's frequency components. The desired result is achieved by using a Fast Fourier Transform and selecting Hann or Hamming windows to provide better analysis of the signal spectrum. The amplitudes of the individual components of the spectrum are additionally estimated, and each frequency band has its own colour. The system works by analysing the spectral components and frequencies of the musical information; this information drives the display of colours on the LED matrix. The use of a 32-bit microcontroller provides sufficient audio signal processing speed with minimal delays. To increase the accuracy and speed of the frequency analysis, the sound range is divided into seven bands using the seven-band graphic equalizer MSGEQ7. Music information is transmitted to the system via Bluetooth, which greatly simplifies the selection and connection of the music data source.
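The processing chain described above (windowed FFT, seven frequency bands, LED levels proportional to amplitude) can be sketched offline. The band edges below are geometric midpoints between the MSGEQ7 center frequencies, an illustrative split rather than the chip's actual filter shapes:

```python
import numpy as np

MSGEQ7_CENTERS_HZ = [63, 160, 400, 1000, 2500, 6250, 16000]

def band_levels(frame, fs=44100):
    """Hann-windowed FFT of one audio frame, reduced to seven band
    amplitudes; each band's level is the strongest bin inside it."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    mids = [float(np.sqrt(a * b)) for a, b in
            zip(MSGEQ7_CENTERS_HZ, MSGEQ7_CENTERS_HZ[1:])]
    edges = [0.0] + mids + [fs / 2.0]
    return np.array([np.max(spec[(freqs >= lo) & (freqs < hi)], initial=0.0)
                     for lo, hi in zip(edges, edges[1:])])

# A 1 kHz tone should light the fourth band (index 3).
n = np.arange(4096)
levels = band_levels(np.sin(2 * np.pi * 1000 * n / 44100))
```

On the device, each of the seven levels would then be scaled to the number of LEDs in that band's column.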


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Jingwen Zhang

With the rapid development of information and communication technology, digital music has grown explosively. For quickly and accurately retrieving the music that users want from a huge music repository, music feature extraction and classification are considered an important part of music information retrieval and have become a research hotspot in recent years. Traditional music classification approaches use a large number of hand-designed acoustic features. The design of such features requires knowledge and in-depth understanding of the music domain, and features designed for different classification tasks are often not universal or comprehensive. Existing approaches have two shortcomings: manually extracted features cannot guarantee validity and accuracy, and traditional machine learning classifiers do not perform well on multi-class problems and cannot be trained on large-scale data. Therefore, this paper converts the audio signal of music into a sound spectrum as a unified representation, avoiding the problem of manual feature selection. According to the characteristics of the sound spectrum, the research combines 1D convolution, a gating mechanism, residual connections, and an attention mechanism, and proposes a music feature extraction and classification model based on a convolutional neural network, which can extract sound spectrum characteristics more relevant to the music category. Finally, this paper designs comparison and ablation experiments. The experimental results show that this approach is better than traditional manual models and machine learning-based approaches.
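The gating mechanism combined with 1D convolution can be illustrated by a GLU-style unit, where one convolution's output is multiplied by the sigmoid of another's. This NumPy sketch shows the mechanism only, not the paper's full model; the filter shapes and the omission of the residual and attention parts are simplifications:

```python
import numpy as np

def conv1d(x, kernels):
    """Valid-mode 1-D convolution of each kernel over x (implemented as
    cross-correlation, as in convolutional layers)."""
    return np.array([np.convolve(x, k[::-1], mode="valid") for k in kernels])

def gated_conv_block(x, w_a, w_b):
    """GLU-style gating over 1-D convolutions:
    output = conv_a(x) * sigmoid(conv_b(x)); the gate lets the network
    learn which positions of the feature sequence to pass through."""
    a = conv1d(x, w_a)
    gate = 1.0 / (1.0 + np.exp(-conv1d(x, w_b)))
    return a * gate

rng = np.random.default_rng(0)
x = rng.standard_normal(32)          # one spectral feature sequence
w_a = rng.standard_normal((4, 3))    # 4 filters of length 3
w_b = rng.standard_normal((4, 3))
out = gated_conv_block(x, w_a, w_b)  # shape (4, 30)
```

In the full model such blocks would be stacked with residual connections and followed by attention pooling over time.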


Author(s):  
Anindita Suryarasmi ◽  
Reza Pulungan

Abstract: Musical notation is written documentation of a piece of music. Even though musical notation is commonly used, not every musician knows how to write it. This work offers automatic musical notation generation from an audio signal using onset detection. Duration and pitch of the notes are the two basic parameters that have to be known in order to generate musical notation. This work implements an onset detection method to recognize the playing pattern by measuring the interval between two notes. Using the interval, the duration of each note can be calculated and used to create note objects that are arranged into a musical notation. The output of the system is a musicXML-formatted file. This format keeps the output dynamic, so it can be edited with music editor software that supports the file format. The results of this work show high accuracy, up to 99.62%, in detecting each note and measuring its duration.
Keywords— musical notation, onset detection, musicXML
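Converting detected onsets into note durations, as described, amounts to dividing each inter-onset interval by the beat period and quantizing to a rhythmic grid before emitting musicXML note objects. The grid and rounding rule below are illustrative assumptions, not the paper's exact procedure:

```python
def durations_from_onsets(onsets, tempo_bpm):
    """Map detected onset times (seconds) to note durations in beats:
    divide each inter-onset interval by the beat period and snap it to
    the nearest value on a rhythmic grid."""
    beat = 60.0 / tempo_bpm
    grid = [4, 2, 1, 0.5, 0.25]  # whole note .. sixteenth, in beats
    durations = []
    for t0, t1 in zip(onsets, onsets[1:]):
        beats = (t1 - t0) / beat
        durations.append(min(grid, key=lambda g: abs(g - beats)))
    return durations
```

Each quantized duration would then become the `<duration>`/`<type>` pair of a musicXML `<note>` element.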


2020 ◽  
Vol 4 (2) ◽  
pp. 30-32
Author(s):  
Alberto Moreno

Given the current status of research and the practical needs of music audio processing, this paper argues that music element analysis technology is the key to this research field. On this basis, it proposes a new music processing framework, a music computation system, whose core objective is to intelligently and automatically identify the various elements of music information, analyze how that information is used in constructing music content, and develop intelligent retrieval methods. To achieve these core research objectives, the paper advocates closely integrating music theory with computational methods and promoting the combined use of music theory, cognitive psychology, cognitive science, neuroscience, artificial intelligence, and signal processing theory to solve the problem of music signal analysis and identification.


Author(s):  
Olexandr Mazur

In modern high-technology conditions, the entire process of creating, circulating and distributing the auditory information accumulated in phonograms acquires integration properties that determine the internal unity of the structure and dynamics of phonodocumental communication. The research, performed within the framework of Document Studies, Archive Studies and Musical Art, is dedicated to developing the sound recording classification problem in the communication area, from the position of selecting archival phonograms on music radio as a special type of service auditory documents with specific features: types of sound carriers, specifics of the fixed sound content, and specifics of the service functions. The publication develops the application of the informational approach to archival radio phonodocuments, which currently remain on the periphery of research interest even compared to other types of technotronic documents: pictorial, audiovisual, and electronic information sources, and different types of technical documentation. Phonodocumental communication has the features of a complicated open area. According to the author's idea, disconnected elements from the radio sector were organically collected together and integrated into a balanced phonodocumental communication system. The essence and purpose of such types of phonodocuments are clarified, and the main regularities of their formation and principles of their classification are formulated. Moreover, based on the fact that radio archives and some other cultural institutions serve as complex repositories of music information, this article also discusses music source preservation technologies not only in radio company archives but also in libraries, museums, etc. The research field is expanded through an advanced systematization of the advantages and disadvantages of revising the consumer format of archival radio phonograms.
The research approach that forms the basis of this publication helps to overcome some differences in modern scientific and educational literature. Keywords: Archive, Audio documents, Music, Radio, Service, Phonogram.


2019 ◽  
Author(s):  
Willy Cornelissen ◽  
Maurício Loureiro

A very significant task for music research is to estimate the instants when meaningful events begin (onset) and when they end (offset). Onset detection is widely applied in many fields: electrocardiograms, seismographic data, stock market results, and many Music Information Research (MIR) tasks, such as Automatic Music Transcription, Rhythm Detection, Speech Recognition, etc. Automatic Onset Detection (AOD) has recently received a huge contribution from Artificial Intelligence (AI) methods, mainly Machine Learning and Deep Learning. In this work, the use of Convolutional Neural Networks (CNN) is explored by adapting the original architecture in order to apply the approach to automatic onset detection on musical audio signals. We used a CNN for onset detection on a very general dataset, well acknowledged by the MIR community, and examined the accuracy of the method by comparison to the ground truth data published with the dataset. The results are promising and outperform other methods of musical onset detection.
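Accuracy against ground-truth onsets is conventionally measured by matching each detection to an unmatched reference onset within a small tolerance window and reporting precision, recall, and F-measure. A sketch of that evaluation; the 50 ms tolerance is a common MIR default, not necessarily what the authors used:

```python
def onset_f_measure(detected, reference, tol=0.05):
    """Match each detected onset (seconds) to the first unmatched
    reference onset within +/- tol seconds, then report
    (precision, recall, F-measure)."""
    ref = sorted(reference)
    used = [False] * len(ref)
    tp = 0
    for d in sorted(detected):
        for i, r in enumerate(ref):
            if not used[i] and abs(d - r) <= tol:
                used[i] = True
                tp += 1
                break
    p = tp / len(detected) if detected else 0.0
    r = tp / len(ref) if ref else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Comparing F-measures computed this way is how a CNN detector is scored against baseline methods on a shared dataset.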

