Incorporation of phase information for improved time-dependent instrument recognition

Abstract: Time-dependent estimation of playing instruments in music recordings is an important preprocessing step for several music signal processing algorithms. In this approach, instrument recognition is realized by neural networks with a two-dimensional input of short-time Fourier transform (STFT) magnitudes and a time-frequency representation based on phase information. The modified group delay (MODGD) function and the product spectrum (PS), which is based on MODGD, are analysed as phase representations. Training and evaluation are carried out on the MusicNet dataset. By incorporating the PS in the input, instrument recognition can be improved by about 2% in F1-score.
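
As an illustration of the phase-based input mentioned above, the following is a minimal sketch of a product spectrum computed per frame. It assumes the common definition in which the PS is the power spectrum multiplied by the group delay, which reduces to X_R·Y_R + X_I·Y_I, where X is the DFT of the frame x[n] and Y is the DFT of n·x[n]. The function names, the Hann window, and the frame and hop sizes are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def product_spectrum(frame):
    """Product spectrum of one frame: |X|^2 times group delay = X_R*Y_R + X_I*Y_I."""
    n = np.arange(len(frame))
    X = np.fft.rfft(frame)        # spectrum of x[n]
    Y = np.fft.rfft(n * frame)    # spectrum of n * x[n]
    return X.real * Y.real + X.imag * Y.imag

def product_spectrogram(x, n_fft=1024, hop=256):
    """Stack the product spectra of Hann-windowed frames (assumed framing)."""
    window = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    frames = [product_spectrum(window * x[s:s + n_fft]) for s in starts]
    return np.array(frames).T     # shape: (n_fft // 2 + 1, n_frames)
```

For a frame that contains a single unit impulse at sample d, the group delay is d at every frequency and the power spectrum is 1, so the product spectrum is constant and equal to d. This gives a quick sanity check of the implementation.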

Influence of input data representations for time-dependent instrument recognition

tm - Technisches Messen ◽

10.1515/teme-2020-0100 ◽

2021 ◽

Vol 88 (5) ◽

pp. 274-281

Author(s):

Markus Schwabe ◽

Michael Heizmann

Keyword(s):

Signal Processing ◽

Three Dimensional ◽

Time Dependent ◽

Frequency Error ◽

Additional Time ◽

Time Frequency ◽

Music Signal ◽

Frequency Representation ◽

Instrument Recognition ◽

Signal Processing Algorithms

Abstract: An important preprocessing step for several music signal processing algorithms is the estimation of playing instruments in music recordings. To this end, time-dependent instrument recognition is realized in this approach by a neural network with residual blocks. Since music signal processing tasks use diverse time-frequency representations as input matrices, the influence of different input representations on instrument recognition is analyzed in this work. Three-dimensional inputs consisting of short-time Fourier transform (STFT) magnitudes and an additional time-frequency representation based on phase information are investigated, as well as two-dimensional STFT or constant-Q transform (CQT) magnitudes. As additional phase representations, the product spectrum (PS), based on the modified group delay, and the frequency error (FE) matrix, related to the instantaneous frequency, are used. Training and evaluation are carried out on the MusicNet dataset, which enables the estimation of seven instruments. With a higher number of frequency bins in the input representations, instrument recognition can be improved by about 2 % in F1-score. Compared to the literature, frame-level instrument recognition can be improved for different input representations.
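
The three-dimensional input described above can be sketched as a magnitude channel stacked with a phase-derived channel. The abstract does not define the FE matrix, so the sketch below assumes one plausible reading: the deviation of the instantaneous frequency (estimated from the phase advance between consecutive frames) from the centre frequency of each bin. The function names, the Hann window, and the frame and hop sizes are hypothetical.

```python
import numpy as np

def stft_frames(x, n_fft=512, hop=128):
    """Complex STFT with a Hann window, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    starts = range(0, len(x) - n_fft + 1, hop)
    return np.array([np.fft.rfft(window * x[s:s + n_fft]) for s in starts]).T

def stacked_input(x, fs, n_fft=512, hop=128):
    """Return an array of shape (2, n_bins, n_frames - 1): magnitude and frequency error."""
    S = stft_frames(x, n_fft, hop)
    mag = np.abs(S)
    phase = np.angle(S)
    k = np.arange(S.shape[0])[:, None]
    # Phase advance expected per hop for a sinusoid exactly at the bin centre.
    expected = 2 * np.pi * k * hop / n_fft
    # Wrap the deviation from the expected advance to (-pi, pi].
    deviation = np.angle(np.exp(1j * (np.diff(phase, axis=1) - expected)))
    # Convert the phase deviation to a frequency offset in Hz (assumed FE definition).
    freq_error = deviation * fs / (2 * np.pi * hop)
    # The first frame has no predecessor, so it is dropped from the magnitude channel.
    return np.stack([mag[:, 1:], freq_error])
```

For a 1003 Hz sinusoid sampled at 8 kHz with these settings, the nearest bin centre is 1000 Hz (bin 64), and the second channel at that bin should be close to 3 Hz.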

Strain Sensitivity Enhancement of Broadband Ultrasonic Signals in Plates Using Spectral Phase Filtering

Applied Sciences ◽

10.3390/app11062582 ◽

2021 ◽

Vol 11 (6) ◽

pp. 2582

Author(s):

Lucas M. Martinho ◽

Alan C. Kubrusly ◽

Nicolás Pérez ◽

Jean Pierre von der Weid

Keyword(s):

Fourier Transform ◽

Guided Waves ◽

Strain Level ◽

Energy Concentration ◽

Short Time Fourier Transform ◽

Time Frequency ◽

Frequency Representation ◽

The Fourier Transform ◽

Short Time ◽

Sensitivity Gain

The focused signal obtained by the time-reversal or the cross-correlation techniques of ultrasonic guided waves in plates changes when the medium is subject to strain, which can be used to monitor the medium strain level. In this paper, the sensitivity to strain of cross-correlated signals is enhanced by a post-processing filtering procedure that aims to preserve only the strain-sensitive spectral components. Two different strategies were adopted, based on the phase of either the Fourier transform or the short-time Fourier transform. Both use prior knowledge of the system impulse response at some strain level. The technique was evaluated on an aluminum plate, where it provided up to twice the sensitivity to strain. The sensitivity increase depends on a phase threshold parameter used in the filtering process. The technique's performance was assessed based on the sensitivity gain, the loss of energy concentration capability, and the value of the foreknown strain. Signals synthesized with the time–frequency representation, through the short-time Fourier transform, provided a better tradeoff between sensitivity gain and loss of energy concentration.
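
The abstract does not give the filter itself, so the following is only a rough sketch of what a phase-threshold spectral filter could look like. It assumes that a frequency bin counts as strain-sensitive when the phase of the impulse response at a known strain differs from the phase of an unstrained reference by more than a threshold. The function names and this selection rule are assumptions, not the authors' procedure.

```python
import numpy as np

def phase_threshold_mask(h_ref, h_strained, threshold):
    """Boolean mask of bins whose phase changes by more than `threshold` radians."""
    H0 = np.fft.rfft(h_ref)
    H1 = np.fft.rfft(h_strained)
    # Phase difference per bin, wrapped to (-pi, pi].
    dphi = np.angle(H1 * np.conj(H0))
    return np.abs(dphi) > threshold

def phase_filter(signal, mask):
    """Keep only the masked spectral components and return the time-domain signal."""
    X = np.fft.rfft(signal)
    return np.fft.irfft(X * mask, n=len(signal))
```

A larger threshold keeps fewer bins. Under this assumed rule, that mirrors the tradeoff named in the abstract: more selectivity for strain-sensitive components at the cost of signal energy.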

A Modified S-Transform for EEG Signals Analysis

International Journal of Computational Methods ◽

10.1142/s0219876215500218 ◽

2015 ◽

Vol 12 (03) ◽

pp. 1550021 ◽

Cited By ~ 2

Author(s):

M. A. Al-Manie ◽

W. J. Wang

Keyword(s):

Image Processing ◽

Fourier Transform ◽

Energy Concentration ◽

Biomedical Data ◽

Short Time Fourier Transform ◽

Phase Information ◽

Eeg Signals ◽

Time Frequency ◽

S Transform ◽

Short Time

Due to the advantages offered by the S-transform (ST) distribution, it has recently been successfully applied in various fields such as seismic and image processing. The desirable properties of the ST include a globally referenced phase, as is the case with the short-time Fourier transform (STFT), while offering a higher spectral resolution, like the wavelet transform (WT). However, this estimator suffers from some inherent disadvantages, such as poor energy concentration at higher frequencies. In order to improve the performance of the distribution, a modification to the existing technique is proposed. Additional parameters are introduced to control the window's width, which can greatly enhance the signal representation in the time–frequency plane. The new estimator's performance is evaluated using synthetic signals as well as biomedical data. The required features of the ST, which include invertibility and phase information, are still preserved.
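
A minimal sketch of a discrete S-transform with an adjustable window width is given below. It follows the standard frequency-domain form of the ST, and the single `width` factor that scales the Gaussian window is a hypothetical stand-in for the additional parameters mentioned in the abstract, whose exact form is not given there.

```python
import numpy as np

def s_transform(x, width=1.0):
    """Discrete S-transform, shape (N // 2 + 1, N); `width` scales the Gaussian window."""
    N = len(x)
    X = np.fft.fft(x)
    # Integer frequency offsets wrapped to [-N/2, N/2).
    m = np.fft.fftfreq(N, d=1.0 / N)
    S = np.zeros((N // 2 + 1, N), dtype=complex)
    S[0, :] = np.mean(x)  # zero-frequency row is the signal mean
    for n in range(1, N // 2 + 1):
        # Fourier transform of a Gaussian whose time-domain width is width / n.
        gauss = np.exp(-2 * np.pi ** 2 * (width * m) ** 2 / n ** 2)
        # np.roll(X, -n)[m] equals X[m + n].
        S[n, :] = np.fft.ifft(np.roll(X, -n) * gauss)
    return S
```

Because the Gaussian equals 1 at zero offset for any `width`, summing each row over time returns the corresponding Fourier coefficient. This is the invertibility property that the abstract says is preserved.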

Evaluation of potential auras in generalized epilepsy from EEG signals using deep convolutional neural networks and time-frequency representation

Biomedical Engineering / Biomedizinische Technik ◽

10.1515/bmt-2019-0098 ◽

2020 ◽

Vol 65 (4) ◽

pp. 379-391 ◽

Cited By ~ 2

Author(s):

Hasan Polat ◽

Mehmet Ufuk Aluçlu ◽

Mehmet Siraç Özerdem

Keyword(s):

Eeg Signals ◽

Deep Convolutional Neural Networks ◽

One Dimensional ◽

Time Frequency ◽

Specificity And Sensitivity ◽

Frequency Representation ◽

Proposed Model ◽

Generalized Epilepsy ◽

Short Time

Abstract: The general uncertainty of epilepsy and its unpredictable seizures often badly affect the quality of life of people with this disease. Some patients can be considered fortunate in that their seizures can be predicted. These are patients with epileptic auras. The aim of this study was to evaluate pre-seizure warning symptoms in electroencephalography (EEG) signals using a convolutional neural network (CNN), inspired by the epileptic auras defined in the medical field. In this context, one-dimensional EEG signals were transformed into spectrograms in the time-frequency domain by applying a short-time Fourier transform (STFT). Systematic changes before epileptic seizures were characterized by applying the CNN approach to the EEG signals represented in image form, and an attempt was made to determine the subjective EEG-Aura process for each patient. Considering all patients included in the evaluation, it was determined that the 1-min interval covering the time from the second minute to the third minute before the seizure had the highest mean and the lowest variance for determining the systematic changes before the seizure. Thus, the highest performing process is described as EEG-Aura. The average success for the EEG-Aura process was 90.38 ± 6.28%, 89.78 ± 8.34% and 90.47 ± 5.95% for accuracy, specificity and sensitivity, respectively. Through the proposed model, epilepsy patients who do not respond to medical treatment methods are expected to maintain their lives in a more comfortable and integrated way.

A new time-frequency representation for music signal analysis: Resonator time-frequency image

10.1109/isspa.2007.4555594 ◽

2007 ◽

Cited By ~ 4

Author(s):

Ruohua Zhou ◽

Marco Mattavelli

Keyword(s):

Signal Analysis ◽

Time Frequency ◽

Music Signal ◽

Frequency Representation ◽

New Time

A review of time–frequency representations, with application to sound/music analysis–resynthesis

Organised Sound ◽

10.1017/s1355771898009042 ◽

1997 ◽

Vol 2 (3) ◽

pp. 193-205 ◽

Cited By ~ 3

Author(s):

PAUL MASRI ◽

ANDREW BATEMAN ◽

NISHAN CANAGARAJAH

Keyword(s):

Music Analysis ◽

Intermediate Representation ◽

Short Time Fourier Transform ◽

Time Frequency ◽

Time Domain Signal ◽

Analysis Process ◽

Frequency Representation ◽

The Time Domain ◽

Higher Order Spectra ◽

Short Time

Analysis–resynthesis (A–R) systems gain their flexibility for creative transformation of sound by representing sound as a set of musically useful features. The analysis process extracts these features from the time domain signal by means of a time–frequency representation (TFR). The TFR provides an intermediate representation of sound that must make the features accessible and measurable to the rest of the analysis. Until very recently, the short-time Fourier transform (STFT) has been the obvious choice for time–frequency representation, despite its limitations in terms of resolution. Recent and ongoing developments are providing several alternative schemes that allow for a more considered choice of TFR. This paper reviews these contemporary approaches in comparison with the more classical ones and with reference to their applicability, merits and shortcomings for application to sound analysis. (Where they have been successfully applied, details are provided.) The techniques reviewed include linear, bilinear and higher-order spectra, nonparametric and parametric methods and some sound-model-specific TFRs.

Time-frequency Analysis of Multicomponent LFM signal based on Hough and Chirplet Transform

MATEC Web of Conferences ◽

10.1051/matecconf/201817303054 ◽

2018 ◽

Vol 173 ◽

pp. 03054

Author(s):

Xueqin Zhang ◽

Ruolun Liu

Keyword(s):

Fourier Transform ◽

Line Detection ◽

Short Time Fourier Transform ◽

Time Frequency ◽

Frequency Representation ◽

Chirplet Transform ◽

Frequency Map ◽

Linear Frequency Modulated Signal ◽

Short Time

The Chirplet Transform (CT) is effective in characterizing the instantaneous frequency (IF) of a mono-component linear-frequency-modulated signal. However, during the initialization process, fitting the line to the peaks of the time-frequency map of the short-time Fourier transform is greatly affected by noise. For multi-component signals, it is more difficult to distinguish and fit the different IF lines. Since the Hough transform is a common and effective algorithm for line detection, the ridge edge fitting is replaced by the Hough transform in this paper. The experimental results show a significant improvement in the obtained time-frequency representation.

Applications of an Improved Time-Frequency Filtering Algorithm to Signal Reconstruction

Mathematical Problems in Engineering ◽

10.1155/2017/1805091 ◽

2017 ◽

Vol 2017 ◽

pp. 1-14 ◽

Cited By ~ 1

Author(s):

Junbo Long ◽

Haibin Wang ◽

Daifeng Zha ◽

Hongshe Fan ◽

Zefeng Lao ◽

...

Keyword(s):

Fourier Transform ◽

Stable Distribution ◽

Signal Reconstruction ◽

Order Moment ◽

Frequency Filtering ◽

Short Time Fourier Transform ◽

Time Frequency ◽

Noise Environment ◽

Frequency Representation ◽

Short Time

The short-time Fourier transform time-frequency representation (STFT-TFR) method degenerates, and the corresponding short-time Fourier transform time-frequency filtering (STFT-TFF) method fails, in an α-stable distribution noise environment. A fractional low-order short-time Fourier transform (FLOSTFT), which takes advantage of the fractional p-order moment, is proposed for α-stable distribution noise environments, and the corresponding FLOSTFT time-frequency representation (FLOSTFT-TFR) algorithm is presented in this paper. We study the vector formulation of the FLOSTFT and inverse FLOSTFT (IFLOSTFT) methods and propose a FLOSTFT time-frequency filtering (FLOSTFT-TFF) method, which takes advantage of the time-frequency localized spectra of the signal in the time-frequency domain. The simulation results show that, by employing the FLOSTFT-TFR method and the FLOSTFT-TFF method with an adaptive weight function, the time-frequency distribution of the signals can be obtained more accurately, the time-frequency localized region of the signal can be effectively extracted from α-stable distribution noise, and the original signal can be restored with the IFLOSTFT method. Their performance is better than that of the STFT-TFR and STFT-TFF methods, and the MSEs are smaller for different α and GSNR cases. Finally, we apply the FLOSTFT-TFR and FLOSTFT-TFF methods to extract fault features of a bearing outer race fault signal and to restore the original fault signal from α-stable distribution noise; the experimental results illustrate their performance.
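
The idea of a fractional low-order STFT can be sketched as follows, assuming the fractional low-order operation commonly used for real signals, |x|^p · sign(x), applied before an ordinary STFT. The abstract does not state the exact operator, window, or value of p, so those choices and the function names are assumptions for illustration only.

```python
import numpy as np

def fractional_low_order(x, p):
    """Assumed fractional low-order operation for real signals: |x|**p * sign(x)."""
    x = np.asarray(x, dtype=float)
    return np.abs(x) ** p * np.sign(x)

def flostft(x, p=0.3, n_fft=256, hop=64):
    """STFT of the fractional low-order signal, shape (n_fft // 2 + 1, n_frames)."""
    y = fractional_low_order(x, p)
    window = np.hanning(n_fft)
    starts = range(0, len(y) - n_fft + 1, hop)
    return np.array([np.fft.rfft(window * y[s:s + n_fft]) for s in starts]).T
```

With p < 1 the operation compresses large impulsive samples much more strongly than small ones. As a result, an isolated outlier of the kind produced by heavy-tailed noise contributes far less to each frame's spectrum than it would in a plain STFT.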

Enhancement of Maternal ECG Using Short-Term Fourier Transform for Foetal Electrocardiogram Extraction

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-2312 ◽

2021 ◽

pp. 84-91

Author(s):

Prof. M. Senthil Vadivu ◽

Saranya H ◽

Vijay Kumar K S

Keyword(s):

Fourier Transform ◽

Wave Structure ◽

Well Being ◽

Short Term ◽

Morphological Examination ◽

Time Frequency ◽

Non Invasive ◽

Frequency Representation ◽

Short Time

The objective of the project is to improve maternal abdominal recordings for better prediction of the foetal electrocardiogram (FECG). One of the most difficult tasks in observing foetal well-being is obtaining a clean FECG from non-invasive abdominal recordings. The signal contains precise information that can help doctors monitor foetal health during pregnancy and labour. Its low signal quality, however, makes morphological examination of its wave structure in clinical follow-up difficult. In the pre-processing stage, the abdominal signal is normalized and separated for wave shape analysis. The Kaiser window is used for spectral analysis and for segmenting the signal. The two-dimensional (2D) time-frequency representation is obtained by the short-time Fourier transform (STFT). The STFT enhances the maternal electrocardiogram (MECG) in the abdominal recordings so that the FECG can be separated efficiently to monitor the foetus's well-being.
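
The pre-processing and Kaiser-windowed STFT described above can be sketched as follows. The normalization to zero mean and unit variance, the Kaiser shape parameter `beta`, and the frame and hop sizes are assumptions, since the abstract does not specify them.

```python
import numpy as np

def normalize(x):
    """Zero-mean, unit-variance normalization (assumed form of the pre-processing)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def kaiser_stft(x, fs, n_fft=256, hop=64, beta=8.0):
    """STFT with a Kaiser window; returns frequencies (Hz), frame times (s), spectra."""
    window = np.kaiser(n_fft, beta)
    starts = np.arange(0, len(x) - n_fft + 1, hop)
    spectra = np.array([np.fft.rfft(window * x[s:s + n_fft]) for s in starts]).T
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    times = (starts + n_fft / 2) / fs   # centre of each frame
    return freqs, times, spectra
```

A larger `beta` widens the main lobe of the window and lowers its side lobes, so it trades frequency resolution for less spectral leakage between neighbouring components.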

Sampled and discretized of short-time Fourier transform and non-negative matrix factorization: the single-channel source separation case

Jurnal Teknologi dan Sistem Komputer ◽

10.14710/jtsiskom.2020.13858 ◽

2020 ◽

Vol 9 (1) ◽

pp. 41-48

Author(s):

Jans Hendry ◽

Isnan Nur Rifai ◽

Yoga Mileniandi

Keyword(s):

Fourier Transform ◽

Matrix Factorization ◽

Single Channel ◽

Source Separation ◽

Short Time Fourier Transform ◽

Time Frequency ◽

Frequency Representation ◽

Short Time ◽

Better Than ◽

Non Negative Matrix Factorization

The short-time Fourier transform (STFT) is a popular time-frequency representation in many source separation problems. In this work, the sampled and discretized version of the Discrete Gabor Transform (DGT) is proposed to replace the STFT in single-channel source separation within the Non-negative Matrix Factorization (NMF) framework. The results show that NMF-DGT is better than NMF-STFT according to the Signal-to-Interference Ratio (SIR), Signal-to-Artifact Ratio (SAR), and Signal-to-Distortion Ratio (SDR). In the supervised scheme, NMF-DGT has a SIR of 18.60 dB compared to 16.24 dB for NMF-STFT, a SAR of 13.77 dB compared to 13.69 dB, and an SDR of 12.45 dB compared to 11.16 dB. In the unsupervised scheme, NMF-DGT has a SIR of 0.40 dB compared to 0.27 dB for NMF-STFT, a SAR of -10.21 dB compared to -10.36 dB, and an SDR of -15.01 dB compared to -15.23 dB.
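
The NMF step that both variants share can be sketched with the classic multiplicative updates for the Euclidean cost, applied to a non-negative magnitude matrix V (for example |STFT| or |DGT|). The abstract does not state the cost function or the update rule used in the paper, so this choice, the function name, and all parameters are assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor V (n_freq x n_frames, non-negative) as W @ H with multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps    # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # time-varying activations
    errors = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))
    return W, H, errors
```

In a separation setting, each source is typically reconstructed from the subset of columns of W and rows of H assigned to it. In a supervised scheme the basis vectors would be learned from isolated recordings of each source beforehand.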
