Classification and Separation of Audio and Music Signals

2020 ◽  
Author(s):  
Abdullah I. Al-Shoshan

This chapter addresses the classification and separation of audio and music signals, an important and challenging research area. Classifying a stream of sounds matters when building two different libraries: a speech library and a music library. Separation, in turn, is sometimes needed in a cocktail-party problem to separate speech from music and remove the undesired one. In this chapter, some existing algorithms for the classification process and the separation process are presented and discussed thoroughly. The classification algorithms are divided into three categories. The first category includes most of the time-domain approaches. The second category includes most of the frequency-domain approaches. The third category introduces some of the approaches based on the time-frequency distribution. The time-domain approaches discussed in this chapter are the short-time energy (STE), the zero-crossing rate (ZCR), a modified version of the ZCR and the STE with positive derivative, the neural networks, and the roll-off variance. The frequency-domain approaches are specifically the roll-off of the spectrum, the spectral centroid and the variance of the spectral centroid, the spectral flux and the variance of the spectral flux, the cepstral residual, and the delta pitch. The time-frequency domain approaches have not yet been tested thoroughly in the classification and separation of audio and music signals. Therefore, the spectrogram and the evolutionary spectrum are introduced and discussed. In addition, some algorithms for the separation and segregation of music and audio signals, such as independent component analysis, pitch cancellation, and artificial neural networks, are introduced.
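As a rough illustration of the two simplest time-domain features named in this abstract, the sketch below computes the short-time energy and the zero-crossing rate frame by frame. It uses the textbook definitions, not necessarily the exact variants used in the chapter. The frame length, hop size, sampling rate, and test tones are all assumptions chosen for the example.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence of samples into overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical test signals (assumed for illustration): two pure tones sampled at 8 kHz.
fs = 8000
low = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
high = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(800)]

for name, sig in (("100 Hz", low), ("1000 Hz", high)):
    fr = frames(sig, frame_len=400, hop=200)
    ste = [short_time_energy(f) for f in fr]
    zcr = [zero_crossing_rate(f) for f in fr]
    print(name, "mean STE:", sum(ste) / len(ste), "mean ZCR:", sum(zcr) / len(zcr))
```

The higher-frequency tone yields the larger zero-crossing rate. A classifier would normally look at how such features vary across frames, not at a single value.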

2004 ◽  
Vol 213 ◽  
pp. 483-486
Author(s):  
David Brodrick ◽  
Douglas Taylor ◽  
Joachim Diederich

A recurrent neural network was trained to detect the time-frequency domain signature of narrowband radio signals against a background of astronomical noise. The objective was to investigate the use of recurrent networks for signal detection in the Search for Extra-Terrestrial Intelligence, though the problem is closely analogous to the detection of some classes of Radio Frequency Interference in radio astronomy.


2020 ◽  
Vol 10 (11) ◽  
pp. 2764-2767
Author(s):  
Chuanbin Ge ◽  
Di Liu ◽  
Juan Liu ◽  
Bingshuai Liu ◽  
Yi Xin

Arrhythmia is a group of conditions in which the heartbeat is irregular. There are many types of arrhythmia, and some can be life-threatening. The electrocardiogram (ECG) is an effective clinical tool used to diagnose arrhythmia. Automatic recognition of different arrhythmia types in ECG signals has become an important and challenging issue. In this article, we propose an algorithm to detect arrhythmia in 12-lead ECG signals and classify the signals into 9 categories. Two 19-layer deep neural networks combining a convolutional neural network and a gated recurrent unit are proposed to realize this work. The first one was trained directly with the raw 12-lead ECG data, while the other one was trained with 18-"lead" ECG data, where the six extra leads, containing morphology information in the fractional time–frequency domain, were generated using the fractional Fourier transform (FRFT). Overall detection results were obtained by fusing the outputs of these two networks, and the final classification results on the testing dataset show that our proposed algorithm obtained an F1 score of 0.855. Furthermore, with our proposed algorithm, a better F1 score of 0.81 was attained using the training dataset provided by the China Physiological Signal Challenge held in 2018.
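The abstract does not say how the two network outputs are fused or how the F1 score is averaged over the 9 categories. The sketch below shows one plausible reading only: it averages the two per-class probability vectors (an assumption) and scores the predictions with a macro-averaged F1 (also an assumption). The function names and the toy labels are hypothetical.

```python
def fuse(probs_a, probs_b):
    """Average two per-class probability vectors and return the winning class index.
    Averaging is an assumed fusion rule, not the one confirmed by the paper."""
    avg = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    return max(range(len(avg)), key=avg.__getitem__)

def f1_per_class(y_true, y_pred, cls):
    """F1 = 2*TP / (2*TP + FP + FN) for a single class; 0.0 if the class never occurs."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_per_class(y_true, y_pred, c) for c in range(n_classes)) / n_classes

# Toy two-class example (made-up labels, not data from the paper).
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
print(fuse([0.6, 0.4], [0.1, 0.9]))      # class chosen after averaging
print(macro_f1(y_true, y_pred, n_classes=2))
```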


Author(s):  
Wentao Xie ◽  
Qian Zhang ◽  
Jin Zhang

Smart eyewear (e.g., AR glasses) is considered to be the next big breakthrough for wearable devices. Interaction with state-of-the-art smart eyewear mostly relies on the touchpad, which is obtrusive and not user-friendly. In this work, we propose a novel acoustic-based upper facial action (UFA) recognition system that serves as a hands-free interaction mechanism for smart eyewear. The proposed system is a glass-mounted acoustic sensing system with several pairs of commercial speakers and microphones to sense UFAs. There are two main challenges in designing the system. The first challenge is that the system operates in a severe multipath environment, and the received signal can be strongly attenuated by frequency-selective fading, which degrades the system's performance. To overcome this challenge, we design an Orthogonal Frequency Division Multiplexing (OFDM)-based channel state information (CSI) estimation scheme that is able to measure the phase changes caused by a facial action while mitigating the frequency-selective fading. The second challenge is that, because the skin deformation caused by a facial action is tiny, the received signal has very small variations. Thus, it is hard to derive useful information directly from the received signal. To resolve this challenge, we apply a time-frequency analysis to derive the time-frequency domain signal from the CSI. We show that the derived time-frequency domain signal contains distinct patterns for different UFAs. Furthermore, we design a Convolutional Neural Network (CNN) to extract high-level features from the time-frequency patterns and classify the features into six UFAs, namely, cheek-raiser, brow-raiser, brow-lower, wink, blink, and neutral. We evaluate the performance of our system through experiments on data collected from 26 subjects. The experimental results show that our system can recognize the six UFAs with an average F1-score of 0.92.
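The abstract does not name the time-frequency analysis it applies to the CSI. As one common option, the sketch below computes a short-time Fourier transform magnitude (a spectrogram) with a Hann window and a naive DFT. The real-valued test signal, the frame length, and the hop size are assumptions made for illustration and do not reflect the paper's actual CSI data or parameters.

```python
import cmath
import math

def stft_magnitude(signal, frame_len, hop):
    """Return one list of DFT magnitudes (bins 0..frame_len//2) per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    spec = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive O(N^2) DFT; fine for a sketch, an FFT would be used in practice.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        spec.append(mags)
    return spec

# Hypothetical input: a tone that jumps from DFT bin 4 to bin 12 halfway through.
N = 64
sig = ([math.cos(2 * math.pi * 4 * n / N) for n in range(256)] +
       [math.cos(2 * math.pi * 12 * n / N) for n in range(256)])
spec = stft_magnitude(sig, frame_len=N, hop=32)

# Strongest bin in each frame: it should move from the first tone's bin to the second's.
peaks = [max(range(len(m)), key=m.__getitem__) for m in spec]
print(peaks)
```

A two-dimensional array of this kind (time frames by frequency bins) is the sort of input a CNN can treat as an image.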

