Co-occurrence Based Approach for Differentiation of Speech and Song

Author(s):  
Arijit Ghosal ◽  
Ranjit Ghoshal

Discriminating speech from song in an audio signal is an active research topic. Earlier work mainly addressed the discrimination of speech from non-speech; comparatively few efforts have addressed speech versus song. Speech/song discrimination is an important part of automatic audio classification because it is considered the fundamental step in hierarchical approaches to genre identification and audio archive generation. Previous speech/song discrimination efforts have relied on frequency-domain and perceptual-domain audio features. This work proposes an acoustic feature set that is low-dimensional and easy to compute. The energy levels of speech and song signals are observed to differ considerably because a speech signal has no instrumental background. Short Time Energy (STE) is the acoustic feature that best reflects this difference. To study the energy variation precisely, a co-occurrence matrix of the STE is generated and statistical features are extracted from it. For classification, several well-known supervised classifiers are used. The performance of the proposed feature set is compared with that of other approaches to demonstrate its advantage.
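The feature pipeline described above (frame-wise STE, then a co-occurrence matrix of STE values, then statistics of that matrix) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the frame length, the number of quantization levels, the co-occurrence distance, or which statistics are used, so `frame_len`, `hop`, `levels`, `distance` and the Haralick-style statistics below are all hypothetical choices.

```python
import math

def short_time_energy(signal, frame_len, hop):
    """Mean squared amplitude of each frame (one common STE definition)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def quantize(values, levels):
    """Map values linearly onto integer bins 0..levels-1 (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    scale = (levels - 1) / (hi - lo)
    return [int(round((v - lo) * scale)) for v in values]

def cooccurrence_matrix(symbols, levels, distance=1):
    """Normalised counts of (symbols[i], symbols[i + distance]) pairs."""
    counts = [[0] * levels for _ in range(levels)]
    for a, b in zip(symbols, symbols[distance:]):
        counts[a][b] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return counts
    return [[c / total for c in row] for row in counts]

def cooccurrence_stats(p):
    """Illustrative statistics of a normalised co-occurrence matrix."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }
```

A signal whose energy alternates between frames yields a matrix concentrated off the diagonal (high contrast), while a signal with steady energy concentrates it on the diagonal; the resulting statistics would then be passed to a supervised classifier.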

2018 ◽  
Vol 21 (3) ◽  
pp. 571-579 ◽  
Author(s):  
Leena Mary ◽  
Anil P. Antony ◽  
Ben P. Babu ◽  
S. R. Mahadeva Prasanna

2021 ◽  
Vol 3 (1) ◽  
pp. 11-14
Author(s):  
Arda Şahin ◽  
Mehmet Zübeyir Ünlü

The main objective of this study is to eliminate the noise component of a speech signal and to compress the signal by storing the locations and durations of its silence regions. Voiced, unvoiced, and silence regions are separated using the Short Time Energy (STE) and Zero Crossing Rate (ZCR) methods. All operations in this study were performed using a User Interface (UI) developed in MATLAB®. These operations include recording voice, playing the recording, eliminating the unwanted regions, playing the modified recording, saving the original and compressed files, and loading the compressed recording.
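The STE/ZCR separation and the silence bookkeeping described above might look roughly like the sketch below. The study itself uses MATLAB; Python is used here only for illustration, and the decision rule and the thresholds `energy_floor` and `zcr_split` are hypothetical, since the abstract does not state them.

```python
def frame_features(frame):
    """Return (short time energy, zero crossing rate) for one frame."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(frame) - 1)
    return energy, zcr

def classify_frame(frame, energy_floor=1e-4, zcr_split=0.25):
    """Label a frame; both thresholds are assumed values for illustration."""
    energy, zcr = frame_features(frame)
    if energy < energy_floor:
        return "silence"
    # Noise-like unvoiced sounds cross zero often; voiced sounds rarely do.
    return "unvoiced" if zcr > zcr_split else "voiced"

def silence_regions(labels):
    """Return (start_frame, n_frames) for each run of silence frames."""
    regions = []
    start = None
    for i, label in enumerate(labels):
        if label == "silence":
            if start is None:
                start = i
        elif start is not None:
            regions.append((start, i - start))
            start = None
    if start is not None:
        regions.append((start, len(labels) - start))
    return regions
```

Storing only the `(start_frame, n_frames)` pairs in place of the silent samples is what makes the compressed file smaller; the silence can be re-inserted as zeros on loading.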


2021 ◽  
Vol 11 (14) ◽  
pp. 6288
Author(s):  
Hang Su ◽  
Chang-Myung Lee

The generalized sidelobe canceller (GSC) method is a common algorithm for enhancing audio signals using a microphone array. Distortion of the enhanced audio signal consists of two parts: the residual acoustic noise and the distortion of the desired audio signal, meaning that the desired audio signal itself is damaged. This paper proposes a modified GSC method to reduce both kinds of distortion when the desired audio signal is a non-stationary speech signal. First, the cross-correlation coefficient between the canceling signal and the error signal of the least mean square (LMS) algorithm was added to the adaptive process of the GSC method to reduce the distortion of the enhanced signal when the energy of the desired signal frame increased suddenly. The sidelobe pattern of beamforming was then used to estimate the noise signal in the beamforming output signal of the GSC method. The noise component of the beamforming output signal was decreased by subtracting the estimated noise signal, improving the denoising performance of the GSC method. Finally, the GSC-SN-MCC method was proposed by merging the above two methods. An experiment was performed in an anechoic chamber to validate the proposed method under various SNR conditions. Furthermore, a simulated calculation with inaccurate noise directions was conducted on the experimental data to examine the robustness of the proposed method to errors in the estimated noise direction. The experimental data and calculation results indicated that the proposed method could reduce the distortion effectively under various SNR conditions and would not cause more distortion if the estimated noise direction is far from the actual noise direction.
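For orientation, the standard LMS update that the adaptive branch of a GSC builds on can be sketched as below. This is the textbook algorithm only, not the modified GSC-SN-MCC method: the abstract does not say how the cross-correlation coefficient enters the adaptation, so that step is not reproduced. The function name, `n_taps` and the step size `mu` are hypothetical.

```python
def lms_cancel(primary, reference, n_taps=4, mu=0.05):
    """Textbook LMS noise canceller.

    primary   -- signal containing desired speech plus noise
    reference -- noise reference (in a GSC, the blocking-matrix output)
    Returns (error_signal, final_weights); the error signal is the
    enhanced output.
    """
    weights = [0.0] * n_taps
    history = [0.0] * n_taps  # latest reference samples, newest first
    errors = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        # Canceling signal: filtered estimate of the noise in `primary`.
        y = sum(w * h for w, h in zip(weights, history))
        e = d - y
        # Gradient-descent step on the instantaneous squared error.
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights
```

Because the update is driven by the error signal, a sudden rise in desired-speech energy also perturbs the weights, which is the kind of distortion the paper's correlation-based modification is aimed at.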


1973 ◽  
Vol 28 (1) ◽  
pp. 105-109 ◽  
Author(s):  
H. Jäger ◽  
R. Schöfer

For shock waves produced by special wire explosions, the short time energy input condition of the theories of Lin, Sakurai and Vlases-Jones is fairly well fulfilled. In these cases the shock wave energies can be easily determined from the expansion velocity of the waves. Variation of the parameters of the discharge circuit shows how these parameters should be chosen in order to obtain a maximum transfer of energy either to the shock waves or to the wire material.


2013 ◽  
Vol 284-287 ◽  
pp. 2867-2871 ◽  
Author(s):  
Jui Feng Yeh ◽  
Min Da Kuo ◽  
Zhong Hua Hsu

Packet loss is one of the most fundamental problems in speech communication. In voice over IP (VoIP), it causes loss of information and discomfort for listeners. This investigation proposes an approach based on a waveform similarity measure using the overlap-and-add algorithm. The waveform similarity overlap-and-add (WSOLA) technique is an effective algorithm for packet loss concealment (PLC). In real-time communication, WSOLA is widely used for length adaptation and packet loss concealment of speech signals. Time scale modification of audio signals is one of the most important research topics in data communication, especially in VoIP. Herein, we propose the dual-side WSOLA, which is derived from the standard WSOLA. Instead of exploiting speech data from only one direction, the proposed method reconstructs the lost voice data from both the preceding and the following voice data. The dual-side WSOLA can thus use both the past and the future speech waveform to reconstruct the waveform of a lost packet. Evaluations using the PESQ and MOS metrics show that the quality of the speech signal reconstructed by the dual-side WSOLA is higher than that of the standard WSOLA and GWSOLA across different packet loss rates and lengths. The significant improvement is obtained from the dual-side information used in the proposed method. The proposed dual-side waveform similarity overlap-and-add (DSWSOLA) outperforms the traditional approaches.
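The idea of filling a lost packet from both sides can be illustrated with the simplified sketch below. It is not the paper's DSWSOLA algorithm, whose details the abstract does not give: here each side is extrapolated by a waveform-similarity search (normalized cross-correlation) and the two estimates are blended with a linear crossfade. All function names and the parameters `gap_len` and `template_len` are hypothetical.

```python
import math

def normalized_xcorr(a, b):
    """Normalized cross-correlation of two equal-length segments."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

def extrapolate(context, gap_len, template_len):
    """Continue `context` by gap_len samples using its most similar past.

    The last template_len samples are matched against earlier positions;
    the samples that followed the best match are copied as the estimate.
    """
    template = context[-template_len:]
    history = context[:len(context) - gap_len]
    best_off, best_score = 0, -2.0
    for off in range(len(history) - template_len + 1):
        score = normalized_xcorr(history[off:off + template_len], template)
        if score > best_score:
            best_off, best_score = off, score
    start = best_off + template_len
    return context[start:start + gap_len]

def conceal_gap(past, future, gap_len, template_len):
    """Blend a forward estimate (from past) with a backward one (from future)."""
    forward = extrapolate(past, gap_len, template_len)
    # Time-reverse the future so the same routine extrapolates backwards.
    backward = extrapolate(future[::-1], gap_len, template_len)[::-1]
    filled = []
    for i in range(gap_len):
        w = (i + 1) / (gap_len + 1)  # weight moves from past to future
        filled.append((1 - w) * forward[i] + w * backward[i])
    return filled
```

Using the future side requires buffering at least one packet after the loss, so a scheme of this kind trades some added delay for the extra information.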


1977 ◽  
Vol 29 (1) ◽  
pp. 135-146 ◽  
Author(s):  
D. W. Green

Two independent groups of subjects, instructed either to understand or to memorize sentences, were timed as they responded to a brief auditory signal that occurred at some point during a sentence. Latency appeared to be primarily a function of the task, such that the deeper the semantic processing of the sentence, the longer the reaction time. Together with other aspects of the data, this supports the argument that such tasks affect the extent to which a subject retrieves the meanings of the words in a sentence and integrates them at the end of it. Concrete and abstract sentences were processed in fundamentally the same way. The conclusion drawn is that speech comprehension is an integrative process, under voluntary control, which collates different aspects of the speech signal.


2019 ◽  
Author(s):  
Oliver Liebfried ◽  
Volker Brommer ◽  
Harald Scharf ◽  
Matthias Schacherer ◽  
Paul Frings

Poster contribution to the 26th International Conference on Magnet Technology (MT26) in Vancouver, Canada, September 22-27, 2019. The paper was submitted to the MT26 special issue of the IEEE Transactions on Applied Superconductivity.

Abstract: Inductive pulsed power generators use coils as powerful short time energy storage, which is a common means of delivering pulses of high power to loads such as electromagnetic accelerators. This article deals with the design, simulation, construction, electrical characterization and pulsed stress test of a modular toroidal coil. The coil was made from 180 D-shaped copper discs and has an approximate inductance of 1 mH (f > 50 Hz) and a frequency-dependent resistance of 3.88 mOhm Sqrt(f) + 5 mOhm. Its height, diameter and weight are 0.4 m, 1 m and 1 ton, respectively. It is designed to store more than 1 MJ of energy.
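The coil figures quoted above can be turned into a quick back-of-the-envelope calculation. The sketch assumes that f in the resistance expression is in Hz and that the coil behaves as an ideal linear inductor with the quoted 1 mH, so that the stored energy is E = 0.5 * L * I^2; neither assumption is stated in the abstract, and the function names are hypothetical.

```python
import math

# Approximate inductance quoted in the abstract (stated for f > 50 Hz).
INDUCTANCE_H = 1e-3

def coil_resistance_ohm(freq_hz):
    """R(f) = 3.88 mOhm * sqrt(f) + 5 mOhm, with f assumed to be in Hz."""
    return 3.88e-3 * math.sqrt(freq_hz) + 5e-3

def stored_energy_j(current_a, inductance_h=INDUCTANCE_H):
    """Magnetic energy of an ideal linear inductor: E = 0.5 * L * I^2."""
    return 0.5 * inductance_h * current_a ** 2

def current_for_energy_a(energy_j, inductance_h=INDUCTANCE_H):
    """Current needed to store a given energy: I = sqrt(2 * E / L)."""
    return math.sqrt(2 * energy_j / inductance_h)
```

Under these assumptions, storing the stated 1 MJ would require a current of roughly 45 kA, and the resistance at 100 Hz would be about 44 mOhm.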


Author(s):  
Huyen Vu ◽  
Trygve Eftestøl ◽  
Kjersti Engan ◽  
Joar Eilevstjønn ◽  
Ladislaus Blacy Yarrot ◽  
...  
