A Method of Speech Signal Analysis Using Multi-level Wavelet Transform

Background: In this paper, we propose a secure image watermarking technique applicable to grayscale and color images. It applies the SVD (Singular Value Decomposition) in the Lifting Wavelet Transform domain to embed a speech image (the watermark) into the host image. Methods: The technique also uses a signature in the embedding and extraction steps. Its performance is assessed by computing the PSNR (Peak Signal to Noise Ratio), SSIM (Structural Similarity), SNR (Signal to Noise Ratio), SegSNR (Segmental SNR) and PESQ (Perceptual Evaluation of Speech Quality). Results: The PSNR and SSIM evaluate the perceptual quality of the watermarked image compared to the original image. The SNR, SegSNR and PESQ evaluate the perceptual quality of the reconstructed (extracted) speech signal compared to the original speech signal. Conclusion: The values obtained for PSNR, SSIM, SNR, SegSNR and PESQ demonstrate the performance of the proposed technique.
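As an illustration of three of the metrics named in this abstract, the following is a minimal sketch using the textbook definitions of PSNR, SNR and SegSNR. The function names, the 8-bit peak value of 255 and the default frame length are assumptions made for the example and are not taken from the paper; SSIM and PESQ are not covered here.

```python
import math


def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    mse = sum((o - d) ** 2 for o, d in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)


def snr(reference, estimate):
    """Global signal-to-noise ratio (dB) of an estimate against a reference signal."""
    signal_energy = sum(r ** 2 for r in reference)
    noise_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_energy == 0:
        return math.inf
    if signal_energy == 0:
        return -math.inf
    return 10.0 * math.log10(signal_energy / noise_energy)


def seg_snr(reference, estimate, frame_len=256):
    """Segmental SNR (dB): mean of per-frame SNRs over non-overlapping frames.

    Frames whose SNR is not finite (silent or error-free frames) are skipped.
    """
    frame_values = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        value = snr(reference[start:start + frame_len],
                    estimate[start:start + frame_len])
        if math.isfinite(value):
            frame_values.append(value)
    if not frame_values:
        raise ValueError("no frame with a finite SNR")
    return sum(frame_values) / len(frame_values)
```

In this sketch `psnr` would be applied to the flattened pixel values of the original and watermarked images, while `snr` and `seg_snr` would be applied to the original and extracted speech samples.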

Automated speech signal analysis based on feature extraction and classification of spasmodic dysphonia: a performance comparison of different classifiers

International Journal of Speech Technology ◽

10.1007/s10772-017-9471-8 ◽

2017 ◽

Vol 21 (1) ◽

pp. 9-18 ◽

Cited By ~ 2

Author(s):

Snekhalatha Umapathy ◽

Shamila Rachel ◽

Rajalakshmi Thulasi

Keyword(s):

Feature Extraction ◽

Speech Signal ◽

Signal Analysis ◽

Performance Comparison ◽

Spasmodic Dysphonia ◽

A Performance

Formant structure of the voice during the intensive acute hypoxia

Vojnosanitetski pregled ◽

10.2298/vsp0302155o ◽

2003 ◽

Vol 60 (2) ◽

pp. 155-159 ◽

Cited By ~ 2

Author(s):

Jovisa Obrenovic ◽

Milkica Nesic ◽

Vladimir Nesic ◽

Snezana Cekic

Keyword(s):

Speech Signal ◽

Signal Analysis ◽

Acute Hypoxia ◽

Initial Period ◽

Formant Frequencies ◽

Hypoxia Exposure ◽

Structure Changes ◽

The Voice ◽

Reversed Order ◽

Different Altitudes

This study investigated the influence of intensive acute hypoxia on the frequency-amplitude formant characteristics of the vowel O. Examinees were exposed to simulated altitudes of 5 500 m and 6 700 m in a climabaro chamber and completed Lotig's test in the conditions of normoxia, i.e. they pronounced three-digit numbers beginning from 900, but in reversed order. Frequency and intensity values of the formants of the vowel O (F1, F2, F3 and F4), extracted from the pronunciation of the word eight (osam in Serbian), were measured by spectral speech signal analysis. Changes in the frequency values and intensities of the formants were examined. The results showed no significant changes in the formant frequencies under hypoxia compared to normoxia. However, significant changes in formant intensities compared to normoxia were found at the cited altitudes. Formant intensities rose at the altitude of 5 500 m. Hypoxia at the altitude of 6 700 m caused a significant fall in the intensities in the initial period, compared to normoxia. Prolonged hypoxia exposure caused a rise in the formant intensities compared to the altitude of 5 500 m. It may be concluded that, depending on the altitude, hypoxia has different effects on changes in formant structure compared to normoxia.

Design and Implementation of Butterworth, Chebyshev-I and Elliptic Filter for Speech Signal Analysis

International Journal of Computer Applications ◽

10.5120/17195-7390 ◽

2014 ◽

Vol 98 (7) ◽

pp. 12-18 ◽

Cited By ~ 9

Author(s):

Prajoy Podder ◽

Md. Mehedi Hasan ◽

Md. Rafiqul Islam ◽

Mursalin Sayeed

Keyword(s):

Speech Signal ◽

Signal Analysis ◽

Elliptic Filter ◽

Design And Implementation

A Hybrid Approach to Gender Classification using Speech Signal

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset196110 ◽

2019 ◽

pp. 17-24

Author(s):

M. Yasin Pir ◽

Mohamad Idris Wani

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Wavelet Transform ◽

Speech Signal ◽

Hybrid Approach ◽

Gender Classification ◽

Power Spectral ◽

Reconstructed Signal ◽

Feed Forward Network ◽

Artificial Neural

Speech is a significant means of communication, and the variation in the pitch of a speech signal is commonly used to classify the speaker's gender as male or female. In this study, we propose a system for gender classification from speech using a hybrid model that combines the 1-D Stationary Wavelet Transform (SWT) with an artificial neural network. Features such as power spectral density, frequency, and amplitude of human voice samples were used to classify the gender. We use the Daubechies wavelet transform at different levels for decomposition and reconstruction of the signal. The reconstructed signal is fed to a feed-forward artificial neural network for gender classification. This study uses 400 voice samples of both genders from the Michigan University database, sampled at 16000 Hz. The experimental results show that the proposed method achieves more than 94% classification efficiency on both the training and testing datasets.
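The decomposition and reconstruction step described in this abstract can be sketched roughly as follows. The abstract does not state the Daubechies order or the number of levels, so this example assumes a single level of the Haar wavelet (db1, the simplest member of the Daubechies family) with circular boundary handling. All function names are hypothetical, and the neural network classifier is not shown.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # Haar filter coefficient


def swt_haar_level1(signal):
    """One level of an undecimated (stationary) Haar transform.

    Unlike the decimated transform, both bands keep the input length.
    Indices wrap around circularly at the end of the signal.
    """
    n = len(signal)
    approx = [_S * (signal[i] + signal[(i + 1) % n]) for i in range(n)]
    detail = [_S * (signal[i] - signal[(i + 1) % n]) for i in range(n)]
    return approx, detail


def iswt_haar_level1(approx, detail):
    """Inverse of swt_haar_level1: average of the two redundant reconstructions."""
    n = len(approx)
    return [
        0.5 * _S * ((approx[i] + detail[i]) + (approx[i - 1] - detail[i - 1]))
        for i in range(n)
    ]


def band_energy(band):
    """Sum of squared coefficients, a simple power-related feature of a band."""
    return sum(c * c for c in band)
```

Under these assumptions, a feature vector for a voice sample could be assembled from values such as `band_energy(approx)` and `band_energy(detail)` before being passed to a classifier. Deeper levels would repeat the transform on the approximation band with upsampled filters.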

Detection of hypernasality from speech signal using group delay and wavelet transform

2016 6th International Conference on Computer and Knowledge Engineering (ICCKE) ◽

10.1109/iccke.2016.7802138 ◽

2016 ◽

Cited By ~ 2

Author(s):

Atefeh Mirzaei ◽

Mansour Vali

Keyword(s):

Wavelet Transform ◽

Speech Signal ◽

Group Delay

Multi-channel mono-path periodic signal extraction with global amplitude and phase modulation for music and speech signal analysis

IEEE/SP 13th Workshop on Statistical Signal Processing, 2005 ◽

10.1109/ssp.2005.1628568 ◽

2005 ◽

Cited By ~ 1

Author(s):

M. Triki ◽

D.T.M. Slock

Keyword(s):

Speech Signal ◽

Phase Modulation ◽

Signal Analysis ◽

Signal Extraction ◽

Periodic Signal

Open Quotient Measurements Based on Multiscale Product of Speech Signal Wavelet Transform

Research Letters in Signal Processing ◽

10.1155/2007/62521 ◽

2007 ◽

Vol 2007 ◽

pp. 1-5 ◽

Cited By ~ 15

Author(s):

Aïcha Bouzid ◽

Noureddine Ellouze

Keyword(s):

Wavelet Transform ◽

Relative Error ◽

Speech Processing ◽

Speech Signal ◽

Accurate Estimation ◽

Pitch Period ◽

Open Time ◽

Wide Range ◽

Voiced Speech ◽

Egg Signal

This paper describes a multiscale product method (MPM) for open quotient measurement in voiced speech. The method is based on determining the glottal closing and opening instants. The proposed approach multiplies the wavelet transforms of the speech signal at different scales in order to enhance edge detection and parameter estimation. We show that the proposed method is effective and robust for detecting speech singularities. Accurate estimation of glottal closing instants (GCIs) and glottal opening instants (GOIs) is important in a wide range of speech processing tasks. In this paper, accurate estimation of GCIs and GOIs is used to measure the local open quotient (Oq), which is the ratio of the open time to the pitch period. The multiscale product operates automatically on the speech signal; the reference electroglottogram (EGG) signal is used for performance evaluation. The rate of correct GCI detection is 95.5% and that of GOI detection is 76%. The pitch period relative error is 2.6% and the open phase relative error is 5.6%. The relative error measured on the open quotient reaches 3% for the whole Keele database.
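The idea of multiplying wavelet responses across scales to sharpen edges can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the wavelet or the scales, so the example uses a sampled first derivative of a Gaussian at the hypothetical scales 1, 2 and 4, and all function names are invented for the sketch.

```python
import math


def gaussian_derivative_kernel(scale):
    """Sampled first derivative of a Gaussian with standard deviation `scale`."""
    half = int(4 * scale)
    return [
        -t / (scale ** 2) * math.exp(-t * t / (2.0 * scale ** 2))
        for t in range(-half, half + 1)
    ]


def convolve_same(signal, kernel):
    """Convolve and return an output of the same length as the input.

    Samples outside the signal are clamped to the nearest edge value,
    which avoids spurious edges at the boundaries.
    """
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i - (k - half), 0), n - 1)
            acc += weight * signal[j]
        out.append(acc)
    return out


def multiscale_product(signal, scales=(1, 2, 4)):
    """Pointwise product of the wavelet responses at the given scales."""
    product = [1.0] * len(signal)
    for scale in scales:
        response = convolve_same(signal, gaussian_derivative_kernel(scale))
        product = [p * r for p, r in zip(product, response)]
    return product


def open_quotient(open_time, pitch_period):
    """Open quotient Oq: ratio of the glottal open time to the pitch period."""
    return open_time / pitch_period
```

In this sketch a discontinuity produces a response at every scale, so the product has a sharp extremum at the edge while responses that appear at only one scale are suppressed. Once closing and opening instants are located from such extrema, `open_quotient` would be applied to the resulting durations, for example `open_quotient(0.004, 0.008)` for a 4 ms open phase in an 8 ms pitch period.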
