Deep learning-based stereophonic acoustic echo suppression without decorrelation

Acoustic echoes and background noise are the main causes of near-end speech quality degradation in mobile communication and hands-free devices. Acoustic echo arises when the received far-end speech is reflected by obstacles in the surroundings; all other disturbances from the near-end environment are treated as background noise. This paper proposes a novel acoustic echo suppression scheme that uses speech uncertainty in the modulation domain (MD). State-of-the-art acoustic echo suppression systems are based on either time-domain or frequency-domain analysis. Modulation-domain analysis has recently become popular in speech processing because it captures human perceptual properties. The modulation domain represents the temporal variation of the acoustic magnitude spectra, which acts as an information-bearing signal. In this paper, a new method is developed and implemented to model the echo path and estimate the echo in the modulation domain. Echo is suppressed by manipulating the modulation spectrum and employing speech uncertainty: the microphone input is modelled as a binary hypothesis process, and the gain function is modified accordingly. The proposed method performs better than competing acoustic echo suppression methods, with no audible degradation of the near-end speech.
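The binary-hypothesis idea mentioned in the abstract can be illustrated with a generic soft-decision gain. The sketch below is not the paper's modulation-domain method: it assumes a complex Gaussian signal model, a Wiener gain under the speech-present hypothesis, and a fixed gain floor under the speech-absent hypothesis. The function name `soft_decision_gain` and the parameters `q_absent` and `gain_floor` are hypothetical, chosen only for illustration.

```python
import numpy as np

def soft_decision_gain(snr_prior, snr_post, q_absent=0.5, gain_floor=0.1):
    """Suppression gain weighted by the probability that near-end speech is present.

    snr_prior: a priori ratio of near-end speech power to echo-plus-noise power.
    snr_post:  a posteriori ratio of observed power to echo-plus-noise power.
    q_absent:  assumed prior probability of the speech-absent hypothesis H0.
    """
    snr_prior = np.asarray(snr_prior, dtype=float)
    snr_post = np.asarray(snr_post, dtype=float)

    # Gain applied when speech is assumed present (hypothesis H1).
    wiener = snr_prior / (1.0 + snr_prior)

    # Likelihood ratio of H1 against H0 under a complex Gaussian model.
    # The exponent is clipped to avoid floating-point overflow.
    exponent = np.minimum(wiener * snr_post, 50.0)
    likelihood_ratio = np.exp(exponent) / (1.0 + snr_prior)

    # Posterior probability of speech presence via Bayes' rule.
    prior_odds = (1.0 - q_absent) / q_absent
    odds = prior_odds * likelihood_ratio
    p_present = odds / (1.0 + odds)

    # Blend the H1 gain with the floor used under H0.
    return p_present * wiener + (1.0 - p_present) * gain_floor
```

When the evidence for near-end speech is weak, the gain falls toward the floor and the echo is attenuated more strongly; when speech is clearly present, the gain approaches the Wiener gain.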


Nonlinear residual echo suppression based on dual-stream DPRNN

EURASIP Journal on Audio Speech and Music Processing, 2021, Vol 2021 (1), DOI 10.1186/s13636-021-00221-8

Author(s): Hongsheng Chen, Guoliang Chen, Kai Chen, Jing Lu

Keyword(s): Adaptive Filters, Fine Tuning, Speech Communication, Time Frequency, Acoustic Echo, Processing Module, Auxiliary Signal, Echo Suppression, The Time Domain, The Impact

Abstract: Linear adaptive filters cannot remove the acoustic echo entirely, because the relationship between the echo and the far-end signal is nonlinear. A post-processing module is therefore usually required to suppress the remaining echo. In this paper, we propose a residual echo suppression method based on a modified dual-path recurrent neural network (DPRNN) to improve the quality of speech communication. The residual signal and an auxiliary signal (either the far-end signal or the output of the adaptive filter), both obtained from linear acoustic echo cancelation, form a dual stream for the DPRNN. We validate the efficacy of the proposed method in the notoriously difficult double-talk situations and discuss the impact of different auxiliary signals on performance. We also compare the performance of time-domain and time-frequency-domain processing. Furthermore, we propose an efficient and practical way to deploy our method on off-the-shelf loudspeakers by fine-tuning the pre-trained model with a small amount of recorded-echo data.
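To make the two input streams concrete, here is a minimal sketch of a linear echo cancelation stage that produces a residual signal and an adaptive filter output. It assumes a time-domain NLMS filter, which the abstract does not specify; the function names `nlms_echo_canceller` and `dual_stream` and the parameters `taps`, `step`, and `eps` are hypothetical. The DPRNN itself is not shown.

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, taps=64, step=0.5, eps=1e-8):
    """Run an NLMS adaptive filter and return (residual, echo_estimate)."""
    far_end = np.asarray(far_end, dtype=float)
    mic = np.asarray(mic, dtype=float)
    weights = np.zeros(taps)
    echo_estimate = np.zeros(len(mic))
    residual = np.zeros(len(mic))

    # Zero-pad so that the first samples have a full-length history.
    padded = np.concatenate([np.zeros(taps - 1), far_end])

    for n in range(len(mic)):
        # Far-end history, most recent sample first.
        x = padded[n:n + taps][::-1]
        echo_estimate[n] = weights @ x
        residual[n] = mic[n] - echo_estimate[n]
        # Normalized LMS update of the echo-path estimate.
        weights += step * residual[n] * x / (x @ x + eps)

    return residual, echo_estimate

def dual_stream(residual, auxiliary):
    """Stack the residual and one auxiliary signal into a (2, samples) array."""
    return np.stack([residual, auxiliary], axis=0)
```

Passing either the far-end signal or `echo_estimate` as `auxiliary` corresponds to the two auxiliary-signal choices mentioned in the abstract. With a purely linear simulated echo path the residual decays toward zero, so a nonlinear loudspeaker model would be needed to leave a residual worth suppressing.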
