Efficient real-time noise estimation without explicit speech, non-speech detection: an assessment on the AURORA corpus

Author(s):  
N.W.D. Evans ◽  
J.S. Mason ◽  
B. Fauve
2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Shan-Shan Li ◽  
Jian Zhou ◽  
Xuan Wang

To address the shortcomings of traditional broadcast transmitter noise test methods, such as low efficiency, inconvenient data storage, and high demands on testers, a dynamic online test method for transmitter noise is proposed. The system composition and the principle of the test method are given. Transmitter noise changes in real time, and a noise estimation algorithm based on voice activity detection (VAD) alone cannot track these changes. This paper therefore proposes a noise estimation algorithm that combines VAD with dynamic estimation. Setting the thresholds of the double-threshold VAD low allows silent segments to be detected accurately, and each silent segment is used directly as a noise signal for noise estimation. For the non-silent segments detected by the VAD, noise is estimated with IMCRA, a minimum-search dynamic spectrum estimation algorithm based on the speech presence probability. Transmitter noise is measured by calculating the noise figure (NF). The test method collects the input and output data of the transmitter in real time, which offers better accuracy and real-time performance, and the feasibility of the method is verified by experimental simulation.
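The combined scheme in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame energies, the threshold values, the smoothing factor `alpha`, the `rise` factor, and all function names are assumptions made for illustration. The non-silent branch uses a plain minimum tracker as a stand-in for IMCRA, which additionally weights its updates by a speech presence probability. The noise figure uses the standard definition (ratio of input to output signal-to-noise ratio, in dB); the abstract itself does not give a formula.

```python
import math


def double_threshold_vad(energies, low, high):
    """Label each frame True (active) or False (silent) using hysteresis.

    A segment starts when the energy exceeds `high` and ends when it drops
    below `low`. Both thresholds are illustrative assumptions.
    """
    active = False
    labels = []
    for e in energies:
        if not active and e > high:
            active = True
        elif active and e < low:
            active = False
        labels.append(active)
    return labels


def track_noise(energies, labels, alpha=0.9, rise=1.05):
    """Return a per-frame noise power estimate.

    Silent frames: recursive averaging toward the frame energy.
    Active frames: simplified minimum tracking (a stand-in for IMCRA); the
    estimate may creep up by `rise` per frame but never exceeds the frame energy.
    """
    noise = energies[0]  # assumes the recording starts with noise only
    estimates = []
    for e, is_active in zip(energies, labels):
        if is_active:
            noise = min(noise * rise, e)
        else:
            noise = alpha * noise + (1.0 - alpha) * e
        estimates.append(noise)
    return estimates


def noise_figure_db(snr_in, snr_out):
    """Noise figure in dB from linear input and output signal-to-noise ratios."""
    return 10.0 * math.log10(snr_in / snr_out)


# Made-up frame energies: three quiet frames, a loud burst, two quiet frames.
energies = [1.0, 1.2, 0.9, 8.0, 9.5, 7.0, 1.1, 1.0]
labels = double_threshold_vad(energies, low=2.0, high=5.0)
noise = track_noise(energies, labels)
```

In this toy input the burst is labeled active, and the noise estimate stays close to the quiet-frame level throughout because the active branch is capped by slow growth rather than following the burst energy.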


2019 ◽  
Author(s):  
Chi-Te Wang ◽  
Ji-Yan Han ◽  
Shih-Hau Fang ◽  
Ying-Hui Lai

BACKGROUND Voice disorders mainly result from chronic overuse or abuse, particularly in occupational voice users such as teachers. Previous studies proposed a contact microphone attached to the anterior neck for ambulatory voice monitoring; however, the inconvenience associated with taping and wiring, along with the lack of real-time processing, has limited its clinical application. OBJECTIVE This study aims to (1) propose an automatic speech detection system using wireless microphones for real-time ambulatory voice monitoring, (2) examine the detection accuracy under controlled and noisy conditions, and (3) report the results of the phonation ratio in practical scenarios. METHODS We designed an adaptive threshold function to detect the presence of speech based on the energy envelope. We invited 10 teachers to participate in this study and tested the performance of the proposed automatic speech detection system regarding detection accuracy and phonation ratio. Moreover, we investigated whether an unsupervised noise reduction algorithm (i.e., log minimum mean square error) can overcome the influence of environmental noise in the proposed system. RESULTS The proposed system exhibited an average speech detection accuracy of 89.9%, ranging from 81.0% (67,357/83,157 frames) to 95.0% (199,201/209,685 frames). Subsequent analyses revealed a phonation ratio between 44.0% (33,019/75,044 frames) and 78.0% (68,785/88,186 frames) during teaching sessions of 40-60 minutes; most phonation segments lasted less than 10 seconds. Background noise reduced the accuracy of the automatic speech detection system, and an adjuvant noise reduction function could effectively improve the accuracy, especially under stable noise conditions. CONCLUSIONS This study demonstrated an average detection accuracy of 89.9% in the proposed automatic speech detection system with wireless microphones. The preliminary results for the phonation ratio were comparable to those of previous studies. Although the wireless microphones are susceptible to background noise, an additional noise reduction function can alleviate this limitation. These results indicate that the proposed system can be applied for ambulatory voice monitoring in occupational voice users.
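The energy-envelope detection and the phonation ratio reported in this abstract can be sketched as follows. The abstract does not specify the adaptive threshold function, so the rule used here (a fixed `margin` times a running noise floor that is updated only on non-speech frames), the parameter values, and the function names are all assumptions. Only the frame counts in the usage note come from the abstract.

```python
def detect_speech(envelope, margin=3.0, alpha=0.95):
    """Flag frames whose energy exceeds an adaptive threshold.

    The threshold is `margin` times a running noise-floor estimate, which is
    only updated on frames classified as non-speech. `margin` and `alpha`
    are illustrative assumptions, not values from the study.
    """
    floor = envelope[0]  # assumes the first frame contains no speech
    flags = []
    for e in envelope:
        is_speech = e > margin * floor
        if not is_speech:
            floor = alpha * floor + (1.0 - alpha) * e
        flags.append(is_speech)
    return flags


def ratio_percent(hits, total):
    """Share of frames, as a percentage rounded to one decimal place."""
    return round(100.0 * hits / total, 1)


# Made-up energy envelope with two bursts of speech.
envelope = [0.10, 0.12, 0.11, 0.90, 1.10, 0.95, 0.12, 0.10, 0.80, 0.11]
flags = detect_speech(envelope)
phonation = ratio_percent(sum(flags), len(flags))  # 4 of 10 frames -> 40.0
```

Applied to the frame counts in the abstract, `ratio_percent(33019, 75044)` gives 44.0 and `ratio_percent(68785, 88186)` gives 78.0, matching the reported range of the phonation ratio.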


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Emilio Andreozzi ◽  
Antonio Fratini ◽  
Daniele Esposito ◽  
Mario Cesarelli ◽  
Paolo Bifulco

Abstract Background Low-dose X-ray images have become increasingly popular in the last decades, due to the need to guarantee the lowest reasonable patient exposure. Dose reduction causes a substantial increase in quantum noise, which needs to be suitably suppressed. In particular, real-time denoising is required to support common interventional fluoroscopy procedures. Knowledge of the noise statistics helps to improve denoising performance, which makes noise estimation a crucial task for effective denoising strategies. Noise statistics depend on different factors, but are mainly influenced by the X-ray tube settings, which may vary even within the same procedure. This complicates real-time denoising, because noise estimation should be repeated after any change in tube settings, which would be hardly feasible in practice. This work investigates the feasibility of an a priori characterization of noise for a single fluoroscopic device, which would obviate the need for inferring noise statistics prior to each new image acquisition. The noise estimation algorithm used in this study was tested in silico to assess its accuracy and reliability. Then, real sequences were acquired by imaging two different X-ray phantoms via a commercial fluoroscopic device at various X-ray tube settings. Finally, noise estimation was performed to assess the matching of noise statistics inferred from two different sequences, acquired independently in the same operating conditions. Results The noise estimation algorithm proved capable of retrieving noise statistics regardless of the particular imaged scene, and achieved good results even when using only 10 frames (mean percentage error lower than 2%). The tests performed on the real fluoroscopic sequences confirmed that the estimated noise statistics are independent of the particular informational content of the scene from which they have been inferred, as they turned out to be consistent in sequences of the two different phantoms acquired independently with the same X-ray tube settings. Conclusions The encouraging results suggest that an a priori characterization of noise for a single fluoroscopic device is feasible and could improve the actual implementation of real-time denoising strategies that take advantage of noise statistics to improve the trade-off between noise reduction and detail preservation.
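As a rough illustration of how noise statistics might be inferred from a short frame sequence, the sketch below computes per-pixel temporal statistics and fits a linear variance-to-mean relationship. The abstract does not describe the estimation algorithm used in the study, so this is a generic approach under stated assumptions: the scene is static across frames, the noise variance grows linearly with the mean signal (as is typical of quantum noise), and the function names and pixel values are made up.

```python
from statistics import mean, variance


def temporal_stats(frames):
    """Per-pixel temporal mean and sample variance over a stack of frames.

    `frames` is a list of equally sized flat pixel lists from a static scene.
    """
    pixels = list(zip(*frames))  # one tuple of samples per pixel position
    means = [mean(p) for p in pixels]
    variances = [variance(p) for p in pixels]
    return means, variances


def fit_gain(means, variances):
    """Least-squares slope of variance = gain * mean (line through the origin)."""
    num = sum(m * v for m, v in zip(means, variances))
    den = sum(m * m for m in means)
    return num / den


# Three tiny 2-pixel frames; the values are made up for illustration.
frames = [
    [10.0, 40.0],
    [12.0, 44.0],
    [8.0, 36.0],
]
means, variances = temporal_stats(frames)
gain = fit_gain(means, variances)
```

Because the fitted `gain` describes how variance scales with intensity rather than what the scene looks like, two sequences of different phantoms acquired with the same settings would be expected to yield similar values, which is the kind of consistency the abstract reports.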


Author(s):  
Hongbing Zhang

In recent years, artificial intelligence has advanced alongside the rapid development of information technology. People have begun to train machines, and many machines can now gradually understand human languages and perform a series of actions based on language instructions. On this basis, researchers hope that machines can become more intelligent and humane. In the noise estimation stage, a noise estimation algorithm based on speech detection is used to estimate the noise effectively. Secondly, a speech noise processing method is implemented according to the characteristics of speech noise reduction processing. Finally, simulation experiments are used to illustrate the effectiveness of the algorithm. To address the shortcomings of traditional speech noise reduction algorithms, improvements were made in adaptive filter estimation, and the model's speech noise reduction algorithm was obtained. The cepstrum estimation of speech signals was modified, and the effect of speech enhancement was significantly improved.

