Multi-microphone speech enhancement informed by auditory scene analysis

Author(s): Axel Plinge, Sharon Gannot
2014 ◽ Vol 614 ◽ pp. 363-366
Author(s): Yi Jiang, Yuan Yuan Zu, Ying Ze Wang

This paper proposes a K-means-based unsupervised approach to close-talk speech enhancement. Within the framework of computational auditory scene analysis (CASA), the dual-microphone energy difference (DMED) is used as the cue to classify time-frequency (T-F) units as noise-dominant or target-speech-dominant. A ratio mask is then used to separate the target speech from the noise. Experimental results show that the proposed algorithm performs more robustly than the Wiener filtering algorithm.
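The pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the DMED formula, the clustering setup, or the mask definition. The sketch assumes that DMED is the per-unit power ratio between a primary (close-talk) and a secondary microphone in dB, that two clusters are used, and that the mask is a linear ramp between the two cluster centroids. All function names are hypothetical.

```python
import numpy as np

def dmed_db(primary_power, secondary_power, eps=1e-12):
    """Assumed DMED cue: per-T-F-unit power ratio between the two mics, in dB."""
    return 10.0 * np.log10((primary_power + eps) / (secondary_power + eps))

def kmeans_two_clusters_1d(values, n_iter=50):
    """Plain two-cluster K-means on scalars; returns (labels, centroids).

    Cluster 0 starts at the lowest cue value (assumed noise-dominant),
    cluster 1 at the highest (assumed target-dominant).
    """
    centroids = np.array([values.min(), values.max()], dtype=float)
    labels = np.zeros(values.shape, dtype=int)
    for _ in range(n_iter):
        # Assign each unit to the nearer centroid.
        nearer_to_1 = np.abs(values - centroids[1]) < np.abs(values - centroids[0])
        labels = nearer_to_1.astype(int)
        # Recompute centroids, keeping the old one if a cluster is empty.
        updated = centroids.copy()
        for k in (0, 1):
            if np.any(labels == k):
                updated[k] = values[labels == k].mean()
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

def ratio_mask_from_dmed(primary_power, secondary_power):
    """Cluster T-F units by DMED and build a soft mask in [0, 1].

    The linear ramp between centroids is an illustrative assumption,
    not the mask defined in the paper.
    """
    cue = dmed_db(primary_power, secondary_power)
    labels, centroids = kmeans_two_clusters_1d(cue.ravel())
    span = max(centroids[1] - centroids[0], 1e-12)
    mask = np.clip((cue - centroids[0]) / span, 0.0, 1.0)
    return mask, labels.reshape(cue.shape)
```

In use, the two inputs would be power spectrograms of the same shape (for example, squared STFT magnitudes from each microphone), and the enhanced spectrogram would be the primary-channel spectrogram multiplied element-wise by the mask.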


2014 ◽ Vol 78 (3) ◽ pp. 361-378
Author(s): Mona Isabel Spielmann, Erich Schröger, Sonja A. Kotz, Alexandra Bendixen

Author(s): Meghan Goodchild, Stephen McAdams

The study of timbre and orchestration in music research is underdeveloped, with few theories to explain instrumental combinations and orchestral shaping. This chapter will outline connections between the orchestration practices of the nineteenth and early twentieth centuries and perceptual principles based on recent research in auditory scene analysis and timbre perception. Analyses of orchestration treatises and musical scores reveal an implicit understanding of auditory grouping principles by which many orchestral effects and techniques function. We will explore how concurrent grouping cues result in blended combinations of instruments, how sequential grouping into segregated melodies or stratified (foreground and background) layers is influenced by timbral similarities and dissimilarities, and how segmental grouping cues create formal boundaries and expressive gestural shaping through changes in instrumental textures. This exploration will be framed within an examination of historical and contemporary discussion of orchestral effects and techniques.


2017 ◽ Vol 23 (2)
Author(s): Anna Zayaruznaya

The medieval composers of polytextual motets have been charged with rendering multiple texts inaudible by superimposing them. While the limited contemporary evidence provided by Jacobus’s comments in the Speculum musicae seems at first sight to suggest that medieval listeners would have had trouble understanding texts declaimed simultaneously, closer scrutiny reveals the opposite: that intelligibility was desirable, and linked to modes of performance. This article explores the ways in which 20th-century performance aesthetics and recording technologies have shaped current ideas about the polytextual motet. Recent studies in cognitive psychology suggest that human ability to perform auditory scene analysis (to focus on a given sound in a complicated auditory environment) is enhanced by directional listening and relatively dry acoustics. But the modern listener often encounters motets on recordings with heavy mixing and reverb. Furthermore, combinations of contrasting vocal timbres, which can help differentiate simultaneously sung texts, are precluded by a blended, uniform sound born jointly of English choir-school culture and modernist preferences propagated under the banner of authenticity. Scholarly accounts of motets that focus on sound over sense are often influenced, directly or indirectly, by such mediated listening.

