Speaker-Independent Speech Enhancement with Brain Signals

2021 ◽
Author(s):  
Maryam Hosseini ◽  
Luca Celotti ◽  
Eric Plourde

Single-channel speech enhancement algorithms have seen great improvements over the past few years. Despite these improvements, they still lack the efficiency of the auditory system in extracting attended auditory information in the presence of competing speakers. Recently, it has been shown that the attended auditory information can be decoded from the brain activity of the listener. In this paper, we propose two novel deep learning methods referred to as the Brain Enhanced Speech Denoiser (BESD) and the U-shaped Brain Enhanced Speech Denoiser (U-BESD) respectively, that take advantage of this fact to denoise a multi-talker speech mixture. We use a Feature-wise Linear Modulation (FiLM) between the brain activity and the sound mixture, to better extract the features of the attended speaker to perform speech enhancement. We show, using electroencephalography (EEG) signals recorded from the listener, that U-BESD outperforms a current autoencoder approach in enhancing a speech mixture as well as a speech separation approach that uses brain activity. Moreover, we show that both BESD and U-BESD successfully extract the attended speaker without any prior information about this speaker. This makes both algorithms great candidates for realistic applications where no prior information about the attended speaker is available, such as hearing aids, cellphones, or noise cancelling headphones. All procedures were performed in accordance with the Declaration of Helsinki and were approved by the Ethics Committees of the School of Psychology at Trinity College Dublin, and the Health Sciences Faculty at Trinity College Dublin.
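FiLM is a general conditioning technique: a small network predicts a per-channel scale (gamma) and shift (beta) from the conditioning input (here, EEG features) and applies them to the feature maps of the main network. The following is a minimal, hypothetical NumPy sketch of that operation only; the layer shapes and the linear gamma/beta predictors are illustrative and do not reproduce the BESD/U-BESD architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def film(audio_feats, eeg_feats, W_gamma, W_beta):
    """Feature-wise Linear Modulation: scale and shift each audio
    feature channel using parameters predicted from the EEG features.
    audio_feats: (channels, time); eeg_feats: (eeg_dim,)."""
    gamma = W_gamma @ eeg_feats            # per-channel scale, shape (channels,)
    beta = W_beta @ eeg_feats              # per-channel shift, shape (channels,)
    return gamma[:, None] * audio_feats + beta[:, None]

C, T, E = 4, 10, 8                         # channels, time frames, EEG feature dim
audio = rng.standard_normal((C, T))        # toy audio feature map
eeg = rng.standard_normal(E)               # toy EEG feature vector
W_gamma = rng.standard_normal((C, E))      # stands in for a learned predictor
W_beta = rng.standard_normal((C, E))

out = film(audio, eeg, W_gamma, W_beta)
print(out.shape)                           # (4, 10)
```

In the trained models, W_gamma and W_beta would be learned jointly with the rest of the denoiser, so the EEG signal steers which speaker's features are amplified.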


2019 ◽  
Author(s):  
Mathieu Bourguignon ◽  
Nicola Molinaro ◽  
Mikel Lizarazu ◽  
Samu Taulu ◽  
Veikko Jousmäki ◽  
...  

To gain novel insights into how the human brain processes self-produced auditory information during reading aloud, we investigated the coupling between neuromagnetic activity and the temporal envelope of the heard speech sounds (i.e., speech brain tracking) in a group of adults who 1) read a text aloud, 2) listened to a recording of their own speech (i.e., playback), and 3) listened to another speech recording. Coherence analyses revealed that, during reading aloud, the reader’s brain tracked the slow temporal fluctuations of the speech output. Specifically, auditory cortices tracked phrasal structure (<1 Hz) but to a lesser extent than during the two speech listening conditions. Also, the tracking of syllable structure (4–8 Hz) occurred at parietal opercula during reading aloud and at auditory cortices during listening. Directionality analyses based on renormalized partial directed coherence revealed that speech brain tracking at <1 Hz and 4–8 Hz is dominated by speech-to-brain directional coupling during both reading aloud and listening, meaning that speech brain tracking mainly entails auditory feedback processing. Nevertheless, brain-to-speech directional coupling at 4–8 Hz was enhanced during reading aloud compared with listening, likely reflecting speech monitoring before production. Altogether, these data bring novel insights into how auditory verbal information is tracked by the human brain during perception and self-generation of connected speech.

Highlights
- The brain tracks phrasal and syllabic rhythmicity of self-produced (read) speech.
- Tracking of phrasal structures is attenuated during reading compared with listening.
- Speech rhythmicity mainly drives brain activity during reading and listening.
- Brain activity drives syllabic rhythmicity more during reading than listening.
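Speech brain tracking of this kind is commonly quantified with magnitude-squared coherence between the speech temporal envelope and the neural signal, estimated per frequency band. The sketch below uses entirely synthetic signals (the 5 Hz modulation, the lag, and the noise levels are illustrative, not the study's data) to show the basic computation with `scipy.signal.coherence`.

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(1)
fs = 100.0                                # Hz, envelope-level sampling rate
t = np.arange(0, 60, 1 / fs)              # 60 s of data

# Hypothetical speech envelope with a 5 Hz (syllable-rate) component
envelope = np.sin(2 * np.pi * 5 * t) + 0.5 * rng.standard_normal(t.size)
# Simulated "brain" signal: tracks the envelope with a lag, plus noise
brain = np.roll(envelope, 10) + rng.standard_normal(t.size)

f, Cxy = coherence(envelope, brain, fs=fs, nperseg=512)
peak5 = Cxy[np.argmin(np.abs(f - 5.0))]   # coherence at the 5 Hz modulation rate
rest = Cxy[(f >= 20) & (f <= 40)].mean()  # baseline coherence away from it
print(f"coherence at 5 Hz: {peak5:.2f}, 20-40 Hz mean: {rest:.2f}")
```

Because coherence is insensitive to a pure delay, the lagged tracking still yields high coherence at the shared 5 Hz rhythm; directionality (speech-to-brain vs. brain-to-speech) requires the additional directed measures the study uses, which are not sketched here.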


Electronics ◽  
2020 ◽  
Vol 9 (10) ◽  
pp. 1698
Author(s):  
Iordanis Thoidis ◽  
Lazaros Vrysis ◽  
Dimitrios Markou ◽  
George Papanikolaou

Perceptually motivated audio signal processing and feature extraction have played a key role in the determination of high-level semantic processes and the development of emerging systems and applications, such as mobile phone telecommunication and hearing aids. In the era of deep learning, speech enhancement methods based on neural networks have seen great success, mainly operating on the log-power spectra. Although these approaches obviate the need for exhaustive feature extraction and selection, it is still unclear whether they target the important sound characteristics related to speech perception. In this study, we propose a novel set of auditory-motivated features for single-channel speech enhancement by fusing temporal envelope and temporal fine structure information in the context of vocoder-like processing. A causal gated recurrent unit (GRU) neural network is employed to recover the low-frequency amplitude modulations of speech. Experimental results indicate that the proposed system achieves considerable gains for normal-hearing and hearing-impaired listeners, in terms of objective intelligibility and quality metrics. The proposed auditory-motivated feature set achieved better objective intelligibility results compared to the conventional log-magnitude spectrogram features, while mixed results were observed for simulated listeners with hearing loss. Finally, we demonstrate that the proposed analysis/synthesis framework provides satisfactory reconstruction accuracy of speech signals.
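The temporal envelope and temporal fine structure (TFS) of a band-limited signal are conventionally separated with the Hilbert transform, which is the usual starting point for vocoder-like processing. A minimal sketch on a toy amplitude-modulated tone follows; the carrier and modulation rates are illustrative, and real systems apply this per band after a filterbank.

```python
import numpy as np
from scipy.signal import hilbert

fs = 16000
t = np.arange(0, 0.5, 1 / fs)
# Toy band-limited signal: 1 kHz carrier with a 4 Hz amplitude modulation
carrier = np.cos(2 * np.pi * 1000 * t)
modulation = 1 + 0.8 * np.sin(2 * np.pi * 4 * t)
x = modulation * carrier

analytic = hilbert(x)                  # analytic signal via FFT
envelope = np.abs(analytic)            # temporal envelope (slow modulation)
tfs = np.cos(np.angle(analytic))       # temporal fine structure (unit-amplitude carrier)

recon = envelope * tfs                 # vocoder-style resynthesis
print(np.max(np.abs(recon - x)))       # exact for this toy signal, up to float error
```

Since `envelope * cos(angle)` is just the real part of the analytic signal, the resynthesis is exact here; in practice the envelope is the part a network such as the paper's GRU would predict, and the TFS is carried over or regenerated.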


2021 ◽  
Vol 150 (3) ◽  
pp. 1663-1673
Author(s):  
Nikhil Shankar ◽  
Gautam Shreedhar Bhat ◽  
Issa M. S. Panahi ◽  
Stephanie Tittle ◽  
Linda M. Thibodeau

Animals ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 417
Author(s):  
Jesús Jaime Moreno Escobar ◽  
Oswaldo Morales Matamoros ◽  
Erika Yolanda Aguilar del Villar ◽  
Ricardo Tejeida Padilla ◽  
Ixchel Lina Reyes ◽  
...  

Dolphin-Assisted Therapies (DAT) are alternative therapies aimed at reducing anxiety and stress and providing physical benefits. This paper focuses on measuring and analyzing a dolphin’s brain activity while DAT is taking place, in order to identify whether there are any differences in the female dolphin’s neuronal signals when interacting with control versus intervention subjects; our research was performed at the Delfiniti facilities in Ixtapa, Mexico. We designed a wireless, portable, single-channel electroencephalographic sensor to acquire and monitor the brain activity of a female bottlenose dolphin. This EEG sensor showed that the dolphin’s activity at rest is characterized by high spectral power in slow-frequency bands. When the dolphin participated in DAT, a 23.53% increment in the 12–30 Hz frequency band was observed, but only for patients with a disease or disorder; with control subjects, the 0.5–4 Hz band remained at 17.91%. Regarding the fractal or self-affine analysis, we found, for all samples studied, that the dolphin’s brain activity initially behaved as a self-affine fractal described by a power law until the voltage fluctuations reached the crossovers, after which they departed from this scaling behavior. Hence, our findings validate the hypothesis that the participation in a DAT of a patient with a certain disease or disorder modifies the usual behavior of a female bottlenose dolphin.
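Band-specific changes such as the reported 12–30 Hz increment are typically computed as relative band power from a Welch power spectral density estimate. A minimal sketch on a synthetic single-channel signal follows; the sampling rate, rhythm frequencies, and noise level are illustrative and are not the dolphin recordings.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(2)
fs = 256.0                               # Hz, hypothetical EEG sampling rate
t = np.arange(0, 30, 1 / fs)

# Toy single-channel "EEG": strong slow (2 Hz) rhythm plus a weaker 20 Hz one
eeg = (2.0 * np.sin(2 * np.pi * 2 * t)
       + 0.5 * np.sin(2 * np.pi * 20 * t)
       + 0.3 * rng.standard_normal(t.size))

def relative_band_power(x, fs, band, total=(0.5, 30.0)):
    """Fraction of PSD power inside `band`, relative to the `total` range."""
    f, psd = welch(x, fs=fs, nperseg=1024)
    in_band = (f >= band[0]) & (f < band[1])
    in_total = (f >= total[0]) & (f < total[1])
    return psd[in_band].sum() / psd[in_total].sum()

delta = relative_band_power(eeg, fs, (0.5, 4.0))    # slow-wave band
beta = relative_band_power(eeg, fs, (12.0, 30.0))   # 12-30 Hz band
print(f"relative power: 0.5-4 Hz {delta:.2f}, 12-30 Hz {beta:.2f}")
```

Comparing these fractions between rest and therapy segments is one straightforward way to express percentage shifts like those the study reports.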


2010 ◽  
Vol 24 (2) ◽  
pp. 131-135 ◽  
Author(s):  
Włodzimierz Klonowski ◽  
Pawel Stepien ◽  
Robert Stepien

Over 20 years ago, Watt and Hameroff (1987) suggested that consciousness may be described as a manifestation of deterministic chaos in the brain/mind. To analyze EEG-signal complexity, we used Higuchi’s fractal dimension in the time domain and symbolic analysis methods. Our analysis of EEG signals under anesthesia, during physiological sleep, and during epileptic seizures leads to a conclusion similar to that of Watt and Hameroff: brain activity, measured by the complexity of the EEG signal, diminishes (becomes less chaotic) when consciousness is being “switched off”. Thus, consciousness may indeed be described as a manifestation of deterministic chaos in the brain/mind.
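Higuchi’s fractal dimension estimates complexity directly in the time domain: curve lengths L(k) are computed over subsampled versions of the series at increasing intervals k, and the dimension is the slope of log L(k) against log(1/k). A minimal sketch of the standard algorithm follows; kmax and the test signals are illustrative, not the study's settings.

```python
import numpy as np

def higuchi_fd(x, kmax=8):
    """Higuchi's fractal dimension of a 1-D time series."""
    x = np.asarray(x, dtype=float)
    N = x.size
    lengths = []
    for k in range(1, kmax + 1):
        Lk = []
        for m in range(k):                      # k interleaved subseries
            idx = np.arange(m, N, k)            # x[m], x[m+k], x[m+2k], ...
            diff = np.abs(np.diff(x[idx])).sum()
            norm = (N - 1) / ((len(idx) - 1) * k)  # Higuchi's length normalisation
            Lk.append(diff * norm / k)
        lengths.append(np.mean(Lk))
    # FD is the slope of log L(k) against log(1/k)
    logk = np.log(1.0 / np.arange(1, kmax + 1))
    slope, _ = np.polyfit(logk, np.log(lengths), 1)
    return slope

rng = np.random.default_rng(3)
fd_noise = higuchi_fd(rng.standard_normal(2000))   # white noise: near 2
fd_line = higuchi_fd(np.linspace(0.0, 1.0, 2000))  # straight line: exactly 1
print(f"noise FD {fd_noise:.2f}, line FD {fd_line:.2f}")
```

The two sanity checks bracket the measure: a maximally irregular series approaches dimension 2, a smooth trend stays at 1, and EEG typically falls in between, dropping as consciousness is suppressed.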


1999 ◽  
Vol 13 (2) ◽  
pp. 117-125 ◽  
Author(s):  
Laurence Casini ◽  
Françoise Macar ◽  
Marie-Hélène Giard

Abstract The experiment reported here was aimed at determining whether the level of brain activity can be related to performance in trained subjects. Two tasks were compared: a temporal and a linguistic task. An array of four letters appeared on a screen. In the temporal task, subjects had to decide whether the letters remained on the screen for a short or a long duration as learned in a practice phase. In the linguistic task, they had to determine whether the four letters could form a word or not (anagram task). These tasks allowed us to compare the level of brain activity obtained in correct and incorrect responses. The current density measures recorded over prefrontal areas showed a relationship between the performance and the level of activity in the temporal task only. The level of activity obtained with correct responses was lower than that obtained with incorrect responses. This suggests that a good temporal performance could be the result of an effective but economical information-processing mechanism in the brain. In addition, the absence of this relation in the anagram task raises the question of whether this relation is specific to the processing of sensory information.


2016 ◽  
Vol 25 (2) ◽  
pp. 225-242
Author(s):  
Cal Revely-Calder

Critics have recently begun to pay attention to the influence Jean Racine's plays had on the work of Samuel Beckett, noting his 1930–31 lectures at Trinity College Dublin, and echoes of Racine in early texts such as Murphy (1938). This essay suggests that as well as the Trinity lectures, Beckett's later re-reading of Racine (in 1956) can be seen as fundamentally influential on his drama. There are moments of direct allusion to Racine's work, as in Oh les beaux jours (1963), where the echoes are easily discernible; but I suggest that soon, in particular with Come and Go (1965), the characteristics of a distinctly Racinian stagecraft become more subtly apparent, in what Danièle de Ruyter has called ‘choix plus spécifiquement théâtraux’: pared-down lighting, carefully-crafted entries and exits, and visual tableaux made increasingly difficult to read. Through an account of Racine's dramaturgy, and the ways in which he structures bodily motion and theatrical talk, I suggest that Beckett's post-1956 drama can be better understood, as stage-spectacles, in the light of Racine's plays; both writers give us, in Myriam Jeantroux's phrase, the complicated spectacle of ‘un lieu à la fois désert et clôturé’. As spectators to Beckett's drama, by keeping Racine in mind we can come to understand better the limitations of that spectatorship, and how the later plays trouble our ability to see – and interpret – the figures that move before us.

