Fast and accurate annotation of acoustic signals with deep neural networks

2021 ◽  
Author(s):  
Elsa Steinfath ◽  
Adrian Palacios ◽  
Julian Rottschaefer ◽  
Deniz Yuezak ◽  
Jan Clemens

Acoustic signals serve communication within and across species throughout the animal kingdom. Studying the genetics, evolution, and neurobiology of acoustic communication requires annotating acoustic signals: segmenting and identifying individual acoustic elements like syllables or sound pulses. To be useful, annotations need to be accurate, robust to noise, and fast. We introduce DeepSS, a method that annotates acoustic signals across species based on a deep-learning derived hierarchical representation of sound. We demonstrate the accuracy, robustness, and speed of DeepSS using acoustic signals with diverse characteristics: courtship song from flies, ultrasonic vocalizations of mice, and syllables with complex spectrotemporal structure from birds. DeepSS comes with a graphical user interface for annotating song, training the network, and for generating and proofreading annotations (available at https://janclemenslab.org/deepss). The method can be trained to annotate signals from new species with little manual annotation and can be combined with unsupervised methods to discover novel signal types. DeepSS annotates song with high throughput and low latency, allowing real-time annotations for closed-loop experimental interventions.

eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Elsa Steinfath ◽  
Adrian Palacios-Muñoz ◽  
Julian R Rottschäfer ◽  
Deniz Yuezak ◽  
Jan Clemens

Acoustic signals serve communication within and across species throughout the animal kingdom. Studying the genetics, evolution, and neurobiology of acoustic communication requires annotating acoustic signals: segmenting and identifying individual acoustic elements like syllables or sound pulses. To be useful, annotations need to be accurate, robust to noise, and fast. We here introduce DeepAudioSegmenter (DAS), a method that annotates acoustic signals across species based on a deep-learning derived hierarchical representation of sound. We demonstrate the accuracy, robustness, and speed of DAS using acoustic signals with diverse characteristics from insects, birds, and mammals. DAS comes with a graphical user interface for annotating song, training the network, and for generating and proofreading annotations. The method can be trained to annotate signals from new species with little manual annotation and can be combined with unsupervised methods to discover novel signal types. DAS annotates song with high throughput and low latency for experimental interventions in real time. Overall, DAS is a universal, versatile, and accessible tool for annotating acoustic communication signals.
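The segmentation step described here — turning a network's per-sample class confidences into labelled syllable or pulse annotations — can be sketched as follows. This is a minimal illustration with a hypothetical helper (`probabilities_to_segments`), not DAS's actual API:

```python
import numpy as np

def probabilities_to_segments(probs, names, threshold=0.5):
    """Turn per-sample class probabilities (samples x classes) into
    (name, onset, offset) annotations. Class 0 is treated as noise/background."""
    labels = np.where(probs.max(axis=1) >= threshold,
                      probs.argmax(axis=1), 0)
    segments = []
    start = None
    for i, lab in enumerate(labels):
        if lab != 0 and start is None:
            # a new acoustic element begins
            start, current = i, lab
        elif start is not None and lab != current:
            # the current element ends (silence or a different class)
            segments.append((names[current], start, i))
            start = None if lab == 0 else i
            current = lab
    if start is not None:
        segments.append((names[current], start, len(labels)))
    return segments
```

Given per-sample probabilities for classes ("noise", "pulse", "sine"), this yields segments such as `("pulse", onset_sample, offset_sample)`, which is the format a proofreading GUI can then display and edit.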


2000 ◽  
Vol 75 (1) ◽  
pp. 37-45 ◽  
Author(s):  
ANNELI HOIKKALA ◽  
SELIINA PÄÄLLYSAHO ◽  
JOUNI ASPI ◽  
JAAKKO LUMME

The males of six species of the Drosophila virilis group (including D. virilis) keep their wings extended while producing a train of sound pulses, where the pulses follow each other without any pause. The males of the remaining five species of the group produce only one sound pulse during each wing extension/vibration, which results in species-specific songs with long pauses (in D. littoralis about 300 ms) between successive sound pulses. Genetic analyses of the differences between the songs of D. virilis and D. littoralis showed that species-specific song traits are affected by genes on the X chromosome, and for the length of pause, also by genes on chromosomes 3 and 4. The X chromosomal genes having a major impact on pulse and pause length were tightly linked with white, apricot and notched marker genes located at the proximal third of the chromosome. A large inversion in D. littoralis, marked by notched, prevents more precise localization of these genes by classical crossing methods.


2020 ◽  
Vol 16 (4) ◽  
pp. 20190928 ◽  
Author(s):  
Ella Z. Lattenkamp ◽  
Sonja C. Vernes ◽  
Lutz Wiegrebe

Vocal production learning (VPL), or the ability to modify vocalizations through the imitation of sounds, is a rare trait in the animal kingdom. While humans are exceptional vocal learners, few other mammalian species share this trait. Owing to their singular ecology and lifestyle, bats are highly specialized for the precise emission and reception of acoustic signals. This specialization makes them ideal candidates for the study of vocal learning, and several bat species have previously shown evidence supportive of vocal learning. Here we use a sophisticated automated set-up and a contingency training paradigm to explore the vocal learning capacity of pale spear-nosed bats. We show that these bats are capable of directional change of the fundamental frequency of their calls according to an auditory target. With this study, we further highlight the importance of bats for the study of vocal learning and provide evidence for the VPL capacity of the pale spear-nosed bat.


Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3530 ◽  
Author(s):  
Juan Parras ◽  
Santiago Zazo ◽  
Iván A. Pérez-Álvarez ◽  
José Luis Sanz González

In recent years, there has been a significant effort towards developing localization systems for the underwater medium, with current methods relying on anchor nodes, explicitly modeling the underwater channel, or cooperation from the target. Lately, there has also been some work on using the approximation capabilities of Deep Neural Networks to address this problem. In this work, we study how the localization precision of Deep Neural Networks is affected by the variability of the channel, the noise level at the receiver, the number of neurons in the network, and the use of either the power or the covariance of the received acoustic signals. Our study shows that deep neural networks are a valid approach when the channel variability is low, which opens the door to further research into such localization methods for the underwater environment.
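One of the network inputs compared above — the covariance of the received signals — can be sketched as a featurization step. This is a minimal illustration assuming a multichannel hydrophone recording; the exact preprocessing in the paper may differ:

```python
import numpy as np

def covariance_features(signals):
    """Flatten the upper triangle of the sample covariance of a
    (channels x samples) recording into a feature vector, an
    alternative network input to the received power alone."""
    cov = np.cov(signals)               # channels x channels matrix
    iu = np.triu_indices(cov.shape[0])  # covariance is symmetric, so
    return cov[iu]                      # the upper triangle suffices
```

For a 4-hydrophone array this yields a 10-element vector (4 variances plus 6 cross-channel covariances), which a network can map to a position estimate.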


2021 ◽  
Author(s):  
Aditya Saluja

Fused Filament Fabrication (FFF) is an additive manufacturing technique commonly used in industry to produce complicated structures sustainably. Although promising, the technology frequently suffers from defects, including warp deformation, which compromises the structural integrity of the component and, in extreme cases, the printer itself. To avoid the adverse effects of warp deformation, this thesis explores the implementation of deep neural networks to form a closed-loop in-process monitoring architecture using Convolutional Neural Networks (CNNs) capable of pausing a printer once a warp is detected. Any neural network, including a CNN, depends on its hyperparameters. Hyperparameters can be optimized using either a manual or an automated approach. A manual approach, although easier to program, is often time-consuming, inaccurate, and computationally inefficient, necessitating an automated approach. To evaluate this statement, classification models were optimized through both approaches and tested in a laboratory-scale manufacturing environment. The automated approach utilized a Bayesian-based optimizer, yielding a mean accuracy of 100%, significantly higher than the 36% achieved by the manual approach.
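The manual-versus-automated contrast drawn here can be illustrated with a toy automated search loop. The thesis uses a Bayesian optimizer; random search is shown below only because it fits in a few lines, and the search space and scoring function are hypothetical stand-ins:

```python
import random

def random_search(objective, space, n_trials=20, seed=0):
    """Toy automated hyperparameter search: sample configurations
    from the space and keep the best-scoring one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Hypothetical CNN search space and a stand-in for validation accuracy.
space = {"filters": [16, 32, 64], "lr": [1e-2, 1e-3, 1e-4], "kernel": [3, 5]}
fake_accuracy = lambda cfg: 1.0 - abs(cfg["lr"] - 1e-3) - 0.001 * abs(cfg["filters"] - 32)
```

A Bayesian optimizer improves on this by fitting a surrogate model to past (configuration, score) pairs and proposing the next trial where expected improvement is highest, which is why it typically needs far fewer expensive training runs than either manual tuning or random sampling.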


2011 ◽  
Vol 57 (2) ◽  
pp. 187-196 ◽  
Author(s):  
Christopher James Clark

Abstract Multi-component signals contain multiple signal parts expressed in the same physical modality. One way to identify individual components is if they are produced by different physical mechanisms. Here, I studied the mechanisms generating acoustic signals in the courtship displays of the Calliope hummingbird Stellula calliope. Display dives consisted of three synchronized sound elements: a high-frequency tone (hft), a low-frequency tone (lft), and atonal sound pulses (asp), which were then followed by a frequency-modulated fall. Manipulating any of the rectrices (tail-feathers) of wild males impaired production of the lft and asp but not the hft or fall, which are apparently vocal. I tested the sound production capabilities of the rectrices in a wind tunnel. Single rectrices could generate the lft but not the asp, whereas multiple rectrices tested together produced sounds similar to the asp when they fluttered and collided with their neighbors percussively, representing a previously unknown mechanism of sound production. During the shuttle display, a trill is generated by the wings during pulses in which the wingbeat frequency is elevated to 95 Hz, 40% higher than the typical hovering wingbeat frequency. The Calliope hummingbird courtship displays thus include sounds produced by three independent mechanisms, and therefore a minimum of three acoustic signal components. These acoustic mechanisms have different constraints and thus potentially carry different messages. Producing multiple acoustic signals via multiple mechanisms may be a way to escape the constraints present in any single mechanism.


2021 ◽  
Author(s):  
Vidushi Pathak ◽  
Elsa Juan ◽  
Reina van der Goot ◽  
Lucia Talamini

Study Objective: Sleep is critical for physical and mental health. However, sleep disruption due to noise is a growing problem, causing long-lasting distress and fragilizing entire populations mentally and physically. Here, for the first time, we tested an innovative and non-invasive potential countermeasure for sleep disruptions due to noise. Methods: We developed a new, modeling-based, closed-loop acoustic neurostimulation procedure (CLNS) to precisely phase-lock stimuli to slow oscillations (SO). We used CLNS to align soft sound pulses to the start of the SO positive deflection to boost SO and sleep spindles during non-rapid eye movement (NREM) sleep. Participants underwent three overnight EEG recordings. The first night served to determine each participant's individual noise arousal threshold. The remaining two nights occurred in counterbalanced order: in the Disturbing night, loud, real-life noises were repeatedly presented; in the Intervention night, similar loud noises were played while using the CLNS to boost SO. All experimental manipulations were performed in the first three hours of sleep; participants slept undisturbed for the rest of the night. Results: In contrast to the Disturbing night, the probability of arousals caused by noise was significantly decreased in the Intervention night. Moreover, the CLNS intervention increased NREM duration and sleep spindle power across the night. Conclusions: These results show that our CLNS procedure can effectively protect sleep from disruptions caused by noise. Remarkably, even in the presence of loud environmental noise, CLNS's soft and precisely timed sound pulses played a beneficial role in protecting sleep continuity. This represents the first successful attempt at using CLNS in a noisy environment.
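The phase-locking idea — delivering a pulse at the start of the SO positive deflection — can be sketched offline as detecting negative-to-positive zero crossings of the slow-oscillation band. This is a simplified illustration only; the actual CLNS procedure uses model-based phase prediction to time stimuli in real time:

```python
import numpy as np

def so_positive_onsets(eeg, fs, hi=1.5):
    """Return sample indices where a slow oscillation's positive
    deflection begins (negative-to-positive zero crossings of the
    SO-band signal). Pulses scheduled at these indices would be
    phase-locked to the SO."""
    # crude SO-band isolation via moving-average smoothing
    win = int(fs / hi)
    kernel = np.ones(win) / win
    smoothed = np.convolve(eeg - eeg.mean(), kernel, mode="same")
    # upward zero crossings mark the start of the positive deflection
    rising = (smoothed[:-1] < 0) & (smoothed[1:] >= 0)
    return np.flatnonzero(rising) + 1
```

On a synthetic 0.8 Hz slow oscillation sampled at 100 Hz, the detected onsets fall at the expected 1.25 s intervals; in a real closed-loop system the crossing must be predicted ahead of time to compensate for filter and output latency.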


Author(s):  
Gonçalo Lopes ◽  
Karolina Farrell ◽  
Edward A. B. Horrocks ◽  
Chi-Yu Lee ◽  
Mai M. Morimoto ◽  
...  

Real-time rendering of closed-loop visual environments is necessary for next-generation understanding of brain function and behaviour, but is prohibitively difficult for non-experts to implement and is limited to few laboratories worldwide. We developed BonVision as an easy-to-use open-source software for the display of virtual or augmented reality, as well as standard visual stimuli. As the architecture is based on the open-source Bonsai graphical programming language, BonVision benefits from native integration with experimental hardware. BonVision therefore enables easy implementation of closed-loop experiments, including real-time interaction with deep neural networks and communication with behavioural and physiological measurement and manipulation devices.


Author(s):  
Roger K Moore

Recent years have seen an explosion in the availability of Voice User Interfaces. However, user surveys suggest that there are issues with respect to usability, and it has been hypothesised that contemporary voice-enabled systems are missing crucial behaviours relating to user engagement and vocal interactivity. Yet such ostensive behaviours are ubiquitous in the animal kingdom, where vocalisation provides a means through which interaction may be coordinated and managed between individuals and within groups. Hence, this paper reports results from a study aimed at identifying generic mechanisms that might underpin coordinated collective vocal behaviour, with a particular focus on closed-loop negative-feedback control as a powerful regulatory process. A computer-based real-time simulation of vocal interactivity is described which has provided a number of insights, including the enumeration of several key control variables that may be worthy of further investigation.
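The closed-loop negative-feedback mechanism at the heart of the study can be illustrated with a toy model of one vocalising agent (this is a hypothetical sketch, not the paper's simulator): the agent compares its perceived signal-to-noise margin with a target and corrects its output in proportion to the error.

```python
def vocal_feedback_loop(target, noise_levels, gain=0.5):
    """Simulate an agent regulating vocal amplitude by negative
    feedback: at each step, adjust output to reduce the error
    between the target margin and the current signal-over-noise."""
    amplitude = 0.0
    history = []
    for noise in noise_levels:
        error = target - (amplitude - noise)  # shortfall of signal over noise
        amplitude += gain * error             # negative-feedback correction
        history.append(amplitude)
    return history
```

With constant background noise, the amplitude converges to noise level plus target margin (a Lombard-like effect); the gain is one of the control variables such a simulation lets one enumerate and vary.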


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Gonçalo Lopes ◽  
Karolina Farrell ◽  
Edward A B Horrocks ◽  
Chi Yu Lee ◽  
Mai M Morimoto ◽  
...  

Real-time rendering of closed-loop visual environments is important for next-generation understanding of brain function and behaviour, but is often prohibitively difficult for non-experts to implement and is limited to few laboratories worldwide. We developed BonVision as an easy-to-use open-source software for the display of virtual or augmented reality, as well as standard visual stimuli. BonVision has been tested on humans and mice, and is capable of supporting new experimental designs in other animal models of vision. As the architecture is based on the open-source Bonsai graphical programming language, BonVision benefits from native integration with experimental hardware. BonVision therefore enables easy implementation of closed-loop experiments, including real-time interaction with deep neural networks, and communication with behavioural and physiological measurement and manipulation devices.

