vocal signals
Recently Published Documents


TOTAL DOCUMENTS

179
(FIVE YEARS 41)

H-INDEX

24
(FIVE YEARS 2)

2021 ◽  
Vol 12 ◽  
Author(s):  
Michel Bürgel ◽  
Lorenzo Picinali ◽  
Kai Siedenburg

Listeners can attend to and track instruments or singing voices in complex musical mixtures, even though the acoustical energy of sounds from individual instruments may overlap in time and frequency. In popular music, lead vocals are often accompanied by sound mixtures from a variety of instruments, such as drums, bass, keyboards, and guitars. However, little is known about how the perceptual organization of such musical scenes is affected by selective attention, and which acoustic features play the most important role. To investigate these questions, we explored the role of auditory attention in a realistic musical scenario. We conducted three online experiments in which participants detected single cued instruments or voices in multi-track musical mixtures. Stimuli consisted of 2-s multi-track excerpts of popular music. In one condition, the target cue preceded the mixture, allowing listeners to selectively attend to the target. In another condition, the target was presented after the mixture, requiring a more “global” mode of listening. Performance differences between these two conditions were interpreted as effects of selective attention. In Experiment 1, detection performance generally depended on the target’s instrument category, but listeners were more accurate when the target was presented before the mixture rather than after it. Lead vocals were nearly unaffected by this change in presentation order and achieved the highest accuracy of all instruments, suggesting a particular salience of vocal signals in musical mixtures. In Experiment 2, filtering was used to avoid potential spectral masking of target sounds. Although detection accuracy increased for all instruments, a similar pattern of instrument-specific differences between presentation orders was observed.
In Experiment 3, adjusting the sound level differences between the targets reduced the effect of presentation order but did not affect the differences between instruments. While both acoustic manipulations facilitated target detection, vocal signals remained particularly salient, suggesting that the manipulated features do not account for vocal salience. These findings demonstrate that lead vocals serve as robust attractor points of auditory attention regardless of the manipulation of low-level acoustic cues.


Author(s):  
Poovarasan Selvaraj ◽  
E. Chandra

In Speech Enhancement (SE), the major challenge is suppressing non-stationary noises, including white noise, in real-time application scenarios. Many techniques have been developed for enhancing vocal signals; however, they have not suppressed non-stationary noises well, and they have high time and resource consumption. To address this, the Sliding Window Empirical Mode Decomposition and Hurst (SWEMDH)-based SE method was developed, in which the speech signal is decomposed into Intrinsic Mode Functions (IMFs) over a sliding window and the noise factor in each IMF is identified from the Hurst exponent; the least corrupted IMFs are then used to restore the vocal signal. However, this technique is not suitable for white-noise scenarios. Therefore, in this paper, a Variant of Variational Mode Decomposition (VVMD) combined with the SWEMDH technique is proposed to reduce complexity in real-time applications. The key objective of the proposed SWEMD-VVMDH technique is to select IMFs based on the Hurst exponent and then apply the VVMD technique to suppress both low- and high-frequency noise factors in the vocal signals. First, the noisy vocal signal is decomposed into IMFs using the SWEMDH technique. Then, the Hurst exponent is computed to identify the IMFs with low-frequency noise factors, and Narrow-Band Components (NBC) are computed to identify the IMFs with high-frequency noise factors. Next, VVMD is applied to the sum of all selected IMFs to remove both low- and high-frequency noise factors. Thus, speech signal quality is improved under non-stationary noises, including additive white Gaussian noise. Finally, the experimental outcomes demonstrate significant speech signal improvement in both non-stationary and white-noise environments.
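The Hurst-exponent test used here to flag noise-dominated IMFs can be sketched with a standard rescaled-range (R/S) estimator. This is an illustrative sketch, not the authors' implementation: the function name, window-halving scheme, and the use of H ≈ 0.5 as the white-noise signature are assumptions for demonstration.

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Estimate the Hurst exponent of a 1-D signal via rescaled-range (R/S) analysis.

    H near 0.5 indicates uncorrelated (white-noise-like) behaviour; H close to 1
    indicates persistent, signal-like structure.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    sizes, rs_vals = [], []
    size = n
    while size >= min_chunk:
        rs = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviation from mean
            r = dev.max() - dev.min()               # range of cumulative deviations
            s = chunk.std()
            if s > 0:
                rs.append(r / s)
        if rs:
            sizes.append(size)
            rs_vals.append(np.mean(rs))
        size //= 2
    # Slope of log(R/S) versus log(window size) estimates H
    h, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return h

# Hypothetical usage: a white-noise-like IMF should score lower than a
# persistent, signal-like IMF
rng = np.random.default_rng(0)
noise_imf = rng.standard_normal(1024)                 # white-noise-like component
trend_imf = np.cumsum(rng.standard_normal(1024))      # persistent component
print(hurst_rs(noise_imf) < hurst_rs(trend_imf))
```

In an SWEMDH-style pipeline, IMFs whose estimated H stays near 0.5 would be treated as noise-dominated and excluded (or passed to further denoising) before reconstruction.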


Author(s):  
Theresa Matzinger ◽  
W. Tecumseh Fitch

Voice modulatory cues such as variations in fundamental frequency, duration and pauses are key factors for structuring vocal signals in human speech and vocal communication in other tetrapods. Voice modulation physiology is highly similar in humans and other tetrapods due to shared ancestry and shared functional pressures for efficient communication. This has led to similarly structured vocalizations across humans and other tetrapods. Nonetheless, in their details, structural characteristics may vary across species and languages. Because data concerning voice modulation in non-human tetrapod vocal production and especially perception are relatively scarce compared to human vocal production and perception, this review focuses on voice modulatory cues used for speech segmentation across human languages, highlighting comparative data where available. Cues that are used similarly across many languages may help indicate which cues may result from physiological or basic cognitive constraints, and which cues may be employed more flexibly and are shaped by cultural evolution. This suggests promising candidates for future investigation of cues to structure in non-human tetrapod vocalizations. This article is part of the theme issue ‘Voice modulation: from origin and mechanism to social impact (Part I)’.


2021 ◽  
Author(s):  
Joanna Furmankiewicz ◽  
Gareth Jones

Communication between group members is mediated by a diverse range of signals. Contact calls are produced by many species of birds and mammals to maintain group cohesion and associations among individuals. Contact calls in bats are typically relatively low-frequency social calls, produced only for communication. However, echolocation calls (higher in frequency and used primarily for orientation and prey detection) can also facilitate interaction among individuals and location of conspecifics in the roost. We studied the calling behaviour of brown long-eared bats (Plecotus auritus) during return to maternity roosts in response to playbacks of social and echolocation calls. We hypothesised that calling by conspecifics would elicit responses in colony members. Bat responses (inspection flights and social call production) were significantly higher during social call and echolocation call playbacks than during noise (control) playbacks. We suggest that social calling in maternity roosts of the brown long-eared bat evolved to maintain associations among roostmates, rather than to find roosts or roostmates, because this species is strongly faithful to roosts and its social groups and roosts are stable over time and space. Living in a stable social group requires recognition of group members and the formation of social bonds with them, features that may be mediated by vocal signals.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Aurore Cazala ◽  
Catherine Del Negro ◽  
Nicolas Giret

The ability of the auditory system to selectively recognize natural sound categories while maintaining a certain degree of tolerance towards variations within these categories, which may have functional roles, is thought to be crucial for vocal communication. To date, it is still largely unknown how the balance between tolerance and sensitivity to variations in acoustic signals is coded at a neuronal level. Here, we investigate whether neurons in a high-order auditory area in zebra finches, a songbird species, are sensitive to natural variations in vocal signals by recording their responses to repeated exposures to identical and variant sound sequences. We used the songs of male birds, which tend to be highly repetitive with only subtle variations between renditions. When playing these songs to both anesthetized and awake birds, we found that variations between songs did not affect the neuronal firing rate but did affect the temporal reliability of responses. This suggests that auditory processing operates on a range of distinct timescales: a short one to detect variations in vocal signals, and longer ones that allow the birds to tolerate variations in vocal signal structure and to encode the global context.


2021 ◽  
Author(s):  
Lingyun Zhao ◽  
Xiaoqin Wang

Vocal communication is essential for social behaviors in humans and many non-human primates. While the frontal cortex has been shown to play a crucial role in human speech production, its role in vocal production in non-human primates has long been questioned. Recent studies have shown activation of single neurons in the monkey frontal cortex during vocal production in a relatively isolated environment. However, little is known about how the frontal cortex is engaged in vocal production in an ethologically relevant social context, where different types of vocal signals are produced for various communication purposes. Here we studied single-neuron activity and local field potentials (LFPs) in the frontal cortex of marmoset monkeys while the animals engaged in vocal exchanges with conspecifics in a social environment. Marmosets most frequently produced four types of vocalizations with distinct acoustic structures, three of which were typically not produced in isolation. We found that both single-neuron activity and LFPs were modulated by the production of each of the four call types. Moreover, the neural modulations in the frontal cortex showed distinct patterns for different call types, suggesting a representation of vocal signal features. In addition, we found that theta-band LFP oscillations were phase-locked to the phrases of twitter calls, which indicates coordination with the temporal structure of vocalizations. Our results suggest important functions of the marmoset frontal cortex in supporting the production of diverse vocalizations in vocal communication.
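Phase-locking of theta-band LFP oscillations to call phrases, as reported above, is commonly quantified with a phase-locking value (PLV). The sketch below is illustrative only: the function name, theta band (4-8 Hz), filter order, and sampling rate are assumptions, not details from the study.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def phase_locking_value(lfp, event_idx, fs, band=(4.0, 8.0)):
    """Phase-locking value of band-limited LFP phase at event times.

    Band-pass filters the LFP (default: theta, 4-8 Hz), extracts the
    instantaneous phase via the Hilbert transform, and measures how
    consistently the events (e.g. twitter-call phrase onsets) fall at the
    same phase: 0 = no locking, 1 = perfect locking.
    """
    nyq = fs / 2.0
    b, a = butter(4, [band[0] / nyq, band[1] / nyq], btype="band")
    phase = np.angle(hilbert(filtfilt(b, a, lfp)))   # zero-phase filtering
    return float(np.abs(np.mean(np.exp(1j * phase[event_idx]))))

# Synthetic check: a noisy 6 Hz oscillation with one "phrase onset" per cycle,
# planted at the oscillation peaks
fs = 1000
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(1)
lfp = np.sin(2 * np.pi * 6 * t) + 0.3 * rng.standard_normal(len(t))
event_idx = np.round(np.arange(1 / 24, 10, 1 / 6) * fs).astype(int)  # cycle peaks
print(phase_locking_value(lfp, event_idx, fs))  # near 1 for phase-locked events
```

Events scattered at random sample indices would instead yield a PLV near zero, which is the usual control comparison for this statistic.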


2021 ◽  
Vol 15 ◽  
Author(s):  
Arianna Gentile Polese ◽  
Sunny Nigam ◽  
Laura M. Hurley

Neuromodulatory systems may provide information on social context to auditory brain regions, but relatively few studies have assessed the effects of neuromodulation on auditory responses to acoustic social signals. To address this issue, we measured the influence of the serotonergic system on the responses of neurons in a mouse auditory midbrain nucleus, the inferior colliculus (IC), to vocal signals. Broadband vocalizations (BBVs) are human-audible signals produced by mice in distress as well as by female mice in opposite-sex interactions. The production of BBVs is context-dependent in that they are produced both at early stages of interactions as females physically reject males and at later stages as males mount females. Serotonin in the IC of males corresponds to these events, and is elevated more in males that experience less female rejection. We measured the responses of single IC neurons to five recorded examples of BBVs in anesthetized mice. We then locally activated the 5-HT1A receptor through iontophoretic application of 8-OH-DPAT. IC neurons showed little selectivity for different BBVs, but spike trains were characterized by local regions of high spike probability, which we called “response features.” Response features varied across neurons and also across calls for individual neurons, ranging from 1 to 7 response features for responses of single neurons to single calls. 8-OH-DPAT suppressed spikes and also reduced the numbers of response features. The weakest response features were the most likely to disappear, suggestive of an “iceberg”-like effect in which activation of the 5-HT1A receptor suppressed weakly suprathreshold response features below the spiking threshold. Because serotonin in the IC is more likely to be elevated for mounting-associated BBVs than for rejection-associated BBVs, these effects of the 5-HT1A receptor could contribute to the differential auditory processing of BBVs in different behavioral subcontexts.
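The "response features" described above, local regions of high spike probability across trials, can be illustrated with a simple thresholded PSTH. This is a hypothetical sketch: the function name, bin structure, and thresholds are assumptions, not the study's analysis; the second call mimics the iceberg effect, where raising the effective threshold removes the weakest feature first.

```python
import numpy as np

def response_features(spike_trains, threshold=0.5):
    """Find contiguous bins where spike probability across trials exceeds a threshold.

    spike_trains: (n_trials, n_bins) binary array of binned spikes.
    Returns a list of (start_bin, end_bin) intervals (end exclusive).
    """
    p = spike_trains.mean(axis=0)          # per-bin spike probability (PSTH)
    above = p > threshold
    features, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                      # feature begins
        elif not flag and start is not None:
            features.append((start, i))    # feature ends
            start = None
    if start is not None:
        features.append((start, len(above)))
    return features

# Toy example: 10 trials with a reliable feature in bins 3-5 and a weaker
# one in bins 12-13 (present on only 8/10 trials)
trials = np.zeros((10, 20), dtype=int)
trials[:, 3:6] = 1
trials[:8, 12:14] = 1
print(response_features(trials))           # [(3, 6), (12, 14)]
print(response_features(trials, 0.9))      # weak feature suppressed: [(3, 6)]
```

Raising the threshold here plays the role of 5-HT1A-mediated suppression in the abstract: the weakly suprathreshold feature drops below threshold while the strong one survives.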


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12085
Author(s):  
Marie Souhaut ◽  
Monika W. Shields

The endangered Southern Resident killer whales (Orcinus orca) of the northeast Pacific region use two main types of vocal signals to communicate: discrete calls and whistles. Despite being one of the most-studied cetacean populations in the world, their whistles have not been as heavily analyzed due to their relatively low occurrence compared to discrete calls. The aim of the current study is to further investigate the whistle repertoire and characteristics of the Southern Resident killer whale population. Acoustic data were collected during 2006–2007 and 2015–2017 in the waters around San Juan Island, Washington State, USA, from boats and from shore. A total of 228 whistles were extracted and analyzed, with 53.5% of them found to be stereotyped. Three of the four stereotyped whistles identified by a previous study using recordings from 1979–1982 were still occurring, demonstrating that whistles are vocalizations that remain stable over a period of more than 35 years. The presence of three new stereotyped whistles was also documented. These results demonstrate that whistles share the longevity and vocal tradition of discrete calls, and warrant further study as a key element of Southern Resident killer whale communication and cultural transmission.


2021 ◽  
Author(s):  
Aurore Cazala ◽  
Catherine Del Negro ◽  
Nicolas Giret

The ability of the auditory system to selectively recognize natural sound categories with a tolerance to variations within categories is thought to be crucial for vocal communication. Subtle variations, however, may have functional roles. To date, how this balance between tolerance and sensitivity to variation in acoustic signals is coded at the neuronal level remains poorly understood. We investigated whether neurons of a high-order auditory area in a songbird species, the zebra finch, are sensitive to natural variations in vocal signals by recording responses to repeated exposure to similar and variant sound sequences. We took advantage of the intensive repetition of male songs, which subtly vary from rendition to rendition. In both anesthetized and awake birds, responses based on firing rate during sequence presentation did not show any clear sensitivity to these variations, whereas the temporal reliability of responses, assessed at a 10-millisecond resolution, depended on whether variant or similar sequences were broadcast and on the context of presentation. The results therefore suggest that auditory processing operates on distinct timescales: a short one to detect variations in an individual's vocal signals, and longer ones that allow tolerance in vocal signal structure and the encoding of the global context.
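Trial-to-trial temporal reliability at a fixed bin resolution, as used above, is often computed as the mean pairwise correlation between binned spike trains. The sketch below is an illustrative version of that generic measure, not the authors' exact analysis; the function name and the toy spike data are assumptions, with the 10 ms bin width taken from the abstract.

```python
import numpy as np

def temporal_reliability(spike_times, duration, bin_ms=10.0):
    """Mean pairwise Pearson correlation between trials' binned spike trains.

    spike_times: list of 1-D arrays of spike times in seconds, one per trial.
    Values near 1 mean spikes occur at the same moments on every trial.
    """
    n_bins = int(np.ceil(duration / (bin_ms / 1000.0)))
    edges = np.linspace(0.0, duration, n_bins + 1)
    binned = np.array([np.histogram(st, bins=edges)[0] for st in spike_times],
                      dtype=float)
    corrs = []
    for i in range(len(binned)):
        for j in range(i + 1, len(binned)):
            a, b = binned[i], binned[j]
            if a.std() > 0 and b.std() > 0:           # skip empty/constant trains
                corrs.append(np.corrcoef(a, b)[0, 1])
    return float(np.mean(corrs)) if corrs else 0.0

# Identical spike patterns across trials give maximal reliability; temporally
# jittered patterns (50 ms jitter vs. 10 ms bins) give lower reliability
rng = np.random.default_rng(2)
base = np.sort(rng.uniform(0, 2.0, 40))                   # 40 spikes in 2 s
identical = [base.copy() for _ in range(5)]
jittered = [np.clip(base + rng.normal(0, 0.05, 40), 0, 2.0) for _ in range(5)]
print(temporal_reliability(identical, 2.0))               # ~1.0
print(temporal_reliability(jittered, 2.0))                # markedly lower
```

Under a measure like this, firing rate can stay constant while reliability drops, which is the dissociation the abstract reports between rate-based and timing-based responses.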


Bioacoustics ◽  
2021 ◽  
pp. 1-18
Author(s):  
Daniella Teixeira ◽  
Richard Hill ◽  
Michael Barth ◽  
Martine Maron ◽  
Berndt J. van Rensburg
