Invariance in pitch perception

2022 ◽  
Author(s):  
Malinda J McPherson ◽  
Josh H McDermott

Information in speech and music is often conveyed through changes in fundamental frequency (f0), the perceptual correlate of which is known as "pitch". One challenge of extracting this information is that such sounds can also vary in their spectral content due to the filtering imposed by a vocal tract or instrument body. Pitch is envisioned as invariant to spectral shape, potentially providing a solution to this challenge, but the extent and nature of this invariance remain poorly understood. We examined the extent to which human pitch judgments are invariant to spectral differences between natural sounds. Listeners performed up/down and interval discrimination tasks with spoken vowels, instrument notes, or synthetic tones, synthesized to be either harmonic or inharmonic (lacking a well-defined f0). Listeners were worse at discriminating pitch across different vowel and instrument sounds compared to when vowels/instruments were the same, being biased by differences in the spectral centroids of the sounds being compared. However, there was no interaction between this effect and that of inharmonicity. In addition, this bias decreased when sounds were separated by short delays. This finding suggests that the representation of a sound's pitch is itself unbiased, but that pitch comparisons between sounds are influenced by changes in timbre, the effect of which weakens over time. Pitch representations thus appear to be relatively invariant to spectral shape, but relative pitch judgments are not, even when spectral shape variation is naturalistic and when such judgments are based on representations of the f0.
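The two stimulus properties named in this abstract, spectral centroid and harmonicity, can be illustrated with a small sketch. This is not the authors' stimulus code: the function names, the 1/k amplitude rolloff, and the jitter rule (each partial above the first shifted by a random fraction of f0) are assumptions chosen only to show the idea that two tones can share an f0 while differing in centroid, and that jittering partials removes the common f0.

```python
import random


def harmonic_partials(f0, n_partials):
    """Frequencies of a harmonic tone: integer multiples of f0."""
    return [f0 * k for k in range(1, n_partials + 1)]


def inharmonic_partials(f0, n_partials, max_jitter=0.3, seed=0):
    """Shift each partial above the first by a random fraction of f0.

    The jitter rule and its size are illustrative assumptions.
    """
    rng = random.Random(seed)
    partials = [f0]
    for k in range(2, n_partials + 1):
        partials.append(f0 * (k + rng.uniform(-max_jitter, max_jitter)))
    return partials


def spectral_centroid(freqs, amps):
    """Amplitude-weighted mean frequency of a set of partials."""
    return sum(f * a for f, a in zip(freqs, amps)) / sum(amps)


freqs = harmonic_partials(200.0, 10)
bright = [1.0] * 10                     # flat spectral envelope
dark = [1.0 / k for k in range(1, 11)]  # amplitudes fall off as 1/k

centroid_bright = spectral_centroid(freqs, bright)
centroid_dark = spectral_centroid(freqs, dark)
jittered = inharmonic_partials(200.0, 10)
```

Both envelopes are applied to the same 200 Hz partials, so any difference between `centroid_bright` and `centroid_dark` reflects spectral shape alone.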

2020 ◽  
Vol 63 (4) ◽  
pp. 931-947
Author(s):  
Teresa L. D. Hardy ◽  
Carol A. Boliek ◽  
Daniel Aalto ◽  
Justin Lewicke ◽  
Kristopher Wells ◽  
...  

Purpose The purpose of this study was twofold: (a) to identify a set of communication-based predictors (including both acoustic and gestural variables) of masculinity–femininity ratings and (b) to explore differences in ratings between audio and audiovisual presentation modes for transgender and cisgender communicators. Method The voices and gestures of a group of cisgender men and women (n = 10 of each) and transgender women (n = 20) communicators were recorded while they recounted the story of a cartoon using acoustic and motion capture recording systems. A total of 17 acoustic and gestural variables were measured from these recordings. A group of observers (n = 20) rated each communicator's masculinity–femininity based on 30- to 45-s samples of the cartoon description presented in three modes: audio, visual, and audiovisual. Visual and audiovisual stimuli contained point light displays standardized for size. Ratings were made using a direct magnitude estimation scale without modulus. Communication-based predictors of masculinity–femininity ratings were identified using multiple regression, and analysis of variance was used to determine the effect of presentation mode on perceptual ratings. Results Fundamental frequency, average vowel formant, and sound pressure level were identified as significant predictors of masculinity–femininity ratings for these communicators. Communicators were rated significantly more feminine in the audio than the audiovisual mode and unreliably in the visual-only mode. Conclusions Both study purposes were met. Results support continued emphasis on fundamental frequency and vocal tract resonance in voice and communication modification training with transgender individuals and provide evidence for the potential benefit of modifying sound pressure level, especially when a masculine presentation is desired.


Animals ◽  
2018 ◽  
Vol 8 (10) ◽  
pp. 167 ◽  
Author(s):  
Anton Baotic ◽  
Maxime Garcia ◽  
Markus Boeckle ◽  
Angela Stoeger

African savanna elephants live in dynamic fission–fusion societies and exhibit a sophisticated vocal communication system. Their most frequent call-type is the ‘rumble’, with a fundamental frequency (which refers to the lowest vocal fold vibration rate when producing a vocalization) near or in the infrasonic range. Rumbles are used in a wide variety of behavioral contexts, for short- and long-distance communication, and convey contextual and physical information. For example, maturity (age and size) is encoded in male rumbles by formant frequencies (the resonance frequencies of the vocal tract), which have the most informative power. As sound propagates, however, its spectral and temporal structures degrade progressively. Our study used manipulated and resynthesized male social rumbles to simulate large and small individuals (based on different formant values) to quantify whether this phenotypic information efficiently transmits over long distances. To examine transmission efficiency and the potential influences of ecological factors, we broadcast and re-recorded rumbles at distances of up to 1.5 km in two different habitats at the Addo Elephant National Park, South Africa. Our results show that rumbles were affected by spectral–temporal degradation over distance. Interestingly and unlike previous findings, the transmission of formants was better than that of the fundamental frequency. Our findings demonstrate the importance of formant frequencies for the efficiency of rumble propagation and the transmission of information content in a savanna elephant’s natural habitat.


1992 ◽  
Vol 35 (4) ◽  
pp. 761-768 ◽  
Author(s):  
Petra Zwirner ◽  
Gary J. Barnes

Acoustic analyses of upper airway and phonatory stability were conducted on samples of sustained phonation to evaluate the relation between laryngeal and articulomotor stability for 31 patients with dysarthria and 12 non-dysarthric control subjects. Significantly higher values were found for the variability in fundamental frequency and formant frequency of patients who have Huntington’s disease compared with normal subjects and patients with Parkinson’s disease. No significant correlations were found between formant frequency variability and the variability of the fundamental frequency for any subject group. These findings are discussed as they pertain to the relationship between phonatory and upper airway subsystems and the evaluation of vocal tract motor control impairments in dysarthria.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Kerry MM Walker ◽  
Ray Gonzalez ◽  
Joe Z Kang ◽  
Josh H McDermott ◽  
Andrew J King

Pitch perception is critical for recognizing speech, music and animal vocalizations, but its neurobiological basis remains unsettled, in part because of divergent results across species. We investigated whether species-specific differences exist in the cues used to perceive pitch and whether these can be accounted for by differences in the auditory periphery. Ferrets accurately generalized pitch discriminations to untrained stimuli whenever temporal envelope cues were robust in the probe sounds, but not when resolved harmonics were the main available cue. By contrast, human listeners exhibited the opposite pattern of results on an analogous task, consistent with previous studies. Simulated cochlear responses in the two species suggest that differences in the relative salience of the two pitch cues can be attributed to differences in cochlear filter bandwidths. The results support the view that cross-species variation in pitch perception reflects the constraints of estimating a sound’s fundamental frequency given species-specific cochlear tuning.


2020 ◽  
Author(s):  
Frank Russo ◽  
Dominique T Vuvan ◽  
William Forde Thompson

Note-to-note changes in brightness are able to influence the perception of interval size. Changes that are congruent with pitch tend to expand interval size, whereas changes that are incongruent tend to contract. In the case of singing, brightness of notes can vary as a function of vowel content. In the present study, we investigated whether note-to-note changes in brightness arising from vowel content influence perception of relative pitch. In Experiment 1, three-note sequences were synthesized so that they varied with regard to the brightness of vowels from note to note. As expected, brightness influenced judgments of interval size. Changes in brightness that were congruent with changes in pitch led to an expansion of perceived interval size. A follow-up experiment confirmed that the results of Experiment 1 were not due to pitch distortions. In Experiment 2, the final note of three-note sequences was removed, and participants were asked to make speeded judgments of the pitch contour. An analysis of response times revealed that brightness of vowels influenced contour judgments. Changes in brightness that were congruent with changes in pitch led to faster response times than did incongruent changes. These findings show that the brightness of vowels yields an extra-pitch influence on the perception of relative pitch in song.


Author(s):  
Radhika Rani L ◽  
S. Chandra lingam ◽  
Anjaneyulu T ◽  
Satyanarayana K

Congenital heart defects (CHD) are critical heart disorders that can be observed in infants at birth. They are classified mainly into two types, cyanotic and acyanotic. The present paper concentrates on acyanotic heart disorders. An acyanotic heart disorder cannot be observed on external checkup, whereas bluish skin indicates that an infant is affected by a cyanotic disorder. An acyanotic heart disorder can only be diagnosed using chest X-ray, ECG, echocardiogram, cardiac catheterization, and MRI of the heart. The present work aims at estimating the fundamental frequency (pitch) and the vocal tract resonant frequencies (formants) from the cry signal of infants. The pitch frequency and formant frequencies are estimated using frequency-domain (cepstrum) and linear predictive coding (LPC) methods. The results show that the fundamental frequency of the cry signal was between 600 and 800 Hz for infants with acyanotic heart disorders. This fundamental frequency helps in identifying acyanotic heart disorders at an early stage.
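As a rough illustration of cepstrum-based pitch estimation in general, the sketch below takes the log magnitude spectrum of a windowed frame, transforms it back, and reads the fundamental from the largest cepstral peak within a plausible pitch range. It is not the authors' implementation and uses no cry recordings: the test signal is a synthetic 640 Hz harmonic tone, and the function names, the 16 kHz sample rate, the 1024-sample frame, and the 400 to 1000 Hz search range are all assumptions made for the example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def real_cepstrum(frame):
    """Inverse transform of the log magnitude spectrum of a real frame."""
    n = len(frame)
    log_mag = [math.log(abs(c) + 1e-12) for c in fft(frame)]
    # log_mag is real and even, so its forward FFT divided by n
    # equals its inverse FFT.
    return [c.real / n for c in fft(log_mag)]


def cepstral_f0(frame, sample_rate, f0_min=400.0, f0_max=1000.0):
    """Estimate f0 from the largest cepstral peak in the allowed lag range."""
    ceps = real_cepstrum(frame)
    q_lo = int(sample_rate / f0_max)
    q_hi = int(sample_rate / f0_min)
    peak = max(range(q_lo, q_hi + 1), key=lambda q: ceps[q])
    return sample_rate / peak


def synthetic_tone(f0, sample_rate, n_samples, n_harmonics):
    """Hamming-windowed sum of equal-amplitude harmonics (not cry data)."""
    frame = []
    for i in range(n_samples):
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (n_samples - 1))
        sample = sum(
            math.sin(2 * math.pi * f0 * h * i / sample_rate)
            for h in range(1, n_harmonics + 1)
        )
        frame.append(window * sample)
    return frame


frame = synthetic_tone(640.0, 16000, 1024, 12)
estimated_f0 = cepstral_f0(frame, 16000)
```

Restricting the search to a bounded lag range is a common way to avoid picking a multiple of the true period. Real cry signals would also need framing, voicing detection, and smoothing, none of which is shown here.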


Author(s):  
Joseph D Wagner ◽  
Alice Gelman ◽  
Kenneth E. Hancock ◽  
Yoojin Chung ◽  
Bertrand Delgutte

The pitch of harmonic complex tones (HCT) common in speech, music and animal vocalizations plays a key role in the perceptual organization of sound. Unraveling the neural mechanisms of pitch perception requires animal models but little is known about complex pitch perception by animals, and some species appear to use different pitch mechanisms than humans. Here, we tested rabbits' ability to discriminate the fundamental frequency (F0) of HCTs with missing fundamentals using a behavioral paradigm inspired by foraging behavior in which rabbits learned to harness a spatial gradient in F0 to find the location of a virtual target within a room for a food reward. Rabbits were initially trained to discriminate HCTs with F0s in the range 400-800 Hz and with harmonics covering a wide frequency range (800-16,000 Hz), and then tested with stimuli differing either in spectral composition to test the role of harmonic resolvability (Experiment 1), or in F0 range (Experiment 2), or both F0 and spectral content (Experiment 3). Together, these experiments show that rabbits can discriminate HCTs over a wide F0 range (200-1600 Hz) encompassing the range of conspecific vocalizations, and can use either the spectral pattern of harmonics resolved by the cochlea for higher F0s or temporal envelope cues resulting from interaction between unresolved harmonics for lower F0s. The qualitative similarity of these results to human performance supports using rabbits as an animal model for studies of pitch mechanisms providing species differences in cochlear frequency selectivity and F0 range of vocalizations are taken into account.
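The idea of a harmonic complex tone with a missing fundamental can be made concrete with a short sketch. The tone contains harmonics 2 through 10 of a 400 Hz F0 but no energy at 400 Hz itself, yet its waveform still repeats every 1/F0 seconds, which a simple autocorrelation picks up. This is a generic signal-processing illustration, not a model of rabbit or human hearing and not the authors' stimulus code. The function names, the 16 kHz sample rate, and the 250 to 800 Hz search range are assumptions.

```python
import math


def missing_fundamental_tone(f0, sample_rate, n_samples, harmonics=range(2, 11)):
    """Sum of equal-amplitude harmonics of f0, omitting the component at f0."""
    return [
        sum(math.sin(2 * math.pi * f0 * h * i / sample_rate) for h in harmonics)
        for i in range(n_samples)
    ]


def autocorrelation_f0(signal, sample_rate, f0_min=250.0, f0_max=800.0):
    """Return sample_rate / lag for the lag with the largest autocorrelation."""
    lag_lo = int(sample_rate / f0_max)
    lag_hi = int(sample_rate / f0_min)

    def acf(lag):
        return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))

    best_lag = max(range(lag_lo, lag_hi + 1), key=acf)
    return sample_rate / best_lag


tone = missing_fundamental_tone(400.0, 16000, 800)
estimated_f0 = autocorrelation_f0(tone, 16000)

# Projection of the tone onto a 400 Hz sinusoid: close to zero because
# the fundamental component is absent from the spectrum.
energy_at_f0 = sum(
    tone[i] * math.sin(2 * math.pi * 400.0 * i / 16000) for i in range(800)
)
```

The estimate recovers the F0 even though `energy_at_f0` is essentially zero, which is the sense in which the pitch of such a tone does not depend on a physical component at the fundamental.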


2019 ◽  
Vol 37 (1) ◽  
pp. 57-65 ◽  
Author(s):  
Frank A. Russo ◽  
Dominique T. Vuvan ◽  
William Forde Thompson

Note-to-note changes in brightness are able to influence the perception of interval size. Changes that are congruent with pitch tend to expand interval size, whereas changes that are incongruent tend to contract. In the case of singing, brightness of notes can vary as a function of vowel content. In the present study, we investigated whether note-to-note changes in brightness arising from vowel content influence perception of relative pitch. In Experiment 1, three-note sequences were synthesized so that they varied with regard to the brightness of vowels from note to note. As expected, brightness influenced judgments of interval size. Changes in brightness that were congruent with changes in pitch led to an expansion of perceived interval size. A follow-up experiment confirmed that the results of Experiment 1 were not due to pitch distortions. In Experiment 2, the final note of three-note sequences was removed, and participants were asked to make speeded judgments of the pitch contour. An analysis of response times revealed that brightness of vowels influenced contour judgments. Changes in brightness that were congruent with changes in pitch led to faster response times than did incongruent changes. These findings show that the brightness of vowels yields an extra-pitch influence on the perception of relative pitch in song.


OTO Open ◽  
2019 ◽  
Vol 3 (3) ◽  
pp. 2473974X1986638
Author(s):  
Jacob I. Tower ◽  
Lynn Acton ◽  
Jessica Wolf ◽  
Walton Wilson ◽  
Nwanmegha Young

Objective The purpose of this study was to investigate the effect of vocal training on acoustic and aerodynamic characteristics of student actors’ voices. Study Design Prospective cohort study. Setting Tertiary medical facility speech and swallow center. Subjects and Methods Acoustic, aerodynamic, and Voice Handicap Index–10 measures were collected from 14 first-year graduate-level drama students before and after a standard vocal training program and analyzed for changes over time. Results Among the aerodynamic measures that were collected, mean expiratory airflow was significantly reduced after vocal training. Among the acoustic measures that were collected, mean fundamental frequency was significantly increased after vocal training. On average, Voice Handicap Index–10 scores were unchanged after vocal training. Conclusion The cohort of drama students undergoing vocal training demonstrated improvements in voice aerodynamics, which indicate enhanced glottal efficiency after training. The present study also found an increased average fundamental frequency among the actors during sustained voicing and no changes in jitter and shimmer despite frequent performance.


NeuroImage ◽  
2019 ◽  
Vol 200 ◽  
pp. 132-141 ◽  
Author(s):  
Simon Leipold ◽  
Marielle Greber ◽  
Silvano Sele ◽  
Lutz Jäncke
