Neural correlates of the pitch of complex tones. I. Pitch and pitch salience

1996 ◽  
Vol 76 (3) ◽  
pp. 1698-1716 ◽  
Author(s):  
P. A. Cariani ◽  
B. Delgutte

1. The temporal discharge patterns of auditory nerve fibers in Dial-anesthetized cats were studied in response to periodic complex acoustic waveforms that evoke pitches at their fundamental frequencies. Single-formant vowels, amplitude-modulated (AM) and quasi-frequency-modulated tones, AM noise, click trains, and other complex tones were utilized. Distributions of intervals between successive spikes ("first-order intervals") and between both successive and nonsuccessive spikes ("all-order intervals") were computed from spike trains. Intervals from many fibers were pooled to estimate interspike interval distributions for the entire auditory nerve. Properties of these "pooled interspike interval distributions," such as the positions of interval peaks and their relative heights, were examined for correspondence to the psychophysical data on pitch frequency and pitch salience. 2. For a diverse set of complex stimuli and levels, the most frequent all-order interspike interval present in the pooled distribution corresponded to the pitch heard in psychophysical experiments. Pitch estimates based on pooled interval distributions (30-85 fibers, 100 stimulus presentations per fiber) were highly accurate (within 1%) for harmonic stimuli that produce strong pitches at 60 dB SPL. 3. Although the most frequent intervals in pooled all-order interval distributions were very stable with respect to sound intensity level (40, 60, and 80 dB total SPL), this was not necessarily the case for first-order interval distributions. Because the low pitches of complex tones are largely invariant with respect to level, pitches estimated from all-order interval distributions correspond better to perception. 4. Spectrally diverse stimuli that evoke similar low pitches produce pooled interval distributions with similar most-frequent intervals. This suggests that the pitch equivalence of these different stimuli could result from central auditory processing mechanisms that analyze interspike interval patterns. 5. Complex stimuli that evoke strong or "salient" pitches produce pooled interval distributions with high peak-to-mean ratios. Those stimuli that evoke weak pitches produce pooled interval distributions with low peak-to-mean ratios. 6. Pooled interspike interval distributions for stimuli consisting of low-frequency components generally resembled the short-time autocorrelation function of stimulus waveforms. Pooled interval distributions for stimuli consisting of high-frequency components resembled the short-time autocorrelation function of the waveform envelope. 7. Interval distributions in populations of neurons constitute a general, distributed means of encoding, transmitting, and representing information. Existence of a central processor capable of analyzing these interval patterns could provide a unified explanation for many different aspects of pitch perception.
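The pooled all-order interval analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.1-ms bin width, the 1-8 ms search window, and the synthetic phase-locked spike generator are all assumptions chosen for illustration. The sketch pools all-order intervals across fibers, takes the most frequent interval as the pitch period, and uses the peak-to-mean ratio as a rough salience index.

```python
import numpy as np

def all_order_intervals(spike_times, max_interval):
    """All positive intervals between pairs of spikes (successive or not)."""
    t = np.sort(np.asarray(spike_times, dtype=float))
    diffs = t[None, :] - t[:, None]  # diffs[i, j] = t[j] - t[i]
    return diffs[(diffs > 0) & (diffs <= max_interval)]

def pooled_interval_histogram(spike_trains, max_interval=0.008, bin_width=1e-4):
    """Sum the all-order interval histograms of many fibers (times in seconds)."""
    n_bins = int(round(max_interval / bin_width))
    edges = np.linspace(0.0, max_interval, n_bins + 1)
    counts = np.zeros(n_bins)
    for train in spike_trains:
        intervals = all_order_intervals(train, max_interval)
        counts += np.histogram(intervals, bins=edges)[0]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts

def estimate_pitch(centers, counts, min_interval=1e-3):
    """Pitch = 1 / most frequent interval; salience index = peak-to-mean ratio."""
    valid = centers >= min_interval  # assumed lower bound on the search window
    peak = int(np.argmax(np.where(valid, counts, -1.0)))
    salience = counts[peak] / counts[valid].mean()
    return 1.0 / centers[peak], salience

def synthetic_trains(f0, n_fibers=40, duration=0.5, p_fire=0.5, jitter=1e-4, seed=0):
    """Hypothetical test data: each fiber fires on a random subset of stimulus
    cycles with Gaussian timing jitter (a crude stand-in for phase locking)."""
    rng = np.random.default_rng(seed)
    cycle_starts = np.arange(0.0, duration, 1.0 / f0)
    trains = []
    for _ in range(n_fibers):
        fired = rng.random(cycle_starts.size) < p_fire
        trains.append(cycle_starts[fired] + rng.normal(0.0, jitter, fired.sum()))
    return trains
```

Calling `estimate_pitch(*pooled_interval_histogram(synthetic_trains(200.0)))` returns a pitch estimate near 200 Hz for this synthetic input. The 8-ms window is deliberately shorter than two stimulus periods here; with a longer window, peaks also appear at multiples of the period, and this sketch makes no attempt to choose among them.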

1996 ◽  
Vol 76 (3) ◽  
pp. 1717-1734 ◽  
Author(s):  
P. A. Cariani ◽  
B. Delgutte

1. The neural correlates of low pitches produced by complex tones were studied by analyzing temporal discharge patterns of auditory nerve fibers in Dial-anesthetized cats. In the previous paper it was observed that, for harmonic stimuli, the most frequent interspike interval present in the population of auditory nerve fibers always corresponded to the perceived pitch (predominant interval hypothesis). The fraction of these most frequent intervals relative to the total number of intervals qualitatively corresponded to strength (salience) of the low pitches that are heard. 2. This paper addresses the neural correlates of stimuli that produce more complex patterns of pitch judgments, such as shifts in pitch and multiple pitches. Correlates of pitch shift and pitch ambiguity were investigated with the use of harmonic and inharmonic amplitude-modulated (AM) tones varying either in carrier frequency or modulation frequency. Pitches estimated from the pooled interval distributions showed shifts corresponding to "the first effect of pitch shift" (de Boer's rule) that is observed psychophysically. Pooled interval distributions in response to inharmonic stimulus segments showed multiple maxima corresponding to the multiple pitches heard by human listeners (pitch ambiguity). 3. AM and quasi-frequency-modulated tones with low carrier frequencies produce very similar patterns of pitch judgments, despite great differences in their phase spectra and waveform envelopes. Pitches estimated from pooled interval distributions were remarkably similar for the two kinds of stimuli, consistent with the psychophysically observed phase invariance of pitches produced by sets of low-frequency components. 4. Trains of clicks having uniform and alternating polarities were used to investigate the relation between pitches associated with periodicity and those associated with click rate. For unipolar click trains, where periodicity and rate coincide, physiologically estimated pitches closely followed the fundamental period. This corresponds to the pitch at the fundamental frequency (F0) that is heard. For alternating click trains, where rate and periodicity do not coincide, physiologically estimated pitches always closely followed the fundamental period. Although these pitch estimates corresponded to periodicity pitches that are heard for F0s > 150 Hz, they did not correspond to the rate pitches that are heard for F0s < 150 Hz. The predominant interval hypothesis thus failed to predict rate pitch. 5. When alternating-polarity click trains are high-pass filtered, rate pitches are strengthened and can also be heard at F0s > 150 Hz. Pitches for high-pass-filtered alternating click trains were estimated from pooled responses of fibers with characteristic frequencies (CFs) > 2 kHz. Roughly equal numbers of intervals at 1/rate and 1/F0 were found for all F0s studied, from 80 to 160 Hz, producing pitch estimates consistent with the rate pitches that are heard after high-pass filtering. The existence region for rate pitch also coincided with the presence of clear periodicities related to the click rate in pooled peristimulus time histograms. These periodicities were strongest for ensembles of fibers with CFs > 2 kHz, where there is widespread synchrony of discharges across many fibers. 6. The "dominance region for pitch" was studied with the use of two harmonic complexes consisting of harmonics 3-5 of one F0 and harmonics 6-12 of another fundamental 20% higher in frequency. When the complexes were presented individually, pitch estimates were always close to the fundamental of the complex. When the complexes were presented concurrently, pitch estimates always followed the fundamental of harmonics 3-5 for F0s of 150-480 Hz. For F0s of 125-150 Hz, pitch estimates followed one or the other fundamental, and for F0s < 125 Hz, pitch estimates followed the fundamental of harmonics 6-12. (ABSTRACT TRUNCATED)
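The "first effect of pitch shift" mentioned in point 2 can be made concrete with a short worked sketch. It assumes the textbook form of de Boer's rule: for an AM tone with carrier fc and modulation frequency fm, the predicted pitch is fc/n, where n is the integer nearest fc/fm, and neighboring integers give the alternative pitches heard in ambiguous cases. The function names and the example frequencies are illustrative and do not come from the paper.

```python
def first_effect_pitch(carrier_hz, modulation_hz):
    """Predicted pitch fc / n, with n the integer nearest fc / fm.

    Note: Python's round() resolves exact .5 ties to the even integer; those
    are the maximally ambiguous cases, where no single pitch is expected."""
    n = max(1, round(carrier_hz / modulation_hz))
    return carrier_hz / n

def candidate_pitches(carrier_hz, modulation_hz, n_neighbors=1):
    """Map each nearby integer n to its candidate pitch fc / n (pitch ambiguity)."""
    n_center = max(1, round(carrier_hz / modulation_hz))
    ns = range(n_center - n_neighbors, n_center + n_neighbors + 1)
    return {n: carrier_hz / n for n in ns if n >= 1}
```

For a harmonic AM tone with fc = 2000 Hz and fm = 200 Hz, n = 10 and the predicted pitch is 200 Hz. Shifting the carrier to 2030 Hz while keeping fm fixed makes the tone inharmonic and moves the prediction to 2030/10 = 203 Hz, with neighboring candidates at 2030/9 and 2030/11.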


2009 ◽  
Vol 101 (6) ◽  
pp. 3169-3191 ◽  
Author(s):  
Heinrich Neubauer ◽  
Christine Köppl ◽  
Peter Heil

In vertebrate auditory systems, the conversion from graded receptor potentials across the hair-cell membrane into stochastic spike trains of the auditory nerve (AN) fibers is performed by ribbon synapses. The statistics underlying this process constrain auditory coding but are not precisely known. Here, we examine the distributions of interspike intervals (ISIs) from spontaneous activity of AN fibers of the barn owl (Tyto alba), a nocturnal avian predator whose auditory system is specialized for precise temporal coding. The spontaneous activity of AN fibers, with the exception of those showing preferred intervals, is commonly thought to result from excitatory events generated by a homogeneous Poisson point process, which lead to spikes unless the fiber is refractory. We show that the ISI distributions in the owl are better explained as resulting from the action of a brief refractory period (∼0.5 ms) on excitatory events generated by a homogeneous stochastic process where the distribution of interevent intervals is a mixture of an exponential and a gamma distribution with shape factor 2, both with the same scaling parameter. The same model was previously shown to apply to AN fibers in the cat. However, the mean proportions of exponentially versus gamma-distributed intervals in the mixture were different for cat and owl. Furthermore, those proportions were constant across fibers in the cat, whereas they covaried with mean spontaneous rate and with characteristic frequency in the owl. We hypothesize that in birds, unlike in mammals, more than one ribbon may provide excitation to most fibers, accounting for the different proportions, and that variation in the number of ribbons may underlie the variation in the proportions.
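A minimal simulation sketch of the kind of model described above may help fix ideas. It draws interevent intervals from a mixture of an exponential and a shape-2 gamma distribution sharing one scale parameter, then discards events that fall within a dead time of the preceding spike. Treating the refractory period as a purely absolute dead time is a simplifying assumption, and the mixing proportion, scale value, and function name are hypothetical rather than fitted values from the paper.

```python
import numpy as np

def simulate_isis(n_events, p_exp, scale, dead_time=0.5e-3, seed=0):
    """Simulate interspike intervals (seconds) under a mixture-plus-dead-time model.

    p_exp : assumed proportion of exponentially distributed interevent intervals
    scale : common scale parameter of the exponential and gamma(shape=2) parts
    """
    rng = np.random.default_rng(seed)
    use_exp = rng.random(n_events) < p_exp
    intervals = np.where(
        use_exp,
        rng.exponential(scale, n_events),
        rng.gamma(2.0, scale, n_events),
    )
    event_times = np.cumsum(intervals)

    # An excitatory event produces a spike unless the fiber is still refractory.
    spikes = []
    last_spike = -np.inf
    for t in event_times:
        if t - last_spike >= dead_time:
            spikes.append(t)
            last_spike = t
    return np.diff(spikes)
```

With `p_exp = 0.5` and `scale = 0.01`, the mean interevent interval before refractoriness is `scale * (p_exp + 2 * (1 - p_exp))`, that is, 15 ms; the dead time removes the shortest intervals and lengthens the mean ISI slightly.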


2003 ◽  
Vol 182 (1-2) ◽  
pp. 130-139 ◽  
Author(s):  
Donal G Sinex ◽  
Heidi Guzik ◽  
Hongzhe Li ◽  
Jennifer Henderson Sabes

2008 ◽  
Vol 100 (3) ◽  
pp. 1301-1319 ◽  
Author(s):  
Erik Larsen ◽  
Leonardo Cedolin ◽  
Bertrand Delgutte

Pitch differences between concurrent sounds are important cues used in auditory scene analysis and also play a major role in music perception. To investigate the neural codes underlying these perceptual abilities, we recorded from single fibers in the cat auditory nerve in response to two concurrent harmonic complex tones with missing fundamentals and equal-amplitude harmonics. We investigated the efficacy of rate-place and interspike-interval codes to represent both pitches of the two tones, which had fundamental frequency (F0) ratios of 15/14 or 11/9. We relied on the principle of scaling invariance in cochlear mechanics to infer the spatiotemporal response patterns to a given stimulus from a series of measurements made in a single fiber as a function of F0. Templates created by a peripheral auditory model were used to estimate the F0s of double complex tones from the inferred distribution of firing rate along the tonotopic axis. This rate-place representation was accurate for F0s ≳900 Hz. Surprisingly, rate-based F0 estimates were accurate even when the two-tone mixture contained no resolved harmonics, so long as some harmonics were resolved prior to mixing. We also extended methods used previously for single complex tones to estimate the F0s of concurrent complex tones from interspike-interval distributions pooled over the tonotopic axis. The interval-based representation was accurate for F0s ≲900 Hz, where the two-tone mixture contained no resolved harmonics. Together, the rate-place and interval-based representations allow accurate pitch perception for concurrent sounds over the entire range of human voice and cat vocalizations.


2005 ◽  
Vol 94 (1) ◽  
pp. 347-362 ◽  
Author(s):  
Leonardo Cedolin ◽  
Bertrand Delgutte

Harmonic complex tones elicit a pitch sensation at their fundamental frequency (F0), even when their spectrum contains no energy at F0, a phenomenon known as “pitch of the missing fundamental.” The strength of this pitch percept depends upon the degree to which individual harmonics are spaced sufficiently apart to be “resolved” by the mechanical frequency analysis in the cochlea. We investigated the resolvability of harmonics of missing-fundamental complex tones in the auditory nerve (AN) of anesthetized cats at low and moderate stimulus levels and compared the effectiveness of two representations of pitch over a much wider range of F0s (110–3,520 Hz) than in previous studies. We found that individual harmonics are increasingly well resolved in rate responses of AN fibers as the characteristic frequency (CF) increases. We obtained rate-based estimates of pitch dependent upon harmonic resolvability by matching harmonic templates to profiles of average discharge rate against CF. These estimates were most accurate for F0s above 400–500 Hz, where harmonics were sufficiently resolved. We also derived pitch estimates from all-order interspike-interval distributions, pooled over our entire sample of fibers. Such interval-based pitch estimates, which are dependent on phase-locking to the harmonics, were accurate for F0s below 1,300 Hz, consistent with the upper limit of the pitch of the missing fundamental in humans. The two pitch representations are complementary with respect to the F0 range over which they are effective; however, neither is entirely satisfactory in accounting for human psychophysical data.
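The rate-based estimate described above (matching harmonic templates to profiles of average discharge rate against CF) can be sketched under loose assumptions. The template below is a hypothetical stand-in for a peripheral model: Gaussian bumps at the harmonics of a candidate F0, with width proportional to CF so that higher harmonics merge and become unresolved. The best-matching F0 is the one whose template correlates most strongly with the rate profile. None of the names or parameter values come from the paper.

```python
import numpy as np

def harmonic_template(cfs, f0, n_harmonics=12, rel_bandwidth=0.1):
    """Hypothetical rate-place template: one Gaussian bump per harmonic of f0.

    The bump width scales with CF (rel_bandwidth * CF), so closely spaced
    high-order harmonics blur together, as unresolved harmonics would."""
    cfs = np.asarray(cfs, dtype=float)
    template = np.zeros_like(cfs)
    for k in range(1, n_harmonics + 1):
        template += np.exp(-0.5 * ((cfs - k * f0) / (rel_bandwidth * cfs)) ** 2)
    return template

def estimate_f0_from_rates(cfs, rates, candidate_f0s):
    """Return the candidate F0 whose template best correlates with the profile."""
    scores = [
        np.corrcoef(harmonic_template(cfs, f0), rates)[0, 1]
        for f0 in candidate_f0s
    ]
    return candidate_f0s[int(np.nanargmax(scores))]
```

As a usage example, a synthetic profile such as `50 + 100 * harmonic_template(cfs, 500.0)` over `cfs = np.geomspace(200, 8000, 200)` is recovered as 500 Hz when the candidates include that value. Real rate profiles are noisy and level dependent, which this sketch ignores.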

