Brain functions are increasingly explored using data-driven methods, which make it possible to work with very large datasets collected in relatively natural experimental settings. However, like hypothesis-driven approaches, data-driven methods are not without drawbacks, and they pose interpretation problems, particularly in cognitive domains such as speech and language, where temporal processing is a key component. Whereas hypothesis-driven methods explicitly address speech processing as a hierarchical system, data-driven approaches probe speech processing as a system that can flexibly combine multiple, distributed features. Given the disparity of the available methods and their underlying concepts, synthesizing the results of hypothesis-driven and data-driven experiments represents a substantial challenge. Drawing on a number of influential examples from the recent speech and language literature, we unpack the advantages and limitations of both approaches and highlight ways in which they can be fruitfully combined: for example, by using time-resolved analyses, by applying specific models at each level of information transformation, or, more generally, by complementing data-driven, exploratory approaches with analysis methods that interrogate the data within more constrained model spaces.