Encoding of vocal pitch in the dorsal premotor cortex during multi-talker speech recognition

2021
Author(s):
Jonathan Henry Venezia
Christian Herrera
Nicole Whittle
Marjorie R. Leek
Samuel Barnes
...  

In a recent study (Venezia et al., 2021), left dorsal premotor cortex (dPM) responded to vocal pitch during a degraded speech recognition task, but only when speech was rated as unintelligible. Crucially, vocal pitch was not relevant to the task. The present fMRI study (N = 25) tests the hypothesis that left dPM will respond to vocal pitch for increasingly intelligible speech in a multi-talker speech recognition task that emphasizes pitch for talker segregation. We applied spectrotemporal modulation distortion to independently modulate vocal pitch and phonetic content in two-talker (male/female) utterances across two conditions (Competing, Unison), only one of which required pitch-based segregation (Competing). A Bayesian hierarchical drift-diffusion model (HDDM) was used to predict speech recognition performance (3-AFC response times, accuracy coded) from the pattern of spectrotemporal distortion imposed on each trial. The model’s drift rate parameter, a d’-like measure of speech recognition performance, was strongly associated with vocal pitch for Competing but not Unison. In a second, Bayesian hierarchical brain-behavior model, we then regressed the HDDM’s posterior predictions of trial-wise drift rate against trial-wise fMRI activation amplitude. A significant positive association with overall drift rate, reflecting contributions from vocal pitch and/or phonetic content, was observed in left dPM in both conditions. A significant positive association with ‘pitch-restricted’ drift rate, reflecting only contributions from vocal pitch, was observed in left dPM but only in the Competing condition. These findings suggest that left dPM: (i) responds to vocal pitch; and (ii) can operate in an auditory-pitch mode and a phonetic-speech mode.

Author(s):  
Omar Farooq
Sekharjit Datta

The area of speech recognition has been thoroughly researched during the past fifty years; however, robustness is still an important challenge to overcome. It has been established that there is a correlation between the speech produced and lip motion, which helps to improve recognition performance in adverse background conditions. This chapter presents the main components used in audio-visual speech recognition systems. Results of a prototype experiment conducted on audio-visual corpora for Hindi speech are reported for a simple phoneme recognition task. The chapter also addresses some of the issues related to visual feature extraction and the integration of audio-visual information, and finally presents future research directions.


Author(s):  
ESTHER LEVIN
ROBERTO PIERACCINI
ENRICO BOCCHIERI

Recently, much interest has been generated regarding speech recognition systems based on Hidden Markov Models (HMMs) and neural network (NN) hybrids. Such systems attempt to combine the best features of both models: the temporal structure of HMMs and the discriminative power of neural networks. In this work we establish one more relation between the HMM and the NN paradigms by introducing the time-warping network (TWN) that is a generalization of both an HMM-based recognizer and a backpropagation net. The basic element of such a network, a time-warping neuron, extends the operation of the formal neuron of a backpropagation network by warping the input pattern to match it optimally to its weights. We show that a single-layer network of TW neurons is equivalent to a Gaussian density HMM-based recognition system. This equivalent neural representation suggests ways to improve the discriminative power of this system by using backpropagation discriminative training, and/or by generalizing the structure of the recognizer to a multi-layer net. The performance of the proposed network was evaluated on a highly confusable, isolated word, multi-speaker recognition task. The results indicate that not only does the recognition performance improve, but the separation between classes is enhanced, allowing us to set up a rejection criterion to improve the confidence of the system.
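The idea of a neuron that warps its input to match its weights can be illustrated with a minimal sketch. It assumes a dynamic-time-warping alignment with a squared Euclidean local cost and returns the negated alignment cost as the activation; the function name `time_warping_neuron`, the allowed warping steps, and the output convention are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def time_warping_neuron(x, w):
    """Illustrative time-warping unit (assumed form, not the paper's definition).

    x: input pattern, shape (T, d), one feature frame per row
    w: weight template, shape (S, d)
    Returns the negated cost of the best monotonic alignment of x to w,
    so a better match gives a larger activation.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    T, S = len(x), len(w)
    # local[t, s]: squared distance between input frame t and weight frame s
    local = ((x[:, None, :] - w[None, :, :]) ** 2).sum(axis=-1)
    # cost[t, s]: minimum accumulated cost of aligning x[:t] with w[:s]
    cost = np.full((T + 1, S + 1), np.inf)
    cost[0, 0] = 0.0
    for t in range(1, T + 1):
        for s in range(1, S + 1):
            best_prev = min(cost[t - 1, s], cost[t, s - 1], cost[t - 1, s - 1])
            cost[t, s] = local[t - 1, s - 1] + best_prev
    return -cost[T, S]
```

For example, an input that is a time-stretched copy of the template, such as `[[0], [0], [1], [2]]` against `[[0], [1], [2]]`, aligns with zero cost, whereas an ordinary dot-product neuron would require the two sequences to have the same length.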


Author(s):  
Denis Ivanko
Dmitry Ryumin

In this paper we design end-to-end neural networks for the low-resource lip-reading task and the audio speech recognition task using 3D CNNs, pre-trained CNN weights of several state-of-the-art models (e.g. VGG19, InceptionV3, MobileNetV2) and LSTMs. We present two phrase-level speech recognition pipelines: one for lip-reading and one for acoustic speech recognition. We evaluate different combinations of front-end and back-end modules on the RUSAVIC dataset. We compare our results with the traditional 2D CNN approach and demonstrate an increase in recognition accuracy of up to 14%. Moreover, we carefully studied existing state-of-the-art models to be used for augmentation. Based on this analysis, we chose the 5 most promising model architectures and evaluated them on our own data. We tested our systems on real-world data from two different scenarios: recorded in an idling vehicle and during actual driving. Our independently trained systems demonstrated acoustic speech accuracy of up to 90% and lip-reading accuracy of up to 61%. Future work will focus on the fusion of visual and audio speech modalities and on speaker adaptation. We expect that fused multi-modal information will help to further improve recognition performance compared to a single modality. Another possible direction is to investigate different NN-based architectures to better tackle the end-to-end lip-reading task.


2019
Vol 62 (4)
pp. 1051-1067
Author(s):
Jonathan H. Venezia
Allison-Graham Martin
Gregory Hickok
Virginia M. Richards

Purpose: Age-related sensorineural hearing loss can dramatically affect speech recognition performance due to reduced audibility and suprathreshold distortion of spectrotemporal information. Normal aging produces changes within the central auditory system that impose further distortions. The goal of this study was to characterize the effects of aging and hearing loss on perceptual representations of speech. Method: We asked whether speech intelligibility is supported by different patterns of spectrotemporal modulations (STMs) in older listeners compared to young normal-hearing listeners. We recruited 3 groups of participants: 20 older hearing-impaired (OHI) listeners, 19 age-matched normal-hearing listeners, and 10 young normal-hearing (YNH) listeners. Listeners performed a speech recognition task in which randomly selected regions of the speech STM spectrum were revealed from trial to trial. The overall amount of STM information was varied using an up–down staircase to hold performance at 50% correct. Ordinal regression was used to estimate weights showing which regions of the STM spectrum were associated with good performance (a “classification image” or CImg). Results: The results indicated that (a) large-scale CImg patterns did not differ between the 3 groups; (b) weights in a small region of the CImg decreased systematically as hearing loss increased; (c) CImgs were also nonsystematically distorted in OHI listeners, and the magnitude of this distortion predicted speech recognition performance even after accounting for audibility; and (d) YNH listeners performed better overall than the older groups. Conclusion: We conclude that OHI/older normal-hearing listeners rely on the same speech STMs as YNH listeners but encode this information less efficiently. Supplemental Material: https://doi.org/10.23641/asha.7859981
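The up–down staircase mentioned in the method can be illustrated with a minimal sketch. It assumes a simple 1-up/1-down rule, which converges on the 50%-correct point: the amount of revealed information is reduced after a correct response and increased after an incorrect one. The function names, step size, and bounds are hypothetical, and the study's actual adaptive procedure may differ.

```python
def staircase_update(level, correct, step=1, lo=0, hi=100):
    """One 1-up/1-down step (assumed rule, hypothetical parameters).

    level: current amount of information revealed to the listener
    correct: whether the last response was correct
    A correct response makes the next trial harder (less information);
    an incorrect response makes it easier (more information).
    """
    level = level - step if correct else level + step
    # Keep the level inside the allowed range
    return max(lo, min(hi, level))

def run_staircase(responses, start=50, step=1):
    """Return the sequence of levels visited for a list of True/False responses."""
    levels = [start]
    for correct in responses:
        levels.append(staircase_update(levels[-1], correct, step))
    return levels
```

With responses `[True, True, False]` and a starting level of 50, the visited levels are 50, 49, 48, 49: the track moves down while the listener is correct and reverses after an error, so it oscillates around the level at which correct and incorrect responses are equally likely.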


GeroPsych
2019
Vol 32 (3)
pp. 125-134
Author(s):
Mechthild Niemann-Mirmehdi
Andreas Häusler
Paul Gellert
Johanna Nordheim

Abstract. To date, few studies have focused on perceived overprotection from the perspective of people with dementia (PwD). The present study examines perceived overprotection in PwD as an autonomy-restricting factor and thus as detrimental to their mental well-being. Cross-sectional data from the prospective DYADEM study of 82 patient/partner dyads (mean age = 74.26) were used to investigate the association between overprotection, perceived stress, depression, and quality of life (QoL). The analyses show that an overprotective contact style with PwD has a significant positive association with stress and depression, and a negative association with QoL. The results emphasize the importance of avoiding an overprotective care style and supporting patient autonomy.


2019
Vol 64 (1)
pp. 5-15
Author(s):
Christos Kollias
Panayiotis Tzeremes

Abstract. The economic and social drivers of democratisation and the emergence and establishment of democratic institutions are longstanding themes of academic discourse. Within this broad body of literature, it has been argued that the process of urbanisation is also conducive to the emergence and consolidation of democracy through a number of different channels. Cities offer better access to education and facilitate organised public action and the demand for more democratic rule and respect for human rights. The nexus between urbanisation and human rights is the theme taken up in the present paper. Using a sample of 123 countries for the period 1981–2011, the paper examines empirically the association between urbanisation and human empowerment using the Cingranelli-Richards Index. In broad terms, the findings reported herein do not point to a strong nexus across all income groups. Nevertheless, there is evidence suggesting a statistically significant positive association in specific cases.


2020
Vol 15 (6)
pp. 1061-1082
Author(s):
Merve Acar
Hüseyin Temiz

Purpose: The purpose of this study is to investigate the association between environmental performance of firms and the level of voluntary environmental disclosure in emerging markets. Design/methodology/approach: We used tobit regression, OLS, and t-test methods to reveal the association between environmental performance and the level of voluntary environmental disclosure. Findings: We find a significant positive association between the level of discretionary environmental disclosures and corporate environmental performance. The result is in line with economics disclosure theory, which argues that good environmental performers disclose more. Practical implications: Many of the environmentally good firms in Turkey are also listed in the “BIST Sustainability Index,” and this situation can be the result of the relative power of external regulations. Accordingly, it can be suggested to increase community and governmental pressure for environmental reporting, while also giving importance to increasing companies' intrinsic motivation to engage in disclosure practices. Originality/value: This study sheds light on the relation between environmental performance and environmental disclosure in an emerging market context. It also revisits the relation between environmental performance and the level of environmental disclosure by testing two different predictions on the level of environmental disclosures.


Medicina
2019
Vol 55 (8)
pp. 458
Author(s):
Bonanni
Gualtieri
Lester
Falcone
Nardella
...  

Background and Objectives: At present, data collected from the literature about suicide and anhedonia are controversial. Some studies have shown that low levels of anhedonia are associated with serious suicide attempts and death by suicide, while other studies have shown that high levels of anhedonia are associated with suicide. Materials and Methods: For this review, we searched PubMed, Medline, and ScienceDirect for clinical studies published from 1 January 1990 to 31 December 2018 with the following search terms used in the title or in the abstract: “anhedonia AND suicid*.” We obtained a total of 155 articles; 133 items were excluded using specific exclusion criteria, the remaining 22 articles included were divided into six groups based on the psychiatric diagnosis: mood disorders, schizophrenia spectrum disorders, post-traumatic stress disorder (PTSD), other diagnoses, attempted suicides, and others (healthy subjects). Results: The results of this review reveal inconsistencies. Some studies reported that high anhedonia scores were associated with suicidal behavior (regardless of the diagnosis), while other studies found that low anhedonia scores were associated with suicidal behavior, and a few studies reported no association. The most consistent association between anhedonia and suicidal behavior was found for affective disorders (7 of 7 studies reported a significant positive association) and for PTSD (3 of 3 studies reported a positive association). In the two studies of patients with schizophrenia, one found no association, and one found a negative association. For patients who attempted suicide (undiagnosed), one study found a positive association, one a positive association only for depressed attempters, and one a negative association. Conclusions: We found the most consistent positive association for patients with affective disorders and PTSD, indicating that the assessment of anhedonia may be useful in the evaluation of suicidal risk.


Cancers
2021
Vol 13 (9)
pp. 2169
Author(s):
Georgia Karpathiou
Maroa Dridi
Lila Krebs-Drouot
François Vassal
Emmanuel Jouanneau
...  

Chordomas are notably resistant to chemotherapy. One of the cytoprotective mechanisms implicated in chemoresistance is autophagy. There are indirect data that autophagy could be implicated in chordomas, but its presence has not been studied in chordoma tissues. Sixty-one (61) chordomas were immunohistochemically studied for autophagic markers and their expression was compared with the expression in notochords, clinicopathological data, as well as the tumor immune microenvironment. All chordomas strongly and diffusely expressed cytoplasmic p62 (sequestosome 1, SQSTM1/p62), whereas 16 (26.2%) tumors also showed nuclear p62 expression. LC3B (Microtubule-associated protein 1A/1B-light chain 3B) tumor cell expression was found in 44 (72.1%) tumors. Autophagy-related 16‑like 1 (ATG16L1) was also expressed by most tumors. All tumors expressed mannose-6-phosphate/insulin-like growth factor 2 receptor (M6PR/IGF2R). LC3B tumor cell expression was negatively associated with tumor size, while no other parameters, such as age, sex, localization, or survival, were associated with the immunohistochemical factors studied. LC3B immune cell expression showed a significant positive association with programmed death-ligand 1 (PD-L1)+ immune cells and with a higher vascular density. ATG16L1 expression was also positively associated with higher vascular density. Notochords (n = 5) showed different immunostaining with a very weak LC3B and M6PR expression, and no p62 expression. In contrast to normal notochords, autophagic factors such as LC3B and ATG16L1 are often present in chordomas, associated with a strong and diffuse expression of p62, suggesting a blocked autophagic flow. Furthermore, PD-L1+ immune cells also express LC3B, suggesting the need for further investigations between autophagy and the immune microenvironment.


Sensors
2021
Vol 21 (3)
pp. 1007
Author(s):
Chi Xu
Yunkai Jiang
Jun Zhou
Yi Liu

Hand gesture recognition and hand pose estimation are two closely correlated tasks. In this paper, we propose a deep-learning-based approach that jointly learns an intermediate-level shared feature for these two tasks, so that the hand gesture recognition task can benefit from the hand pose estimation task. In the training process, a semi-supervised training scheme is designed to address the lack of proper annotation. Our approach detects the foreground hand, recognizes the hand gesture, and estimates the corresponding 3D hand pose simultaneously. To evaluate the hand gesture recognition performance of state-of-the-art methods, we propose a challenging hand gesture recognition dataset collected in unconstrained environments. Experimental results show that the gesture recognition accuracy of our approach is significantly boosted by leveraging the knowledge learned from the hand pose estimation task.

