analysis by synthesis
Recently Published Documents


TOTAL DOCUMENTS: 219 (five years: 11)

H-INDEX: 18 (five years: 2)

2021 ◽  
Vol 11 (5) ◽  
pp. 1990
Author(s):  
Vinod Devaraj ◽  
Philipp Aichinger

The characterization of voice quality is important for the diagnosis of voice disorders. Vocal fry is a voice quality traditionally characterized by a low frequency and a long closed phase of the glottis. However, we also observed amplitude-modulated vocal fry glottal area waveforms (GAWs) without long closed phases (the positive group), which we modelled using an analysis-by-synthesis approach. Both natural and synthetic GAWs are modelled. The negative group consists of euphonic, i.e., normophonic, GAWs. The analysis-by-synthesis approach fits two modelled GAWs to each input GAW: one modelled GAW is modulated to replicate the amplitude and frequency modulations of the input GAW, and the other is unmodulated. The modelling errors of the two modelled GAWs are used to classify the GAWs into the positive and negative groups with a simple support vector machine (SVM) classifier with a linear kernel. For all vocal fry GAWs, the modelling errors obtained with the modulated model are smaller than those obtained with the unmodulated model. Using the two modelling errors as predictors for classification, no false positives or false negatives are obtained. To further distinguish subtypes of amplitude-modulated vocal fry GAWs, the entropy of the modulator's power spectral density and the modulator-to-carrier frequency ratio are obtained.
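The two-model fitting step described in this abstract can be sketched in miniature: fit an unmodulated carrier model and an amplitude-modulated model to a waveform by least squares, then compare the residual errors. This is a hypothetical simplification, not the authors' implementation; the paper additionally models frequency modulation and feeds the two errors to a linear-kernel SVM, for which a direct error comparison stands in here.

```python
import math

def synth_gaw(n, fc, fm=None, depth=0.0):
    """Toy glottal area waveform: a sinusoidal carrier at fc cycles per
    window, optionally amplitude-modulated at fm with the given depth."""
    x = []
    for i in range(n):
        t = i / n
        env = 1.0 + (depth * math.sin(2 * math.pi * fm * t) if fm else 0.0)
        x.append(env * math.sin(2 * math.pi * fc * t))
    return x

def fit_errors(x, fc, fm):
    """Least-squares fit of an unmodulated model a*c(t) and a modulated
    model (a + b*sin(2*pi*fm*t))*c(t); return both RMS modelling errors."""
    n = len(x)
    c = [math.sin(2 * math.pi * fc * i / n) for i in range(n)]  # carrier
    g = [ci * math.sin(2 * math.pi * fm * i / n)                # modulated
         for i, ci in enumerate(c)]                             # regressor
    # Unmodulated model: single regressor c, closed-form amplitude.
    a0 = sum(xi * ci for xi, ci in zip(x, c)) / sum(ci * ci for ci in c)
    e_unmod = math.sqrt(sum((xi - a0 * ci) ** 2
                            for xi, ci in zip(x, c)) / n)
    # Modulated model: two regressors (c, g), solve 2x2 normal equations.
    s_cc = sum(ci * ci for ci in c)
    s_cg = sum(ci * gi for ci, gi in zip(c, g))
    s_gg = sum(gi * gi for gi in g)
    s_xc = sum(xi * ci for xi, ci in zip(x, c))
    s_xg = sum(xi * gi for xi, gi in zip(x, g))
    det = s_cc * s_gg - s_cg * s_cg
    a = (s_xc * s_gg - s_xg * s_cg) / det
    b = (s_cc * s_xg - s_cg * s_xc) / det
    e_mod = math.sqrt(sum((xi - a * ci - b * gi) ** 2
                          for xi, ci, gi in zip(x, c, g)) / n)
    return e_unmod, e_mod
```

For an amplitude-modulated input the modulated model absorbs the envelope and its error collapses, while the unmodulated model retains a large residual; for an unmodulated input the two errors coincide, which is exactly the separation the classifier exploits.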


2021 ◽  
Vol 44 ◽  
Author(s):  
Robert M. Gordon

Abstract The target article presents strong empirical evidence that knowledge is basic. However, it offers an unsatisfactory account of what makes knowledge basic. Some current ideas in cognitive neuroscience – predictive coding and analysis by synthesis – point to a more plausible account that better explains the evidence.


2020 ◽  
Vol 6 (10) ◽  
pp. eaax5979 ◽  
Author(s):  
Ilker Yildirim ◽  
Mario Belledonne ◽  
Winrich Freiwald ◽  
Josh Tenenbaum

Vision not only detects and recognizes objects, but also performs rich inferences about the underlying scene structure that causes the patterns of light we see. Inverting generative models, or "analysis-by-synthesis", presents a possible solution, but its mechanistic implementations have typically been too slow for online perception, and their mapping to neural circuits remains unclear. Here we present a neurally plausible, efficient inverse graphics model and test it in the domain of face recognition. The model is based on a deep neural network that learns to invert a three-dimensional face graphics program in a single fast feedforward pass. It explains human behavior qualitatively and quantitatively, including the classic "hollow face" illusion, and it maps directly onto a specialized face-processing circuit in the primate brain. The model fits both behavioral and neural data better than state-of-the-art computer vision models, and suggests an interpretable reverse-engineering account of how the brain transforms images into percepts.
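The contrast this abstract draws, iterative analysis-by-synthesis versus a single feedforward pass, rests on the basic inversion loop: propose scene parameters, render them, and keep the proposal whose rendering best matches the observation. A toy one-dimensional generative model illustrates the loop (a hypothetical sketch; the paper's generative model is a 3D face graphics program, not this Gaussian bump):

```python
import math

def render(pos, width=20):
    """Toy generative model: a one-dimensional 'scene' containing a
    Gaussian bump centered at latent position pos."""
    return [math.exp(-0.5 * ((i - pos) / 2.0) ** 2) for i in range(width)]

def invert_by_synthesis(image, candidates):
    """Analysis by synthesis: render each candidate latent and return
    the one whose rendering best matches the observed image."""
    def err(pos):
        r = render(pos, len(image))
        return sum((a - b) ** 2 for a, b in zip(r, image))
    return min(candidates, key=err)
```

The search over candidates is what makes classical analysis-by-synthesis slow; the model in the paper instead trains a network to map the image straight to the latents, amortizing this search into one feedforward pass.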


2019 ◽  
Author(s):  
Li Zhaoping

Visual attention selects only a tiny fraction of visual input information for further processing. Selection starts in the primary visual cortex (V1), which creates a bottom-up saliency map to guide the fovea to selected visual locations via gaze shifts. This motivates a new framework that views vision as consisting of encoding, selection, and decoding stages, placing selection on center stage. It suggests a massive loss of non-selected information from V1 downstream along the visual pathway. Hence, feedback from downstream visual cortical areas to V1 for better decoding (recognition), through analysis-by-synthesis, should query for additional information and be mainly directed at the foveal region. Accordingly, non-foveal vision is not only poorer in spatial resolution, but also more susceptible to many illusions.


2019 ◽  
Author(s):  
Sankar Mukherjee ◽  
Alice Tomassini ◽  
Leonardo Badino ◽  
Aldo Pastore ◽  
Luciano Fadiga ◽  
...  

Abstract Cortical entrainment to the (quasi-)rhythmic components of speech seems to play an important role in speech comprehension. It has been suggested that neural entrainment may reflect top-down temporal predictions of sensory signals. Key properties of a predictive model are its anticipatory nature and its ability to reconstruct missing information. Here we put both properties to experimental test. We acoustically presented sentences and measured cortical entrainment to both the acoustic speech envelope and the lips kinematics acquired from the speaker but not visible to the participants. We then analyzed speech-brain and lips-brain coherence at multiple negative and positive lags. Besides the well-known cortical entrainment to the acoustic speech envelope, we found significant entrainment in the delta range to the (latent) lips kinematics. Most interestingly, the two entrainment phenomena were temporally dissociated. While entrainment to the acoustic speech peaked around a +0.3 s lag (i.e., when the EEG followed speech by 0.3 s), entrainment to the lips was significantly anticipated and peaked around a 0-0.1 s lag (i.e., when the EEG was virtually synchronous with the putative lips movement). Our results demonstrate that neural entrainment during speech listening involves the anticipatory reconstruction of missing information related to lips movement production, indicating its fundamentally predictive nature and thus supporting analysis-by-synthesis models.
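The lag analysis described here can be illustrated with a minimal lagged-correlation sketch. This is a simplified stand-in, not the study's method: the paper computes spectral coherence between EEG and the stimulus signals, whereas the sketch below uses plain Pearson correlation at each lag to locate the peak.

```python
import math

def lagged_corr(x, y, lag):
    """Pearson correlation of x[i] with y[i + lag] over the overlap.
    Positive lag means signal y follows signal x."""
    pairs = [(x[i], y[i + lag]) for i in range(len(x))
             if 0 <= i + lag < len(y)]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    vx = sum((a - mx) ** 2 for a, _ in pairs)
    vy = sum((b - my) ** 2 for _, b in pairs)
    return cov / math.sqrt(vx * vy)

def peak_lag(x, y, max_lag):
    """Lag (in samples) at which y best tracks x, e.g. stimulus x and
    neural signal y; a positive result means y trails x."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: lagged_corr(x, y, lag))
```

In the study's terms, a positive peak lag corresponds to the EEG following the acoustic envelope, while a peak near zero (or negative) for the unseen lips signal is the anticipatory signature the authors report.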

