Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion

Author(s):  
Suzhen Wang ◽  
Lincheng Li ◽  
Yu Ding ◽  
Changjie Fan ◽  
Xin Yu

We propose an audio-driven talking-head method that generates photo-realistic talking-head videos from a single reference image. In this work, we tackle two key challenges: (i) producing natural head motions that match speech prosody, and (ii) maintaining the speaker's appearance under large head motions while stabilizing the non-face regions. We first design a head pose predictor by modeling rigid 6D head movements with a motion-aware recurrent neural network (RNN). The predicted head poses act as the low-frequency holistic movements of a talking head, allowing our subsequent network to focus on generating detailed facial movements. To depict the entire image motion arising from audio, we exploit a keypoint-based dense motion field representation. We then develop a motion field generator to produce dense motion fields from the input audio, head poses, and reference image. Because this keypoint-based representation models the motions of facial regions, head, and background integrally, our method can better constrain the spatial and temporal consistency of the generated videos. Finally, an image generation network renders photo-realistic talking-head videos from the estimated keypoint-based motion fields and the input reference image. Extensive experiments demonstrate that our method produces videos with plausible head motions, synchronized facial expressions, and stable backgrounds, and that it outperforms the state of the art.
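As a concrete illustration of the first stage, here is a minimal sketch (not the authors' code) of an RNN that maps a sequence of audio features to rigid 6D head poses; the feature dimension, layer sizes, and LSTM choice are assumptions for illustration.

```python
import torch
import torch.nn as nn

class HeadPosePredictor(nn.Module):
    """Sketch of an audio-to-head-pose RNN (dimensions are assumed)."""
    def __init__(self, audio_dim=80, hidden_dim=256, pose_dim=6):
        super().__init__()
        # Recurrent backbone: models the low-frequency, holistic head trajectory.
        self.rnn = nn.LSTM(audio_dim, hidden_dim, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_dim, pose_dim)

    def forward(self, audio_feats):           # (batch, time, audio_dim)
        h, _ = self.rnn(audio_feats)
        return self.head(h)                   # (batch, time, 6): one pose per frame

poses = HeadPosePredictor()(torch.randn(1, 100, 80))  # e.g. 100 audio frames
print(poses.shape)                            # torch.Size([1, 100, 6])
```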

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Magdalena Janc ◽  
Mariola Sliwinska-Kowalska ◽  
Piotr Politanski ◽  
Marek Kaminski ◽  
Magdalena Jozefowicz-Korczynska ◽  
...  

The aim of our study was to validate the method of head-shake static posturography (HS-posturography) in healthy individuals and to establish the value of this novel method in the diagnosis of patients with unilateral vestibular lesion (UV). The study included 202 participants divided into two groups: 133 patients with canal paresis (CP > 19%) and 69 healthy subjects. Each participant was tested according to the standard protocol of static posturography (SP) and with head movements at 0.3 Hz (HS 40) and 0.6 Hz (HS 70), in random order, paced by a metronome. HS-posturography showed repeatability and internal consistency similar to standard posturography. In patients with UV, the 4th condition showed higher sensitivity (74%) and specificity (71%) in HS 40 than in standard posturography (67% and 65%, respectively) and HS 70 (54% and 70%, respectively). Both static posturography and HS-posturography proved highly reliable. Adding head movements to static posturography improves the sensitivity and specificity of the method in the group with vestibular impairment. The most useful condition for that purpose appears to be the one on an unstable surface with eyes closed and a low frequency of head movements.
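For reference, the reported rates follow the standard confusion-matrix definitions. A minimal sketch, with hypothetical counts chosen only to reproduce the HS 40 figures (the paper reports the rates, not the counts):

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)   # true-positive rate among the 133 UV patients

def specificity(tn, fp):
    return tn / (tn + fp)   # true-negative rate among the 69 healthy subjects

# Hypothetical counts: 98 of 133 patients flagged, 49 of 69 controls cleared.
print(round(sensitivity(98, 35), 2))   # 0.74
print(round(specificity(49, 20), 2))   # 0.71
```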


2001 ◽  
Vol 11 (1) ◽  
pp. 3-12
Author(s):  
Ji Soo Kim ◽  
James A. Sharpe

The effects of aging on the vertical vestibulo-ocular reflex (VOR) and its interactions with vision during active head motion had not been investigated. We measured smooth pursuit, combined eye-head tracking, the VOR, and its visual enhancement and cancellation during active head motion in pitch, using a magnetic search coil technique in 21 younger (age < 65) and 10 elderly (age ≥ 65) subjects. With the head immobile, subjects pursued a target moving sinusoidally over a frequency range of 0.125 to 2.0 Hz, with peak target accelerations (PTAs) ranging from 12 to 789°/s². Combined eye-head tracking, the VOR in darkness, and its visual enhancement during fixation of an earth-fixed target (VVOR) were measured during active sinusoidal head motion with a peak-to-peak amplitude of 20° at frequencies of 0.25, 0.5, 1.0 and 2.0 Hz. The efficacy of VOR cancellation was determined from VOR gains during combined eye-head tracking. VOR and VVOR gains were symmetrical in both directions and did not change with aging, except for reduced gains of the downward VOR and VVOR at low frequency (0.25 Hz). However, in the elderly, smooth pursuit and combined eye-head tracking gains and the efficacy of VOR cancellation were significantly lower than in younger subjects. In both the young and elderly groups, VOR gain in darkness did not vary with the frequency of active head motion, while the gains of smooth pursuit, combined eye-head tracking, and VVOR declined with increasing target frequency. VOR and VVOR performance in the elderly implicates relative preservation of the neural structures subserving vertical vestibular smooth eye motion in senescence.
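A minimal sketch of how such a gain can be computed: estimate the amplitude of the eye- and head-velocity traces at the stimulus frequency and take their ratio. The synthetic signals, sample rate, and least-squares fit are assumptions for illustration, not the authors' search-coil pipeline.

```python
import numpy as np

fs, f = 500.0, 1.0                      # sample rate (Hz), head frequency (Hz)
t = np.arange(0, 4, 1 / fs)
head_vel = 40 * np.sin(2 * np.pi * f * t)               # head velocity, deg/s
eye_vel = -0.95 * head_vel + np.random.randn(t.size)    # compensatory eye velocity

# Least-squares amplitude of each trace at the stimulus frequency:
ref = np.column_stack([np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)])
amp = lambda x: np.hypot(*np.linalg.lstsq(ref, x, rcond=None)[0])
print(amp(eye_vel) / amp(head_vel))     # VOR gain, close to 0.95
```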


2006 ◽  
Vol 16 (1-2) ◽  
pp. 29-33
Author(s):  
Kim R. Gottshall ◽  
Michael E. Hoffer ◽  
Helen S. Cohen ◽  
Robert J. Moore

Study design: Four-group, between-subjects study. Objectives: To investigate the effects of exercise on the adaptation of normal subjects who had been artificially spatially disoriented. Background: Many patients referred for rehabilitation experience sensory changes, due to age or disease processes, and these changes affect motor skill. The best way to train patients to adapt to these changes and to improve their sensorimotor skills is unclear. Using normal subjects, we tested the hypothesis that active, planned head movement is needed to adapt to modified visual input. Methods and measures: Eighty male and female subjects with normal balance on computerized dynamic posturography (CDP) and the Dynamic Gait Index (DGI) were randomly assigned to four groups. All groups donned diagonally shifting lenses and were again assessed with CDP and DGI. The four groups were then treated for 20 min: Group 1 (control group) viewed a video; Group 2 performed exercise that involved translating the entire body through space, but without separate, volitional head movement; Group 3 performed exercises that all incorporated volitional, planned head rotations; and Group 4 performed exercises that involved translating the body (as in Group 2) while incorporating volitional, planned head motion (as in Group 3). All subjects were post-tested with CDP and DGI; the lenses were then removed, and subjects were retested with CDP and DGI. Results: The groups did not differ significantly on CDP scores, but Groups 3 and 4 had significantly better DGI scores than Groups 1 and 2. Conclusions: For adapting to sensorimotor change, active head movement that is specifically planned as part of the exercise, and that incorporates active use of the changed sensory modality (here, head motion), is more effective than passive attention or head movements that are not consciously planned.


2010 ◽  
Vol 1 (1) ◽  
Author(s):  
Takuma Otsuka ◽  
Kazuhiro Nakadai ◽  
Toru Takahashi ◽  
Kazunori Komatani ◽  
Tetsuya Ogata ◽  
...  

This paper presents voice-awareness control consistent with a robot's head movements. For natural spoken communication between robots and humans, robots must behave and speak the way humans expect them to. Consistency between a robot's voice quality and its body motion is one of the most striking factors in the naturalness of robot speech. Our control is based on a new model of spectral envelope modification for vertical head motion and left-right balance modulation for horizontal head motion. We assume that a pitch-axis rotation (vertical head motion) and a yaw-axis rotation (horizontal head motion) affect the voice quality independently. The spectral envelope modification model is constructed from an analysis of human vocalizations. The left-right balance model is established by measuring impulse responses with a pair of microphones. Experimental results show that the voice-awareness is perceivable in a robot-to-robot dialogue when the robots stand up to 150 cm apart. The dynamic change in voice quality is also confirmed in the experiment.
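A minimal sketch of the horizontal half of such a control: a left-right balance driven by the yaw angle. The constant-power panning law and angle range are assumptions for illustration; the paper derives its balance model from measured impulse responses.

```python
import numpy as np

def lr_balance(mono, yaw_deg, max_yaw=90.0):
    """Pan a mono voice signal left/right according to head yaw (assumed law)."""
    # Map yaw in [-max_yaw, max_yaw] to a pan angle in [0, pi/2].
    theta = (np.clip(yaw_deg / max_yaw, -1.0, 1.0) + 1.0) * np.pi / 4
    left, right = np.cos(theta) * mono, np.sin(theta) * mono   # constant power
    return np.stack([left, right], axis=-1)

stereo = lr_balance(np.random.randn(16000), yaw_deg=30.0)  # 1 s of audio at 16 kHz
```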


2020 ◽  
pp. 108705472091198
Author(s):  
Phoebe Thomson ◽  
Katherine A. Johnson ◽  
Charles B. Malpas ◽  
Daryl Efron ◽  
Emma Sciberras ◽  
...  

Objective: To characterize head movements in children with ADHD using an ex-Gaussian distribution and examine associations with out-of-scanner sustained attention. Method: Fifty-six children with ADHD and 61 controls aged 9 to 11 years completed the Sustained Attention to Response Task (SART) and resting-state functional magnetic resonance imaging (fMRI). In-scanner head motion was characterized using ex-Gaussian estimates for mu, sigma, and tau in the delta variation signal and framewise displacement. Sustained attention was evaluated through omission errors and tau in response time on the SART. Results: Mediation analysis revealed that out-of-scanner attention lapses (omissions during the SART) mediated the relationship between ADHD diagnosis and in-scanner head motion (tau in the delta variation signal), indirect effect: B = 1.29, 95% confidence interval (CI) = [0.07, 3.15], accounting for 29% of the association. Conclusion: Findings suggest a critical link between trait-level sustained attention and infrequent large head movements during scanning (tau in head motion) and highlight fundamental challenges in measuring the neural basis of sustained attention.
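The ex-Gaussian decomposition itself is easy to reproduce: scipy's exponnorm is the Gaussian-plus-exponential family, parameterized by K = tau/sigma. A minimal sketch on simulated framewise-displacement values (the study's data and preprocessing are not reproduced here):

```python
import numpy as np
from scipy.stats import exponnorm

rng = np.random.default_rng(0)
# Simulated framewise displacement: Gaussian body plus exponential tail (mm).
fd = rng.normal(0.1, 0.03, 300) + rng.exponential(0.15, 300)

K, mu, sigma = exponnorm.fit(fd)   # scipy returns (K, loc, scale)
tau = K * sigma                    # heavy tail: infrequent large movements
print(f"mu={mu:.3f}  sigma={sigma:.3f}  tau={tau:.3f}")
```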


2013 ◽  
Vol 275-277 ◽  
pp. 1403-1406
Author(s):  
Zheng Ru Tao ◽  
Xia Xin Tao

In the seismic analysis of a large-span bridge, inconsistent ground motions in three directions (lengthwise, lateral, and vertical) must be input at the base of each of the two main girder piers. A synthesized motion field for these inputs generally consists of two parts: a low-frequency ground motion calculated by a numerical method such as FEM, and a high-frequency motion synthesized by a stochastic approach. The former is three-dimensional, while the latter has only horizontal components. To adopt a synthesized motion field for the inputs, this paper introduces a simple interim way to prepare the vertical motion. The vertical acceleration time histories presented in the paper show that the approach is workable.
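A minimal sketch of the hybrid synthesis the abstract describes: merge a low-frequency deterministic component with a high-frequency stochastic one through complementary filters. The crossover frequency, filter order, and signals below are assumptions for illustration, not the paper's procedure.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs, fc = 100.0, 1.0                 # sample rate and crossover frequency (Hz)
t = np.arange(0, 20, 1 / fs)
low = 0.2 * np.sin(2 * np.pi * 0.3 * t)    # deterministic FEM-like component
high = 0.05 * np.random.randn(t.size)      # stochastic high-frequency component

# Complementary low-/high-pass filters around the crossover frequency:
b_lo, a_lo = butter(4, fc / (fs / 2), "low")
b_hi, a_hi = butter(4, fc / (fs / 2), "high")
vertical = filtfilt(b_lo, a_lo, low) + filtfilt(b_hi, a_hi, high)  # merged trace
```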

