Lip Synchronization
Recently Published Documents

TOTAL DOCUMENTS: 60 (five years: 13)
H-INDEX: 7 (five years: 1)

2021, Vol 2021, pp. 1-7
Author(s):  
Zhe Xu

3D lip synchronization is a popular but difficult topic in computer graphics, and performing it effectively and accurately is an important research direction in multimedia. On this basis, this paper introduces a comprehensive weighted algorithm that organizes the pronunciation rules and timing of lip movement in animated multimedia, performs vector-weight analysis on the texts in the animated multimedia, and synthesizes a matching evaluation model for 3D lip synchronization. At the same time, the goal of synchronization evaluation is achieved by synthesizing transitional mouth-shape sequences between consecutive key mouth shapes. Simulation results indicate that the comprehensive weighted algorithm is effective and can support the evaluation and analysis of 3D lip synchronization in animated multimedia.
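The abstract above describes two mechanisms: synthesizing transitional mouth shapes between consecutive key shapes, and a weighted matching evaluation. A minimal sketch of both, assuming mouth shapes are small parameter vectors and that per-frame weights come from the text analysis (all names and values here are hypothetical, not the paper's actual algorithm):

```python
# Hypothetical sketch: transitional mouth-shape blending plus a
# weighted matching score between predicted and reference sequences.

def blend_mouth_shapes(shape_a, shape_b, t):
    """Linearly interpolate a transitional mouth shape between two
    consecutive key mouth shapes (t in [0, 1])."""
    return [(1.0 - t) * a + t * b for a, b in zip(shape_a, shape_b)]

def weighted_match_score(predicted, reference, weights):
    """Weighted matching score: 1 - normalized weighted L1 distance.
    predicted/reference: lists of mouth-shape parameter vectors;
    weights: per-frame importance (e.g. from vector-weight text analysis)."""
    total_w = sum(weights)
    dist = 0.0
    for p, r, w in zip(predicted, reference, weights):
        frame_err = sum(abs(a - b) for a, b in zip(p, r)) / len(p)
        dist += w * frame_err
    return 1.0 - dist / total_w

key_a = [0.0, 0.2]  # e.g. mouth openness and width at one key phoneme
key_b = [0.8, 0.4]  # ... at the next key phoneme
mid = blend_mouth_shapes(key_a, key_b, 0.5)  # transitional shape
```

An evaluation pass would blend transitional shapes for every inter-phoneme gap and then score the full synthesized sequence against the reference with `weighted_match_score`.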


Author(s):  
Anung Rachman, Risanuri Hidayat, Hanung Adi Nugroho

Lip synchronization for animation can run automatically through a phoneme-to-viseme map. Because the complexity of the facial muscles makes mouth shapes vary greatly, phoneme-to-viseme mapping poses persistent challenges. One of them is the vowel-allophone problem: their acoustic resemblance leads many researchers to cluster allophones into a single class. This paper examines whether vowel allophones should be treated as distinct variables in the phoneme-to-viseme map. As the proposed method, vowel allophones are pre-processed by extracting formant-frequency features, which are then compared with a t-test to determine whether their differences are significant. The pre-processing results are then used as the initial reference data when building the phoneme-to-viseme maps. The study was conducted on maps and allophones of the Indonesian language. The resulting maps were compared with other maps using an HMM-based recognizer, measured by word correctness and accuracy. The results show that viseme mapping preceded by allophone pre-processing yields more accurate maps than the alternatives.
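The core decision in the pre-processing step above is whether two allophones differ significantly in their formant features. A minimal sketch of that decision using Welch's t-test on first-formant (F1) samples; the formant values below are made-up illustration data, not measurements from the paper:

```python
import math

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples
    (unequal variances allowed)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (n_a - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (n_b - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)

# Hypothetical F1 measurements (Hz) for two allophones of one vowel:
f1_open = [560, 580, 575, 590, 565, 585]
f1_closed = [410, 430, 425, 405, 420, 415]

t = welch_t(f1_open, f1_closed)
# A large |t| suggests the allophones are acoustically distinct and may
# deserve separate viseme classes; a small |t| supports clustering them
# into one class. The 2.0 cutoff is a rough critical value for
# illustration only; a real analysis would look up the p-value.
distinct = abs(t) > 2.0
```

In practice one would run such a test per formant (F1, F2, ...) over many speakers before deciding how the allophones enter the phoneme-to-viseme map.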


Author(s):  
Hao Zhu, Huaibo Huang, Yi Li, Aihua Zheng, Ran He

Talking face generation aims to synthesize a face video with precise lip synchronization and a smooth transition of facial motion over the entire video, given a speech clip and a facial image. Most existing methods focus either on disentangling the information in a single image or on learning temporal information between frames; however, cross-modality coherence between the audio and video information has not been well addressed during synthesis. In this paper, we propose a novel arbitrary talking face generation framework that discovers audio-visual coherence via the proposed Asymmetric Mutual Information Estimator (AMIE). In addition, we propose a Dynamic Attention (DA) block that selectively focuses on the lip area of the input image during training to further enhance lip synchronization. Experimental results on the benchmark LRW and GRID datasets surpass state-of-the-art methods on prevalent metrics, with robust high-resolution synthesis across gender and pose variations.
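The intuition behind the DA block's "selectively focusing the lip area" can be sketched as a spatial weight map that makes lip-region reconstruction errors count more in the training loss. This is only an illustration of the idea under the assumption of a fixed, known mouth rectangle; the paper's DA block learns its attention dynamically:

```python
# Minimal sketch: a static lip-region attention mask and a weighted
# reconstruction loss. All box coordinates and weights are hypothetical.

def lip_attention_mask(height, width, lip_box, lip_weight=2.0):
    """Return a height x width weight map with lip_weight inside
    lip_box = (top, left, bottom, right) and 1.0 elsewhere."""
    top, left, bottom, right = lip_box
    return [
        [lip_weight if top <= y < bottom and left <= x < right else 1.0
         for x in range(width)]
        for y in range(height)
    ]

def weighted_l1(pred, target, mask):
    """Per-pixel L1 loss where lip-region errors are weighted up."""
    num = sum(m * abs(p - t)
              for pr, tr, mr in zip(pred, target, mask)
              for p, t, m in zip(pr, tr, mr))
    den = sum(m for row in mask for m in row)
    return num / den

# Toy 4x4 "image" with the mouth assumed in the lower-centre region:
mask = lip_attention_mask(4, 4, lip_box=(2, 1, 4, 3))
```

A learned DA block would instead predict such a map per frame from features, so the emphasized region tracks the mouth as the head moves.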

