Multimodal Language
Recently Published Documents


TOTAL DOCUMENTS: 71 (five years: 28)
H-INDEX: 8 (five years: 4)

2022 · Vol 4
Author(s): Ziyan Yang, Leticia Pinto-Alva, Franck Dernoncourt, Vicente Ordonez

People can describe images in thousands of languages, but all languages share a single visual world. The aim of this work is to use the intermediate visual representations learned by a deep convolutional neural network to transfer information across languages for which paired data is not available in any form. We propose backpropagation-based decoding coupled with transformer-based multilingual-multimodal language models to obtain translations between any pair of languages seen during training. In particular, we demonstrate this approach on German-Japanese and Japanese-German sentence pairs, given training data of images freely associated with text in English, German, and Japanese, where no single image is annotated in both Japanese and German. We also show that the approach is useful for multilingual image captioning when sentences in a second language are available at test time. On the Multi30k dataset, our method compares favorably against recently proposed methods that likewise leverage images as an intermediate source for translation.
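The abstract describes "backpropagation-based decoding" only at a high level. Below is a minimal, hypothetical PyTorch sketch of one plausible reading: keep the multilingual-multimodal encoder frozen and optimize a soft target sentence until its encoding matches the source sentence's visually grounded representation. The encoder interface (`encode_soft`), sequence length, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def backprop_decode(encoder, src_repr, vocab_size, seq_len=16,
                    steps=200, lr=0.1):
    """Optimize soft target tokens so their encoding matches src_repr."""
    # One row of logits per target position; these are the only parameters
    # being optimized -- the encoder itself stays frozen.
    logits = torch.zeros(seq_len, vocab_size, requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        soft_tokens = F.softmax(logits, dim=-1)         # (seq_len, vocab_size)
        # Hypothetical hook: an encoder that accepts soft token mixtures
        # (e.g. by multiplying them into its embedding matrix), so that
        # gradients flow back into `logits`.
        tgt_repr = encoder.encode_soft(soft_tokens)
        loss = F.mse_loss(tgt_repr, src_repr.detach())  # match representations
        loss.backward()
        opt.step()
    return logits.argmax(dim=-1)  # discretize to hard token ids

# Hypothetical usage: encode a German sentence, decode Japanese tokens.
# src_repr = encoder.encode(tokenizer("Ein Hund läuft im Park", lang="de"))
# ja_ids = backprop_decode(encoder, src_repr, vocab_size=tokenizer.vocab_size)
```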


2021
Author(s): Vishal Anand, Raksha Ramesh, Boshen Jin, Ziyin Wang, Xiaoxiao Lei, ...

2021
Author(s): Aaron Eisermann, Jae Hee Lee, Cornelius Weber, Stefan Wermter

2021 · Vol 7 (s4)
Author(s): Paul Sambre

Abstract: This contribution examines how an expert musician teaches high pitch as an embodied practice in a digital instruction video. Musical meaning-making in this perspective calls for a naturalized phenomenology that deals with the practice of music teaching and the performing body it involves. The notion of high musical pitch as an abstract embodied image schema is challenged in favor of a multidimensional body schema, conceptualized at the interface between multimodal language, i.e. speech and gesture, and the affordances imposed on musical production by the human body and the instrument artefact. As a result, the traditional metaphorical reading of upward verticality, movement, and causal force in image schemata becomes a conceptual background that may mislead the prospective student, and needs to be enriched with natural, local corporeal dimensions: immobility and non-vertical change in the lips, mouth, and air flow. Such body schemata can be used to teach more dynamic, enactive knowledge of the body in interactive contexts of knowledge transmission.


2021 · Vol 63 · pp. 100903
Author(s): Kwangok Song, Kyle M. Williams, Diane L. Schallert, Alina Adonyi Pruitt

2021
Author(s): Zhonggen Yu

Abstract: The pandemic has forced billions of learners into multimodal language pedagogy. Combining Rain Classroom, WeChat, and Massive Open Online Courses (MOOCs) with the traditional face-to-face classroom, this study investigates the effectiveness of multimodal language pedagogy among English majors (N = 59). It concludes that multimodal language pedagogy may improve language learning outcomes compared with the traditional approach, although the approach remains disputed. The essence of multimodal language pedagogy may lie in appropriate design rather than in which digital tools are involved. Future research may focus on designing effective multimodal language pedagogy that reduces distractions and increases interactions.


2021 · Vol 66 · pp. 184-197
Author(s): Dimitris Gkoumas, Qiuchi Li, Christina Lioma, Yijun Yu, Dawei Song

2021
Author(s): Wim Pouw, Jan de Wit, Sara Bögels, Marlou Rasenberg, Branka Milivojevic, ...

Most manual communicative gestures that humans produce cannot be looked up in a dictionary: they are not conventionalized and derive their meaning in large part from the communicative context. However, the extent to which the communicative signal itself, bodily postures in movement, or kinematics, can inform about gesture semantics remains understudied. Can we construct, in principle, a distribution-based semantics of gesture kinematics, similar to how word vectorization methods in Natural Language Processing (NLP) are now widely used to study semantic properties of text and speech? For such a project to get off the ground, we need to know to what extent semantically similar gestures are also more likely to be kinematically similar. In Study 1 we assess whether semantic word2vec distances between the concepts participants were explicitly instructed to convey in silent gestures relate to the kinematic distances of those gestures as obtained from Dynamic Time Warping (DTW). In a second, director-matcher dyadic study we assess kinematic similarity between spontaneous co-speech gestures produced by interacting participants, who were asked before and after the interaction how they would name the objects; the semantic distances between the resulting names were then related to the kinematic distances of the gestures made while conveying those objects. We find that gestures' semantic relatedness reliably predicts their kinematic relatedness across these two highly divergent studies, which suggests that an NLP-style method for deriving semantic relatedness from kinematics is a promising avenue for automated multimodal recognition. Deeper implications for statistical learning processes in multimodal language are discussed.
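As a rough illustration of the pipeline this abstract outlines, the sketch below computes pairwise word2vec distances between concept labels and pairwise DTW distances between the corresponding gesture trajectories, then checks their monotone association with a rank correlation. The `concepts` mapping, the vector file path, and the trajectory format are assumptions for illustration, not the study's data or code.

```python
import numpy as np
from itertools import combinations
from scipy.stats import spearmanr
from gensim.models import KeyedVectors  # pretrained word2vec vectors

def dtw_distance(a, b):
    """Plain dynamic-time-warping distance between two (T, D) trajectories."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m]

def semantic_vs_kinematic(kv, concepts):
    """Correlate semantic distances of labels with kinematic distances."""
    sem, kin = [], []
    for (w1, t1), (w2, t2) in combinations(concepts.items(), 2):
        sem.append(kv.distance(w1, w2))   # cosine distance in word2vec space
        kin.append(dtw_distance(t1, t2))  # kinematic distance via DTW
    return spearmanr(sem, kin)            # rank correlation and p-value

# Hypothetical usage: `concepts` maps a label to a (T, D) motion-capture
# trajectory; "vectors.bin" is a placeholder path to pretrained vectors.
# kv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)
# rho, p = semantic_vs_kinematic(kv, concepts)
```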

