multimodal information
Recently Published Documents


TOTAL DOCUMENTS: 312 (five years: 102)

H-INDEX: 19 (five years: 5)

2022 · Vol 2022 · pp. 1-11
Author(s): Meichao Yan, Yu Wen, Qingxuan Shi, Xuedong Tian

Aiming at the shortcomings of traditional full-text retrieval models in handling mathematical expressions, which are special objects that differ from ordinary text, a multimodal retrieval and ranking method for scientific documents based on hesitant fuzzy sets (HFS) and XLNet is proposed. The method integrates multimodal information, such as mathematical expression images and their context text, as keywords to retrieve scientific documents. In the image modality, the images of mathematical expressions are recognized, and hesitant fuzzy set theory is introduced to calculate the hesitant fuzzy similarity between the query expressions and the mathematical expressions in candidate scientific documents. Meanwhile, in the text modality, XLNet is used to generate word vectors of the mathematical expression context to obtain the similarity between the query text and the expression context of the candidate documents. Finally, the two modal evaluations are integrated by constructing a hesitant fuzzy set at the document level to obtain the final score of each scientific document and the corresponding ranked output. The experimental results show that the recall and precision of this method are 0.774 and 0.663, respectively, on the NTCIR dataset, and the average normalized discounted cumulative gain (NDCG) of the top-10 ranking results is 0.880 on the Chinese scientific document (CSD) dataset.
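To make the score-fusion step concrete, the following is a minimal Python sketch, not the authors' implementation: it assumes a simple distance-based similarity between hesitant fuzzy elements and a weighted combination with a text-side similarity (e.g., cosine similarity of XLNet context embeddings). The padding convention, the weight alpha, and the function names are all illustrative assumptions.

```python
import numpy as np

def hesitant_fuzzy_similarity(h1, h2):
    """Similarity between two hesitant fuzzy elements (lists of membership
    degrees in [0, 1]) via a simple normalized distance. The padding and
    distance choices are illustrative; the paper's measure may differ."""
    h1, h2 = sorted(h1), sorted(h2)
    n = max(len(h1), len(h2))
    h1 = h1 + [h1[-1]] * (n - len(h1))   # pad the shorter element with its maximum
    h2 = h2 + [h2[-1]] * (n - len(h2))
    return 1.0 - float(np.mean([abs(a - b) for a, b in zip(h1, h2)]))

def document_score(expr_similarities, text_similarity, alpha=0.6):
    """Fuse expression-level similarities with the context-text similarity;
    alpha is an assumed weight, not taken from the paper."""
    expr_score = max(expr_similarities) if expr_similarities else 0.0
    return alpha * expr_score + (1.0 - alpha) * text_similarity

# Two candidate expressions matched against the query expression, plus one
# text-side similarity, give a single ranking score for the document.
sims = [hesitant_fuzzy_similarity([0.7, 0.8], [0.6, 0.9, 0.8]),
        hesitant_fuzzy_similarity([0.4, 0.5], [0.7, 0.8])]
print(document_score(sims, text_similarity=0.72))
```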


2022 · Vol 6 (1) · pp. 4
Author(s): Insook Choi

The article presents a contextual survey of the eight contributions in the special issue Musical Interactions (Volume I) of Multimodal Technologies and Interaction. The presentation includes (1) a critical examination of what it means to be musical, in order to devise a concept of music appropriate to MTI as well as to multicultural proximity, and (2) a conceptual framework for the instrumentation, design, and assessment of musical interaction research through five enabling dimensions: Affordance, Design Alignment, Adaptive Learning, Second-Order Feedback, and Temporal Integration. Each dimension is discussed and applied in the survey. The results demonstrate how the framework provides the interdisciplinary scope required for musical interaction, and how this approach may offer a coherent way to describe and assess approaches to research and design as well as implementations of interactive musical systems. Musical interaction stipulates musical liveness for experiencing both music and technologies. While music may be considered ontologically incomplete without a listener, musical interaction is defined as the ontological completion of a state of music and listening through a listener’s active engagement with musical resources in a multimodal information flow.


2021 · Vol 2021 · pp. 1-10
Author(s): Peng Li, Qian Wang

In order to further mine the deep semantic information in microbial texts on public health emergencies, this paper proposes a multichannel microbial sentiment analysis model, MCMF-A. First, word2vec and fastText are used to generate word vectors in the feature-embedding layer, and these are fused with lexical and positional feature vectors; second, a multichannel layer based on CNN and BiLSTM extracts local and global features of the microbial text; third, an attention layer extracts its important semantic features; finally, the multichannel outputs are merged in the fusion layer, and a softmax function in the output layer performs the sentiment classification. The results show that the F1 value of the MCMF-A sentiment analysis model reaches 90.21%, which is 9.71% and 9.14% higher than the benchmark CNN and BiLSTM models, respectively. The constructed dataset is, however, small, and multimodal information such as images and speech has not yet been considered.
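For orientation, here is a minimal PyTorch sketch of the kind of multichannel CNN + BiLSTM architecture with attention and a softmax output described above; the layer sizes, number of sentiment classes, and fusion details are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MCMFSketch(nn.Module):
    """Illustrative multichannel CNN + BiLSTM sentiment model with attention
    and a softmax output; all dimensions are assumptions."""
    def __init__(self, vocab_size, embed_dim=300, num_classes=3):
        super().__init__()
        # Pretrained word2vec/fastText vectors would be loaded into this embedding.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 128, kernel_size=3, padding=1)      # local features
        self.bilstm = nn.LSTM(embed_dim, 64, batch_first=True,
                              bidirectional=True)                            # global features
        self.attn = nn.Linear(256, 1)                                        # attention scores
        self.out = nn.Linear(256, num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)                                            # (B, T, E)
        cnn_feats = F.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)     # (B, T, 128)
        lstm_feats, _ = self.bilstm(x)                                       # (B, T, 128)
        fused = torch.cat([cnn_feats, lstm_feats], dim=-1)                   # channel fusion
        weights = torch.softmax(self.attn(fused).squeeze(-1), dim=1)         # (B, T)
        pooled = torch.sum(weights.unsqueeze(-1) * fused, dim=1)             # (B, 256)
        return torch.log_softmax(self.out(pooled), dim=-1)                   # class log-probs

logits = MCMFSketch(vocab_size=20000)(torch.randint(0, 20000, (2, 40)))
print(logits.shape)  # torch.Size([2, 3])
```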


Author(s): Ryoko Sasamoto, Stephen Doherty, Minako O’Hagan

Abstract The use of captions has grown in recent years in both traditional and new media, particularly in terms of the diversity of their style, content, and function. Impact captions have emerged as a popular form of captioning for hearing viewers; they contain rich multimodal information employed to capture viewer attention and enhance engagement, particularly where there is competition for that attention. Drawing on relevance theory, we argue that impact captions can effectively attract and hold visual attention owing to the balance they strike between processing effort and contextual effects. This exploratory study employs a dual-task paradigm with authentic materials and viewing situations to further examine the ability of multimodal impact captions to attract and retain overt visual attention in a small sample of TV viewers. Our results provide novel insight into the apparently highly individualised efficacy of impact captions, and we identify several variables of interest in participants’ viewing behaviours. We conclude with a discussion of the study’s contributions and limitations and an outline of future work.


2021 · pp. 1-21
Author(s): Yoshinobu Hagiwara, Keishiro Taguchi, Satoshi Ishibushi, Akira Taniguchi, Tadahiro Taniguchi

2021 · Vol 12
Author(s): Haihua Tu

With the development of science and education, English learning has become increasingly important. In the past, English learning relied mainly on rote, teacher-led instruction, and students were not very motivated to learn. The purpose of this article is to use an English cooperative learning model to improve students’ enthusiasm and initiative and to improve the efficiency of their English learning. A game-based team learning model is proposed, and a cooperative-competitive model of English learning based on multimodal information fusion is constructed. Its main feature is that students form small groups that compete with one another; because each student’s learning performance serves the shared interest of the whole group, the group as a whole becomes more competitive. Drawing on the subject-association model in the literature, English grammar, vocabulary, and language-perception training are adjusted so that students learn together through team communication and improve their abilities in multiple respects. Finally, a questionnaire was designed. The results show that, after changing the English team learning mode and optimizing the learning support system of the students’ English learning teams, the cooperative-competitive English learning model based on multimodal information fusion proposed in this article can improve the learning effect by 55%-60%. In English teaching as a whole, the two dimensions of professional knowledge and English ability training are not mutually orthogonal and mutually exclusive, but mutually supportive and interdependent. To form an effective “student-centered and teacher-led” teaching model, active and rich communication and feedback in the classroom are key, and they also help to form a gradual cycle of teaching and learning.


2021 · Vol 2021 · pp. 1-11
Author(s): Shaosong Dou, Zhiquan Feng, Jinglan Tian, Xue Fan, Ya Hou, ...

This paper proposes an intention understanding algorithm (KDI) for an elderly-service robot that combines a neural network with a semi-naive Bayesian classifier to infer the user’s intention. The KDI algorithm uses a CNN to analyze gesture and action information, and YOLOv3 for object detection to provide scene information. These features are then fed into the semi-naive Bayesian classifier, with key attributes set as the super-parent to enhance their contribution to an intention, realizing intention understanding based on prior knowledge. In addition, we introduce the actual distance between the user and objects and assign each object a different purpose, implementing intention understanding based on object-user distance. The two methods are combined to strengthen intention understanding. The main contributions of this paper are as follows: (1) an intention reasoning model (KDI) based on prior knowledge and distance is proposed, which combines a neural network with a semi-naive Bayesian classifier; (2) a robot companion system is built around the robot and applied in the elderly-service scene.
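As a rough illustration of how these pieces could fit together, the sketch below fuses a CNN gesture label and a YOLOv3 object label with a Bayesian-style prior and a distance-based boost. The intention labels, probability tables, object-purpose mapping, and distance weighting are all hypothetical and only approximate the paper’s semi-naive Bayesian classifier with a super-parent attribute.

```python
INTENTS = ["fetch_water", "read_book", "watch_tv"]          # hypothetical intentions

# Prior knowledge: P(intent) and P(gesture | intent) as simple lookup tables,
# plus a purpose assigned to each detectable object (all values invented).
PRIOR = {"fetch_water": 0.4, "read_book": 0.3, "watch_tv": 0.3}
GESTURE_LIKELIHOOD = {
    ("point", "fetch_water"): 0.6, ("point", "read_book"): 0.2, ("point", "watch_tv"): 0.2,
    ("wave", "fetch_water"): 0.2,  ("wave", "read_book"): 0.3,  ("wave", "watch_tv"): 0.5,
}
OBJECT_PURPOSE = {"cup": "fetch_water", "book": "read_book", "remote": "watch_tv"}

def infer_intention(gesture, detected_object, object_distance_m):
    """Fuse the CNN gesture label, the YOLOv3 object label, and the measured
    user-object distance into a normalized intention distribution."""
    distance_weight = 1.0 / (1.0 + object_distance_m)        # nearer objects count more
    scores = {}
    for intent in INTENTS:
        p = PRIOR[intent] * GESTURE_LIKELIHOOD.get((gesture, intent), 0.1)
        if OBJECT_PURPOSE.get(detected_object) == intent:
            p *= 1.0 + distance_weight                        # distance-based boost
        scores[intent] = p
    total = sum(scores.values())
    return {intent: score / total for intent, score in scores.items()}

# A pointing gesture toward a nearby cup favors the "fetch_water" intention.
print(infer_intention("point", "cup", object_distance_m=0.8))
```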


2021
Author(s): Taner Cagali, Mehrnoosh Sadrzadeh, Chris Newell

2021 · Vol 15
Author(s): Kai Chen, Lijie Wang, Jianguang Zeng, Ai Chen, Zhao Gao, ...

The association cortices of the brain are essential for integrating the multimodal information that subserves complex, high-order cognitive functions. Delineating the changing pattern of association cortices can provide critical insight into brain development, aging, plasticity, and disease-triggered functional abnormalities. However, how to quantitatively characterize the association capability of the brain remains elusive. Here, we developed a new voxel-level association index (Asso) to quantitatively characterize the brain’s association ability. Using the Asso method, we found high Asso values in association cortical networks and low values in visual and limbic networks, suggesting a significant gradient distribution across neural functions. The spatial distribution patterns of Asso show high similarity across different thresholds, indicating that Asso mapping is a threshold-free method. In addition, compared with functional connectivity strength, i.e., the degree centrality method, Asso mapping showed different patterns for association cortices and primary cortices. Finally, the Asso method was applied to investigate aging effects and identified findings similar to those of previous studies. All these results indicate that Asso can effectively characterize the brain’s association patterns and opens a new avenue for revealing the neural basis of development, aging, and brain disorders.
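The abstract does not give the exact formulation of Asso, so the sketch below only illustrates the baseline it is compared against: voxel-wise functional connectivity strength (degree centrality) computed from a correlation matrix of BOLD time series. The correlation threshold and data shapes are illustrative assumptions.

```python
import numpy as np

def degree_centrality(timeseries, threshold=0.25):
    """timeseries: (n_timepoints, n_voxels) array of BOLD signals.
    Returns, per voxel, the number of voxels whose Pearson correlation
    with it exceeds the threshold (binarized degree centrality)."""
    z = (timeseries - timeseries.mean(axis=0)) / timeseries.std(axis=0)
    corr = (z.T @ z) / timeseries.shape[0]        # voxel-by-voxel correlation matrix
    np.fill_diagonal(corr, 0.0)                   # ignore self-connections
    return (corr > threshold).sum(axis=1)

rng = np.random.default_rng(0)
toy = rng.standard_normal((200, 500))             # toy data: 200 volumes, 500 voxels
print(degree_centrality(toy)[:5])
```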


2021 · Vol 15
Author(s): Shoji Tanaka, Eiji Kirino

Performing an opera requires singers on stage to process mental imagery and theory-of-mind tasks in conjunction with singing and action control. Although it is conceivable that the precuneus, as a posterior hub of the default mode network, plays an important role in opera performance, how the precuneus contributes to opera performance has not yet been elucidated. In this study, we aimed to investigate the contribution of the precuneus to singing in an opera. Since the precuneus processes mental scenes, which are multimodal and integrative, we hypothesized that it is involved in opera performance by integrating the multimodal information required for performing a character in an opera. We tested this hypothesis by analyzing the functional connectivity of the precuneus during imagined singing and rest. This study included 42 opera singers who underwent functional magnetic resonance imaging while performing “imagined operatic singing” with their eyes closed. During imagined singing, the precuneus showed increased functional connectivity with brain regions related to language, mirror-neuron, socio-cognitive/emotional, and reward processing. Our findings suggest that, with the aid of its widespread connectivity, the precuneus and its network allow the embodiment and multimodal integration of mental scenes, information processing that is necessary for imagined singing as well as for performing an opera. We propose a novel role of the precuneus in opera performance.
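As a generic illustration of this kind of analysis (not the study’s actual pipeline), the sketch below computes seed-based functional connectivity for a precuneus seed against a set of target regions in two conditions and contrasts them; the region count, run length, and Fisher z step are standard choices assumed here.

```python
import numpy as np

def seed_connectivity(seed_ts, region_ts):
    """seed_ts: (n_timepoints,) precuneus seed time series;
    region_ts: (n_timepoints, n_regions) target-region time series.
    Returns Fisher z-transformed Pearson correlations per region."""
    r = np.array([np.corrcoef(seed_ts, region_ts[:, i])[0, 1]
                  for i in range(region_ts.shape[1])])
    return np.arctanh(r)                           # Fisher z before group statistics

rng = np.random.default_rng(1)
singing = seed_connectivity(rng.standard_normal(300), rng.standard_normal((300, 10)))
rest = seed_connectivity(rng.standard_normal(300), rng.standard_normal((300, 10)))
print(singing - rest)                              # condition contrast per target region
```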

