Multimodal Information Fusion
Recently Published Documents


TOTAL DOCUMENTS

41
(FIVE YEARS 16)

H-INDEX

9
(FIVE YEARS 2)

2021 ◽  
Vol 12 ◽  
Author(s):  
Haihua Tu

With the development of science and education, English learning has become increasingly important. In the past, English instruction was largely teacher-led lecturing, and students were not strongly motivated to learn. The purpose of this article is to use a cooperative English-learning model to improve students' enthusiasm and initiative and to raise the efficiency of their English learning. A game-based team learning model is proposed: this article constructs a cooperative and competitive model of English learning based on multimodal information fusion. Students form small groups, and the groups compete with one another; because each student's performance contributes to the shared interest of the whole group, individual effort and group success reinforce each other. Drawing on the main-body association model in the literature, the model adjusts training in English grammar, vocabulary, and language perception, with learning through team communication used to develop students' abilities on multiple fronts. Finally, a questionnaire survey was conducted. The results show that after the English team learning mode was changed and the support system for students' English learning teams was optimized, the proposed cooperation-and-competition model based on multimodal information fusion improved the learning effect by 55%-60%. In English teaching as a whole, the two dimensions of professional knowledge and English ability training are not orthogonal or mutually exclusive but mutually supportive and interdependent. To form an effective "student-centered, teacher-led" teaching model, active and rich communication and feedback in the classroom are key, and they also help establish a progressive cycle of teaching and learning.
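The group-interest mechanism can be illustrated with a minimal sketch: each student's score counts toward the group total, so cooperation within a group and competition between groups pull in the same direction. Group names and scores below are invented for the example, not taken from the study.

```python
# Minimal sketch of group scoring in the cooperative-competitive model:
# a student's score contributes to their group's total, and groups are
# then ranked against each other. All names and scores are illustrative.

groups = {
    "Group A": {"Li": 78, "Wang": 85},
    "Group B": {"Zhao": 90, "Chen": 70},
}

# Each group's standing is the combined score of its members,
# so every individual improvement lifts the whole group.
group_totals = {g: sum(members.values()) for g, members in groups.items()}
ranking = sorted(group_totals, key=group_totals.get, reverse=True)
print(ranking)  # groups ordered by combined score
```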


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Shaosong Dou ◽  
Zhiquan Feng ◽  
Jinglan Tian ◽  
Xue Fan ◽  
Ya Hou ◽  
...  

This paper proposes an intention-understanding algorithm (KDI) for an elderly-service robot, which combines a neural network with a semi-naive Bayesian classifier to infer the user's intention. KDI uses a CNN to analyze gesture and action information, and YOLOv3 for object detection to provide scene information. These cues are then fed into a semi-naive Bayesian classifier, with key attributes set as the super-parent to enhance their contribution to an intent, realizing intention understanding based on prior knowledge. In addition, we introduce the actual distance between the user and objects and assign each object a distinct purpose, implementing intention understanding based on object-user distance. The two methods are combined to strengthen intention understanding. The main contributions of this paper are as follows: (1) an intention-reasoning model (KDI) based on prior knowledge and distance is proposed, combining a neural network with a semi-naive Bayesian classifier; (2) a robot companion system built on this model is applied in the elderly-service scene.
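The "super parent" device in the semi-naive Bayesian classifier can be made concrete with a small sketch. In a super-parent one-dependence estimator, one attribute (here the CNN's gesture label) is promoted to super-parent, and every other attribute is conditioned on both the class and that attribute. All attribute names and training rows below are illustrative, not from the paper.

```python
from collections import defaultdict

def train_spode(rows, sp_index):
    """rows: list of (attributes, label). Returns count tables for a
    super-parent one-dependence estimator with attribute sp_index as
    the super-parent."""
    class_n = defaultdict(int)
    sp_n = defaultdict(int)    # counts for (label, sp_value)
    attr_n = defaultdict(int)  # counts for (label, sp_value, attr_index, attr_value)
    for attrs, label in rows:
        class_n[label] += 1
        sp = attrs[sp_index]
        sp_n[(label, sp)] += 1
        for i, v in enumerate(attrs):
            if i != sp_index:
                attr_n[(label, sp, i, v)] += 1
    return class_n, sp_n, attr_n

def predict(tables, sp_index, attrs):
    """Laplace-smoothed P(c) * P(sp|c) * prod_i P(a_i | c, sp)."""
    class_n, sp_n, attr_n = tables
    total = sum(class_n.values())
    sp = attrs[sp_index]
    best, best_p = None, -1.0
    for label, n in class_n.items():
        p = n / total
        p *= (sp_n[(label, sp)] + 1) / (n + 2)
        for i, v in enumerate(attrs):
            if i != sp_index:
                p *= (attr_n[(label, sp, i, v)] + 1) / (sp_n[(label, sp)] + 2)
        if p > best_p:
            best, best_p = label, p
    return best

# Toy data: (gesture, detected_object, room) -> intention
rows = [
    (("point", "cup", "kitchen"), "fetch_drink"),
    (("point", "cup", "living"), "fetch_drink"),
    (("wave", "phone", "living"), "make_call"),
    (("wave", "phone", "kitchen"), "make_call"),
]
tables = train_spode(rows, sp_index=0)  # gesture as super-parent
print(predict(tables, 0, ("point", "cup", "living")))  # → fetch_drink
```

Conditioning the other attributes on the gesture is what "enhances its contribution to an intent": the gesture reweights every other evidence term rather than entering as one independent factor.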


2021 ◽  
Author(s):  
Zhibing Xie

Understanding human emotional states is indispensable for daily interaction, and fully exploiting users' affective states enables a more natural and friendly human-computer interaction (HCI) experience. In emotion recognition, multimodal information fusion is widely used to discover the relationships among multiple information sources and to make joint use of several channels, such as speech, facial expression, gesture, and physiological signals. This thesis proposes a new emotion-recognition framework using information fusion based on the estimation of information entropy. Novel information-theoretic-learning techniques are applied to feature-level fusion and score-level fusion. The most critical issues for feature-level fusion are feature transformation and dimensionality reduction. Existing methods depend on second-order statistics, which are optimal only for Gaussian-like distributions. By incorporating information-theoretic tools, a new feature-level fusion method based on kernel entropy component analysis is proposed. For score-level fusion, most previous methods rely on predefined, usually heuristic, rule-based approaches. In this thesis, a connection between information fusion and the maximum correntropy criterion is established for effective score-level fusion. The feature-level and score-level fusion methods are then combined into a two-stage fusion platform. The proposed methods are applied to audiovisual emotion recognition, and their effectiveness is evaluated on two publicly available audiovisual emotion databases. The experimental results demonstrate that the proposed algorithms outperform existing methods.
The work of this thesis offers a promising direction for designing more advanced emotion-recognition systems based on multimodal information fusion and is significant for the development of intelligent human-computer interaction systems.
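The maximum correntropy criterion invoked for score-level fusion rests on the correntropy measure, which is simple enough to sketch directly. Correntropy replaces the squared error with a Gaussian kernel of the error, so a gross outlier in one score stream saturates the kernel instead of dominating the objective, as squared error would. The bandwidth and score values below are illustrative.

```python
import math

def correntropy(x, y, sigma=1.0):
    """V(x, y) = E[ k_sigma(x - y) ] with a Gaussian kernel;
    sigma is a free bandwidth parameter."""
    assert len(x) == len(y)
    return sum(
        math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for a, b in zip(x, y)
    ) / len(x)

clean = [0.1, 0.2, 0.3, 0.4]
noisy = [0.1, 0.2, 0.3, 0.5]    # small deviation on one score
outlier = [0.1, 0.2, 0.3, 5.0]  # one gross error

# Unlike mean-squared error, the single outlier cannot dominate:
# its kernel value saturates near zero while the matching entries
# keep the average high.
print(correntropy(clean, noisy))    # close to 1
print(correntropy(clean, outlier))  # still about 0.75
```

This robustness to heavy-tailed errors is what makes correntropy attractive for combining classifier scores compared with second-order criteria.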


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Hongli Zhang

A cross-modal speech-text retrieval method using an interactive-learning convolutional autoencoder (CAE) is proposed. First, an interactive-learning autoencoder structure is designed, with two inputs (speech and text) and processing stages including encoding, hidden-layer interaction, and decoding, to model cross-modal speech-text retrieval. Then, the raw audio signal is preprocessed and Mel-frequency cepstral coefficient (MFCC) features are extracted. The bag-of-words model is used to extract text features, and an attention mechanism then combines the text and speech features. Through the interactive-learning CAE, shared features of the speech and text modalities are obtained and fed to a modality classifier that identifies modality information, realizing cross-modal speech-text retrieval. Finally, experiments show that the proposed algorithm outperforms the baseline algorithms in recall, accuracy, and false-recognition rate.
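As a rough illustration of the bag-of-words text features that feed the text branch of the CAE, the sketch below builds a vocabulary and turns a sentence into a term-count vector. The vocabulary and sentences are invented for the example, not from the paper.

```python
# Minimal bag-of-words sketch: a fixed vocabulary maps each word to an
# index, and a document becomes a vector of word counts over that
# vocabulary. Out-of-vocabulary words are simply dropped.

def build_vocab(docs):
    vocab = sorted({w for d in docs for w in d.lower().split()})
    return {w: i for i, w in enumerate(vocab)}

def bow_vector(doc, vocab):
    vec = [0] * len(vocab)
    for w in doc.lower().split():
        if w in vocab:
            vec[vocab[w]] += 1
    return vec

docs = ["play the next song", "pause the song"]
vocab = build_vocab(docs)
print(bow_vector("play the song", vocab))
```

In the retrieval pipeline these count vectors (and the MFCC frames on the audio side) would be the raw inputs whose encoded representations are aligned in the shared space.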


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Dongli Wu ◽  
Jia Chen ◽  
Wei Deng ◽  
Yantao Wei ◽  
Heng Luo ◽  
...  

Video-based teaching reflection is a core method in teacher education and professional development. However, analyzing videos takes a long time, and teachers easily fall into a state of information overload. With the development of "AI + education," automatic recognition of teacher behavior to support teaching reflection has become an important research topic. In this paper, taking online open classroom teaching videos as the data source, we collected and constructed a teacher-behavior dataset. Using this dataset, we explored behavior-recognition methods based on RGB video and skeleton information, and fused the two to improve recognition accuracy. The experimental results show that fusing RGB and skeleton information improves recognition accuracy, and that early fusion outperforms late fusion. This study helps address the time cost and information overload of teaching reflection, thereby helping teachers optimize their teaching strategies and improve teaching efficiency.
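The early-versus-late fusion comparison can be sketched abstractly: in early fusion one classifier sees the concatenated RGB-and-skeleton feature vector, while in late fusion each modality gets its own classifier and the per-class scores are averaged. The feature vectors, toy linear scorers, and class labels below are illustrative stand-ins for the actual networks, not the study's models.

```python
# Minimal sketch contrasting early (feature-level) and late (score-level)
# fusion of an RGB stream and a skeleton stream.

def scores(features, weights):
    """Toy linear classifier: one weight row per behavior class."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

rgb = [0.2, 0.8]   # RGB-stream features (illustrative)
skel = [0.9, 0.1]  # skeleton features (illustrative)

# Early fusion: concatenate the modalities, classify the joint vector,
# so one model can learn cross-modal interactions.
w_joint = [[1.0, 0.0, 0.0, 1.0],   # class 0, e.g. "writing on board"
           [0.0, 1.0, 1.0, 0.0]]   # class 1, e.g. "lecturing"
early = scores(rgb + skel, w_joint)

# Late fusion: classify each modality separately, then average the
# class scores; the modalities never interact before the decision.
w_rgb = [[1.0, 0.0], [0.0, 1.0]]
w_skel = [[0.0, 1.0], [1.0, 0.0]]
late = [(a + b) / 2 for a, b in zip(scores(rgb, w_rgb), scores(skel, w_skel))]

print(early, late)
```

That early fusion lets one model exploit correlations between modalities before the decision is a plausible reading of why it outperformed late fusion here, though the study itself reports only the empirical comparison.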

