A Multimodal Fusion Method Based on a Rotation Invariant Hierarchical Model for Finger-based Recognition

2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Hu Zhu ◽  
Ze Wang ◽  
Yu Shi ◽  
Yingying Hua ◽  
Guoxia Xu ◽  
...  

Multimodal fusion is one of the popular directions of multimodal research and an emerging field of artificial intelligence. It aims to exploit the complementarity of heterogeneous data and to provide reliable classification for the model. Multimodal data fusion transforms data from multiple single-modality representations into a compact multimodal representation. Most previous studies in this field used tensor-based multimodal representations; as the input is converted into a tensor, the dimensionality and computational complexity grow exponentially. In this paper, we propose a low-rank tensor multimodal fusion method with an attention mechanism, which improves efficiency and reduces computational complexity. We evaluate our model on three multimodal fusion tasks based on the public datasets CMU-MOSI, IEMOCAP, and POM. Our model achieves good performance while flexibly capturing both global and local connections. Experiments show that, compared with other tensor-based multimodal fusion methods, our model consistently achieves better results under a series of attention mechanisms.
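As a rough illustration of why a low-rank formulation helps, the sketch below contrasts explicit tensor fusion (outer product of modality features, then a flat linear map) with a rank-r factorized version that never materializes the full tensor. The two-modality setup, all dimensions, and the random weights are hypothetical stand-ins, not the paper's actual model, and the attention mechanism is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

d_a, d_v, d_out, rank = 8, 6, 4, 3  # hypothetical feature/output dims and rank

# Append a constant 1 to each modality so unimodal terms survive the outer product.
z_a = np.concatenate([rng.standard_normal(d_a), [1.0]])
z_v = np.concatenate([rng.standard_normal(d_v), [1.0]])

# Explicit tensor fusion: outer product, then a linear map over the flattened tensor.
# The weight count scales multiplicatively with the modality dimensions.
full = np.outer(z_a, z_v).reshape(-1)              # (d_a+1)*(d_v+1) entries
W_full = rng.standard_normal((d_out, full.size))
h_full = W_full @ full

# Low-rank fusion: rank-many modality-specific factors; the fused output is a
# sum of elementwise products, so the outer-product tensor is never formed.
W_a = rng.standard_normal((rank, d_out, d_a + 1))
W_v = rng.standard_normal((rank, d_out, d_v + 1))
h_low = sum((W_a[i] @ z_a) * (W_v[i] @ z_v) for i in range(rank))
```

The low-rank path stores `rank * d_out * (d_a + d_v + 2)` weights instead of `d_out * (d_a + 1) * (d_v + 1)`, which is the source of the efficiency gain the abstract refers to.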


2021 ◽  
Vol 3 ◽  
Author(s):  
Juan Song ◽  
Jian Zheng ◽  
Ping Li ◽  
Xiaoyuan Lu ◽  
Guangming Zhu ◽  
...  

Alzheimer's disease (AD) is an irreversible brain disease that severely damages human thinking and memory. Early diagnosis plays an important part in the prevention and treatment of AD. Neuroimaging-based computer-aided diagnosis (CAD) has shown that deep learning methods using multimodal images are beneficial to guide AD detection. In recent years, many methods based on multimodal feature learning have been proposed to extract and fuse latent representation information from different neuroimaging modalities including magnetic resonance imaging (MRI) and 18-fluorodeoxyglucose positron emission tomography (FDG-PET). However, these methods lack the interpretability required to clearly explain the specific meaning of the extracted information. To make the multimodal fusion process more persuasive, we propose an image fusion method to aid AD diagnosis. Specifically, we fuse the gray matter (GM) tissue area of brain MRI and FDG-PET images by registration and mask coding to obtain a new fused modality called “GM-PET.” The resulting single composite image emphasizes the GM area that is critical for AD diagnosis, while retaining both the contour and metabolic characteristics of the subject's brain tissue. In addition, we use the three-dimensional simple convolutional neural network (3D Simple CNN) and 3D Multi-Scale CNN to evaluate the effectiveness of our image fusion method in binary classification and multi-classification tasks. Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset indicate that the proposed image fusion method achieves better overall performance than unimodal and feature fusion methods, and that it outperforms state-of-the-art methods for AD diagnosis.
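A toy sketch of the mask-coding idea, on synthetic 2D arrays standing in for co-registered MRI and FDG-PET slices: PET intensities are kept inside a gray-matter mask and MRI intensities elsewhere, yielding one composite image. The intensity-threshold mask and the blending rule are illustrative assumptions; the paper's actual pipeline uses registration and tissue segmentation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 2D stand-ins for a co-registered MRI slice and FDG-PET slice.
mri = rng.random((64, 64))
pet = rng.random((64, 64))

# Hypothetical binary gray-matter mask (here just an intensity threshold).
gm_mask = (mri > 0.6).astype(float)

# Mask coding: PET metabolic values inside GM, MRI structure elsewhere,
# producing a single "GM-PET"-style composite image.
gm_pet = gm_mask * pet + (1.0 - gm_mask) * mri
```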


2018 ◽  
Vol 2018 ◽  
pp. 1-12
Author(s):  
Zhendong Wu ◽  
Jiajia Yang ◽  
Jianwu Zhang ◽  
Hengli Yue

Single-modal biometric methods have been widely used in wireless multimedia authentication. However, they are vulnerable to spoofing and offer limited accuracy. To tackle this challenge, in this paper we propose a multimodal fusion method for fingerprint and voiceprint based on a dynamic Bayesian method, which takes full advantage of the specific features extracted by each single biometric modality and authenticates users at the decision level. We demonstrate that this method can be extended to additional biometric modalities and can achieve flexible authentication accuracy. Experiments show that the recognition rate and stability are greatly improved, by 4.46% and 5.94%, respectively, over the unimodal methods; the recognition rate is also 1.94% higher than that of general multimodal biometric fusion methods.
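The decision-level combination can be sketched as a posterior-odds product under a conditional-independence assumption: each matcher emits a probability that the sample is genuine, and the fused posterior multiplies the per-modality odds. The scores and the static prior are hypothetical; the paper's dynamic Bayesian model also adapts over time, which this sketch omits.

```python
def fuse_odds(p_finger: float, p_voice: float, prior: float = 0.5) -> float:
    """Fuse per-modality genuine probabilities into one posterior.

    Assumes the two matchers are conditionally independent and that each
    per-modality probability was computed under the same prior.
    """
    odds = (p_finger / (1 - p_finger)) \
         * (p_voice / (1 - p_voice)) \
         * (prior / (1 - prior))
    return odds / (1 + odds)

# Hypothetical matcher outputs: fingerprint says 0.90 genuine, voice says 0.80.
posterior = fuse_odds(0.90, 0.80)   # 36/37, about 0.973
accept = posterior > 0.5
```

Two moderately confident matchers reinforce each other into a much more confident fused decision, which is the intuition behind decision-level fusion beating either modality alone.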


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Kang Liu ◽  
Xin Gao

The use of multimodal sensors for lane line segmentation is a growing trend. To achieve robust multimodal fusion, we introduce a new multimodal fusion method and demonstrate its effectiveness in an improved fusion network. Specifically, a multiscale fusion module is proposed to extract effective features from data of different modalities, and a channel attention module adaptively computes the contribution of each fused feature channel. We verified the effect of multimodal fusion on the KITTI benchmark and A2D2 datasets and demonstrated the effectiveness of the proposed method on the enhanced KITTI dataset. Our method achieves robust lane line segmentation, with 4.53% higher precision than direct fusion, and obtains the highest F2 score of 79.72%. We believe our method introduces a structure-level optimization idea for multimodal data fusion.
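A channel attention module of the kind described can be sketched as a squeeze-and-excitation-style gate over the fused feature map: global average pooling per channel, a small bottleneck, and a sigmoid that rescales each channel by its computed contribution. Dimensions and weights here are random placeholders, not the paper's trained module.

```python
import numpy as np

rng = np.random.default_rng(2)

C, H, W, r = 16, 8, 8, 4          # channels, spatial dims, reduction ratio
fused = rng.standard_normal((C, H, W))   # stand-in for fused multimodal features

# Squeeze: one summary value per channel via global average pooling.
s = fused.mean(axis=(1, 2))              # shape (C,)

# Excite: bottleneck MLP with a sigmoid gate (hypothetical random weights).
W1 = rng.standard_normal((C // r, C))
W2 = rng.standard_normal((C, C // r))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

gate = sigmoid(W2 @ np.maximum(W1 @ s, 0.0))   # per-channel contribution in (0, 1)

# Reweight: scale every channel of the fused map by its gate value.
reweighted = fused * gate[:, None, None]
```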


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jie Shan ◽  
Muhammad Talha

This article applies a multimodal smart music online teaching method combined with artificial intelligence to the problem of smart music online teaching, compensating for the shortcomings of single-modal classification methods that use only audio features. The selection of music intelligence models and classification models, as well as the analysis and processing of music characteristics, are the subjects of this article. It mainly studies how to use lyrics, and how to combine audio and lyrics, to classify music intelligently and to teach multimodal and monomodal smart music online. For lyrics-based online teaching of smart music, three parameters (frequency, concentration, and dispersion) are introduced on top of the traditional wireless network node feature selection method to adjust the statistical value of wireless network nodes, and an improved wireless network node feature selection method is proposed. After feature selection, the TF-IDF method is used to compute term weights, and artificial intelligence is then used to perform a secondary dimensionality reduction on the lyrics. Experimental data show that, when classifying lyrics intelligently, the accuracy of the traditional wireless network node feature selection method is 58.20%, the accuracy of the improved method is 67.21%, and the accuracy of the improved method combined with artificial intelligence is 69.68%. The third method thus has higher accuracy and lower dimensionality. For multimodal smart music online teaching based on audio and lyrics, this article improves the traditional fusion method for multimodal fusion and compares various fusion methods experimentally.
The experimental results show that the improved fusion method achieves the best classification effect, reaching 84.43%, which verifies the feasibility and effectiveness of the method.
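The TF-IDF weighting step works as below on toy lyric fragments (the fragments and tokenization are invented for illustration): a term frequent in one lyric but rare across the corpus receives a high weight, while a term appearing in every lyric receives zero.

```python
import math
from collections import Counter

# Toy tokenized lyric "documents" (hypothetical); in the paper, TF-IDF is
# applied after the feature selection step described above.
docs = [
    "love night love dance".split(),
    "night rain blues".split(),
    "dance beat beat night".split(),
]

def tfidf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    """Classic TF-IDF: term frequency times log inverse document frequency."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(term in d for d in corpus)     # documents containing the term
    idf = math.log(len(corpus) / df)
    return tf * idf

w_love = tfidf("love", docs[0], docs)   # frequent here, rare elsewhere: high
w_night = tfidf("night", docs[0], docs) # appears in every doc: weight 0.0
```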


2002 ◽  
Vol 18 (1) ◽  
pp. 78-84 ◽  
Author(s):  
Eva Ullstadius ◽  
Jan-Eric Gustafsson ◽  
Berit Carlstedt

Summary: Vocabulary tests, part of most test batteries of general intellectual ability, measure both verbal and general ability. Newly developed techniques for confirmatory factor analysis of dichotomous variables make it possible to analyze the influence of different abilities on the performance on each item. In the testing procedure of the Computerized Swedish Enlistment test battery, eight different subtests of a new vocabulary test were given randomly to subsamples of a representative sample of 18-year-old male conscripts (N = 9001). Three central dimensions of a hierarchical model of intellectual abilities, general (G), verbal (Gc'), and spatial ability (Gv') were estimated under different assumptions of the nature of the data. In addition to an ordinary analysis of covariance matrices, assuming linearity of relations, the item variables were treated as categorical variables in the Mplus program. All eight subtests fit the hierarchical model, and the items were found to load about equally on G and Gc'. The results also indicate that if nonlinearity is not taken into account, the G loadings for the easy items are underestimated. These items, moreover, appear to be better measures of G than the difficult ones. The practical utility of the outcome for item selection and the theoretical implications for the question of the origin of verbal ability are discussed.


Author(s):  
Jon Andoni Duñabeitia ◽  
Manuel Perea ◽  
Manuel Carreiras

One essential issue for models of bilingual memory organization is to what degree the representation from one of the languages is shared with the other language. In this study, we examine whether there is a symmetrical translation priming effect with highly proficient, simultaneous bilinguals. We conducted a masked priming lexical decision experiment with cognate and noncognate translation equivalents. Results showed a significant masked translation priming effect for both cognates and noncognates, with a greater priming effect for cognates. Furthermore, the magnitude of the translation priming was similar in the two directions. Thus, highly fluent bilinguals do develop symmetrical between-language links, as predicted by the Revised Hierarchical model and the BIA+ model. We examine the implications of these results for models of bilingual memory.


2008 ◽  
Vol 9 (3) ◽  
pp. 154-166 ◽  
Author(s):  
Joseph M. Barron ◽  
Cindy Struckman-Johnson ◽  
Randal Quevillon ◽  
Sarah R. Banka
Keyword(s):  
Gay Men ◽  

2003 ◽  
Author(s):  
Lacey L. Schmidt ◽  
JoAnna Wood ◽  
Peter Sullivan
