Multimodal Classification: Recently Published Documents

TOTAL DOCUMENTS: 84 (five years: 35)
H-INDEX: 13 (five years: 3)

2022
Author(s): Charles A Ellis, Mohammad SE Sendi, Rongen Zhang, Darwin A Carbajal, May D Wang, ...

Multimodal classification is increasingly common in biomedical informatics studies. Many such studies use deep learning classifiers with raw data, which makes explainability difficult. As such, only a few studies have applied explainability methods, and new methods are needed. In this study, we propose sleep stage classification as a testbed for method development and train a convolutional neural network with electroencephalogram (EEG), electrooculogram, and electromyogram data. We then present a global approach that is uniquely adapted for electrophysiology analysis. We further present two local approaches that can identify subject-level differences in explanations that would be obscured by global methods, and that can provide insight into the effects of clinical and demographic variables upon the patterns learned by the classifier. We find that EEG is globally the most important modality for all sleep stages except non-rapid eye movement stage 1, and that local subject-level differences in importance arise. We further show that sex, followed by medication and age, had significant effects upon the patterns learned by the classifier. Our novel methods enhance explainability for the growing field of multimodal classification, provide avenues for the advancement of personalized medicine, and yield novel insights into the effects of demographic and clinical variables upon classifiers.
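A minimal sketch of the kind of global ablation analysis described here follows: each modality is zeroed in turn and the resulting drop in accuracy is taken as its global importance. The `model` interface, the channel-to-modality mapping, and the zero-signal baseline are assumptions for illustration, not the authors' exact perturbation scheme.

```python
# Global modality ablation for a multimodal sleep-stage classifier (sketch).
import torch

MODALITIES = {"EEG": [0], "EOG": [1], "EMG": [2]}  # assumed channel indices

def global_modality_importance(model, x, y):
    """Drop in accuracy when one modality is zeroed out.

    x: (n_samples, n_channels, n_timesteps) tensor, y: (n_samples,) labels.
    """
    model.eval()
    with torch.no_grad():
        base_acc = (model(x).argmax(dim=1) == y).float().mean().item()
        importance = {}
        for name, chans in MODALITIES.items():
            x_ablated = x.clone()
            x_ablated[:, chans, :] = 0.0  # assumed baseline: zeroed signal
            acc = (model(x_ablated).argmax(dim=1) == y).float().mean().item()
            importance[name] = base_acc - acc  # larger drop = more important
    return importance
```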


2021, pp. 1-11
Author(s): Nan Su, Zhishuo Lin, Wenlong You, Nan Zheng, Kun Ma

Garbage classification management is a general term for the activities that sort, store, and transport garbage into public resources according to certain regulations or standards. Current garbage classification systems have several drawbacks, such as the inability to identify multiple garbage categories and a high dependence on the surrounding environment. To address these issues, this paper proposes the Real-Time Multi-Modal Garbage Classification System (abbreviated as RMGCS). It consists of two subsystems: an indoor garbage classification applet (abbreviated as IGCA) and an outdoor garbage classification system (abbreviated as OGCS). IGCA provides users with three methods of garbage classification, while OGCS provides outdoor real-time multi-target garbage classification and can dynamically update its recognition model. RMGCS thus achieves real-time, accurate, multimodal classification. Finally, experiments with RMGCS show that our approaches are effective and efficient.
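A minimal sketch of one way OGCS's dynamic model update could work follows: a classifier that hot-swaps a newly trained model without stopping inference. The TorchScript format, the file paths, and the locking scheme are assumptions; the paper does not specify the update mechanism.

```python
# Hot-swapping a recognition model at runtime (illustrative sketch only).
import threading
import torch

class HotSwappableClassifier:
    def __init__(self, model_path: str):
        self._lock = threading.Lock()
        self._model = torch.jit.load(model_path).eval()

    def reload(self, model_path: str) -> None:
        """Swap in a newly trained model without interrupting inference."""
        new_model = torch.jit.load(model_path).eval()
        with self._lock:
            self._model = new_model

    def predict(self, frame: torch.Tensor) -> int:
        """Classify a single preprocessed camera frame."""
        with self._lock:
            logits = self._model(frame.unsqueeze(0))
        return int(logits.argmax(dim=1))
```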


2021
Author(s): Qipin Chen, Zhenyu Shi, Zhen Zuo, Jinmiao Fu, Yi Sun

Sensors, 2021, Vol. 21 (12), pp. 4133
Author(s): Farnoosh Heidarivincheh, Ryan McConville, Catherine Morgan, Roisin McNaney, Alessandro Masullo, ...

Parkinson’s disease (PD) is a chronic neurodegenerative condition that affects a patient’s everyday life. Authors have proposed that a machine learning, sensor-based approach that continuously monitors patients in naturalistic settings could provide constant evaluation of PD and objectively analyse its progression. In this paper, we make progress toward such PD evaluation by presenting a multimodal deep learning approach for discriminating between people with and without PD. Specifically, our proposed architecture, named MCPD-Net, uses two data modalities, acquired from vision and accelerometer sensors in a home environment, to train variational autoencoder (VAE) models. These modality-specific VAEs predict effective representations of human movements, which are fused and passed to a classification module. During end-to-end training, we minimise the difference between the latent spaces corresponding to the two data modalities, which makes our method capable of dealing with missing modalities during inference. We show that our proposed multimodal method outperforms unimodal and other multimodal approaches by an average increase in F1-score of 0.25 and 0.09, respectively, on a data set with real patients. We also show that our method still outperforms other approaches by an average increase in F1-score of 0.17 when a modality is missing during inference, demonstrating the benefit of training on multiple modalities.
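The sketch below illustrates the core MCPD-Net idea as described above: one VAE per modality, a penalty that pulls the two latent spaces together, and a classifier on the fused latents. The layer sizes, the MSE alignment term, and the loss weighting are assumptions, not the published architecture.

```python
# Two modality-specific VAEs with latent-space alignment (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityVAE(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int = 32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(z), mu, logvar

def mcpd_loss(vae_v, vae_a, clf, x_vision, x_accel, y, align_weight=1.0):
    """Reconstruction + KL + classification + latent alignment (assumed form).

    clf takes the concatenated latent means, i.e. input size 2 * latent_dim.
    """
    recon_v, mu_v, lv_v = vae_v(x_vision)
    recon_a, mu_a, lv_a = vae_a(x_accel)
    recon = F.mse_loss(recon_v, x_vision) + F.mse_loss(recon_a, x_accel)
    kl = -0.5 * (1 + lv_v - mu_v.pow(2) - lv_v.exp()).mean() \
         - 0.5 * (1 + lv_a - mu_a.pow(2) - lv_a.exp()).mean()
    align = F.mse_loss(mu_v, mu_a)  # pull the two latent spaces together
    logits = clf(torch.cat([mu_v, mu_a], dim=1))
    return F.cross_entropy(logits, y) + recon + kl + align_weight * align
```

Because the two latent spaces are trained to coincide, a missing modality at inference time can be approximated by reusing the available modality's latent in both slots of the fused representation.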


2021
Author(s): Charles A Ellis, Darwin A Carbajal, Rongen Zhang, Mohammad S. E. Sendi, Robyn L Miller, ...

With the growing use of multimodal data for deep learning classification in healthcare research, more studies have begun to present explainability methods for insight into multimodal classifiers. Among these studies, few have utilized local explainability methods, which could provide (1) insight into the classification of each sample and (2) an opportunity to better understand the effects of latent variables within datasets (e.g., medication of subjects in electrophysiology data). To the best of our knowledge, this opportunity has not yet been explored within multimodal classification. We present a novel local ablation approach that shows the importance of each modality to the correct classification of each class and explore the effects of latent variables upon the classifier. As a use-case, we train a convolutional neural network for automated sleep staging with electroencephalogram (EEG), electrooculogram (EOG), and electromyogram (EMG) data. We find that EEG is the most important modality across most stages, though EOG is particularly important for non-rapid eye movement stage 1. Further, we identify significant relationships between the local explanations and subject age, sex, and state of medication, which suggest that the classifier learned specific features associated with these variables across multiple modalities and correctly classified samples. Our novel explainability approach has implications for many fields involving multimodal classification. Moreover, our examination of the degree to which demographic and clinical variables may affect classifiers could provide direction for future studies in automated biomarker discovery.
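A minimal sketch of per-sample (local) modality ablation in this spirit: the change in the predicted class probability when each modality is zeroed, yielding a sample-level importance matrix that can then be related to age, sex, or medication. The channel indices and the zero baseline are illustrative assumptions, not the authors' exact formulation.

```python
# Local (per-sample) modality ablation for a multimodal classifier (sketch).
import torch

def local_ablation(model, x, modality_channels):
    """Return an (n_samples, n_modalities) importance matrix.

    x: (n_samples, n_channels, n_timesteps) tensor.
    modality_channels: channel indices per modality, e.g. [[0], [1], [2]].
    """
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)
        pred = probs.argmax(dim=1)                     # predicted class per sample
        base = probs[torch.arange(len(x)), pred]       # its probability
        scores = []
        for chans in modality_channels:
            x_abl = x.clone()
            x_abl[:, chans, :] = 0.0                   # ablate one modality
            p = torch.softmax(model(x_abl), dim=1)
            scores.append(base - p[torch.arange(len(x)), pred])
    # Positive entries: the modality supported the predicted class.
    return torch.stack(scores, dim=1)
```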


Sensors, 2021, Vol. 21 (8), pp. 2720
Author(s): Abdelrahman Ahmed, Khaled Shaalan, Sergio Toral, Yasser Hifny

The paper proposes three modeling techniques to improve the performance evaluation of call center agents. The first technique is speech processing supported by an attention layer over the agent's recorded calls; 65 speech features, extracted with the Open-Smile toolkit, are used to determine the context of the call. The second technique uses the Max Weights Similarity (MWS) approach instead of the Softmax function in the attention layer to improve classification accuracy. The MWS function replaces Softmax to fine-tune the output of the attention layer when processing text; it is formed by measuring how close each input weight of the attention layer is to the weights of the max vectors. The third technique combines the agent's recorded call speech with the corresponding transcribed text for binary classification. Both the speech and text models are built from combinations of Convolutional Neural Networks (CNNs) and Bi-directional Long Short-Term Memory networks (BiLSTMs). The classification results for each model (text versus speech) are reported and compared with the multimodal approach's results: multimodal classification improved accuracy by 0.22% over the acoustic model and by 1.7% over the text model.
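The sketch below shows one plausible reading of MWS, in which each position is weighted by its closeness to the maximum attention score and the weights are then normalized. The exact formulation in the paper may differ; this only illustrates the "similarity to the max" idea in place of Softmax.

```python
# One plausible Max Weights Similarity (MWS) weighting (assumed form).
import torch

def mws_weights(scores: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """scores: (batch, seq_len) raw attention scores -> normalized weights."""
    max_score = scores.max(dim=1, keepdim=True).values
    # Similarity = inverse of the distance to the max-scoring position.
    sim = 1.0 / (eps + (max_score - scores).abs())
    return sim / sim.sum(dim=1, keepdim=True)  # weights sum to 1 per sequence

# Usage in an attention layer, replacing softmax(scores):
#   context = (mws_weights(scores).unsqueeze(1) @ values).squeeze(1)
```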

