Synch-Graph: Multisensory Emotion Recognition Through Neural Synchrony via Graph Convolutional Networks

2020 ◽  
Vol 34 (02) ◽  
pp. 1351-1358
Author(s):  
Esma Mansouri-Benssassi ◽  
Juan Ye

Human emotions are essentially multisensory: emotional states are conveyed through multiple modalities such as facial expression, body language, and non-verbal and verbal signals. Multimodal or multisensory learning is therefore crucial for recognising emotions and interpreting social signals. Existing multisensory emotion recognition approaches focus on extracting features from each modality separately, ignoring the importance of constant interaction and co-learning between modalities. In this paper, we present Synch-Graph, a novel bio-inspired approach based on neural synchrony in audio-visual multisensory integration in the brain. We model multisensory interaction using spiking neural networks (SNN) and explore the use of Graph Convolutional Networks (GCN) to represent and learn neural synchrony patterns. We hypothesise that modelling interactions between modalities will improve the accuracy of emotion recognition. We have evaluated Synch-Graph on two state-of-the-art datasets and achieved overall accuracies of 98.3% and 96.82%, significantly higher than those of existing techniques.
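
A minimal sketch of the kind of graph-convolution step such an approach could build on, assuming the pairwise synchrony matrix between neuron populations serves directly as a weighted adjacency; the node features, weights, and dimensions below are placeholders rather than the paper's actual configuration.

```python
# Illustrative only: one GCN propagation step H' = relu(D^-1/2 (A + I) D^-1/2 H W)
# over a hypothetical "synchrony graph". A, H, and W are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)

n_nodes, in_dim, out_dim = 6, 8, 4
A = rng.uniform(0.0, 1.0, size=(n_nodes, n_nodes))   # hypothetical pairwise synchrony scores
A = (A + A.T) / 2                                     # synchrony is symmetric
H = rng.normal(size=(n_nodes, in_dim))                # placeholder node features
W = rng.normal(size=(in_dim, out_dim)) * 0.1          # layer weights

A_hat = A + np.eye(n_nodes)                           # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
H_next = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)  # ReLU activation

print(H_next.shape)  # (6, 4): updated node embeddings for a downstream emotion classifier
```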

2020 ◽  
Vol 2020 ◽  
pp. 1-19
Author(s):  
Nazmi Sofian Suhaimi ◽  
James Mountstephens ◽  
Jason Teo

Emotions are fundamental for human beings and play an important role in human cognition. Emotion is commonly associated with logical decision making, perception, human interaction, and, to a certain extent, human intelligence itself. With the research community's growing interest in establishing meaningful "emotional" interactions between humans and computers, reliable and deployable solutions for identifying human emotional states are required. Recent developments in using electroencephalography (EEG) for emotion recognition have garnered strong interest from the research community, as the latest consumer-grade wearable EEG solutions can provide a cheap, portable, and simple means of identifying emotions. Since the last comprehensive review covered the years 2009 to 2016, this paper updates the field on the progress of emotion recognition using EEG signals from 2016 to 2019. This state-of-the-art review focuses on emotion stimuli type and presentation approach, study size, EEG hardware, machine learning classifiers, and classification approach. From this review, we suggest several future research opportunities, including a different approach to presenting stimuli in the form of virtual reality (VR). To this end, an additional section devoted specifically to reviewing VR studies within this research domain is presented as motivation for the proposed use of VR as the stimuli presentation device. This review is intended to be useful for the research community working on emotion recognition using EEG signals as well as for those venturing into this field of research.
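
As an illustration of the classical pipeline pattern that much of the reviewed EEG literature follows (band-power features fed to a conventional classifier), here is a minimal sketch on synthetic data; the sampling rate, frequency bands, and classifier are generic assumptions, not any particular study's protocol.

```python
# Generic EEG emotion-classification sketch: Welch band powers per channel + SVM.
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC

fs = 128                                              # Hz, typical of consumer-grade EEG
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(trial):
    """trial: (n_channels, n_samples) -> flat vector of per-channel band powers."""
    feats = []
    for ch in trial:
        f, pxx = welch(ch, fs=fs, nperseg=fs * 2)
        for lo, hi in bands.values():
            feats.append(pxx[(f >= lo) & (f < hi)].mean())
    return np.array(feats)

rng = np.random.default_rng(1)
trials = rng.normal(size=(40, 14, fs * 5))            # 40 synthetic 5-second, 14-channel trials
labels = rng.integers(0, 2, size=40)                  # placeholder valence labels

X = np.stack([band_powers(t) for t in trials])
clf = SVC(kernel="rbf").fit(X[:30], labels[:30])
print("held-out accuracy:", clf.score(X[30:], labels[30:]))
```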


2021 ◽  
Vol 11 (15) ◽  
pp. 6975
Author(s):  
Tao Zhang ◽  
Lun He ◽  
Xudong Li ◽  
Guoqing Feng

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved high accuracy on large datasets and made breakthrough progress. However, lipreading is still far from solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing training gradients and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model that uses an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. It partly eliminates the vanishing-gradient and performance limitations of RNNs (LSTM, GRU), yielding notable performance improvements as well as faster convergence. Experiments show that training and convergence are 50% faster than the state-of-the-art method, and accuracy is improved by 2.4% on the GRID dataset.
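
A minimal PyTorch sketch of the overall pipeline shape described above (a 3D convolutional front-end, a dilated temporal convolution stack standing in for the TCN, and a CTC objective); the ResNet50 trunk is omitted and all sizes are placeholders, not the paper's configuration.

```python
# Toy sentence-level lipreading skeleton: 3D conv front-end -> temporal convs -> CTC.
import torch
import torch.nn as nn

class TinyLipreader(nn.Module):
    def __init__(self, vocab_size=28, hidden=64):
        super().__init__()
        self.front = nn.Sequential(                       # spatio-temporal front-end
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),           # keep time, pool away space
        )
        self.tcn = nn.Sequential(                         # two dilated temporal conv blocks
            nn.Conv1d(32, hidden, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(hidden, hidden, 3, padding=2, dilation=2), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, vocab_size + 1)     # +1 for the CTC blank symbol

    def forward(self, video):                             # video: (B, 3, T, H, W)
        x = self.front(video).squeeze(-1).squeeze(-1)     # (B, 32, T)
        x = self.tcn(x).transpose(1, 2)                   # (B, T, hidden)
        return self.head(x).log_softmax(-1)               # (B, T, vocab+1)

model = TinyLipreader()
video = torch.randn(2, 3, 40, 64, 64)                     # 2 clips of 40 frames each
log_probs = model(video).transpose(0, 1)                  # CTC expects (T, B, C)
targets = torch.randint(1, 28, (2, 10))                   # placeholder character labels
loss = nn.CTCLoss(blank=28)(log_probs, targets,
                            torch.tensor([40, 40]), torch.tensor([10, 10]))
print(loss.item())
```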


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4233
Author(s):  
Bogdan Mocanu ◽  
Ruxandra Tapu ◽  
Titus Zaharia

Emotion is a form of high-level paralinguistic information that is intrinsically conveyed by human speech. Automatic speech emotion recognition is an essential challenge for various applications, including mental disease diagnosis, audio surveillance, human behavior understanding, e-learning, and human–machine/robot interaction. In this paper, we introduce a novel speech emotion recognition method based on the Squeeze-and-Excitation ResNet (SE-ResNet) model fed with spectrogram inputs. In order to overcome the limitations of state-of-the-art techniques, which fail to provide a robust feature representation at the utterance level, the CNN architecture is extended with a trainable discriminative GhostVLAD clustering layer that aggregates the audio features into a compact, single-utterance vector representation. In addition, an end-to-end neural embedding approach is introduced, based on an emotionally constrained triplet loss function. The loss function integrates the relations between the various emotional patterns and thus improves the latent space data representation. The proposed methodology achieves 83.35% and 64.92% global accuracy rates on the publicly available RAVDESS and CREMA-D datasets, respectively. When compared with the results provided by human observers, the gains in global accuracy exceed 24%. Finally, an objective comparative evaluation against state-of-the-art techniques demonstrates accuracy gains of more than 3%.
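
A minimal sketch of a plain triplet margin loss over utterance-level embeddings, to illustrate the general objective; the emotionally constrained weighting that is the paper's contribution is not reproduced, and the linear embedder below merely stands in for the SE-ResNet + GhostVLAD pipeline.

```python
# Generic triplet margin loss on utterance embeddings (anchor/positive share an
# emotion label, negative has a different one). Dimensions are placeholders.
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Pull same-emotion utterances together, push different emotions apart."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

emb = torch.nn.Linear(128, 64)                          # placeholder utterance embedder
a, p, n = (emb(torch.randn(8, 128)) for _ in range(3))  # batch of 8 random triplets
print(triplet_loss(a, p, n).item())
```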


Cybersecurity ◽  
2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Shushan Arakelyan ◽  
Sima Arasteh ◽  
Christophe Hauser ◽  
Erik Kline ◽  
Aram Galstyan

Tackling binary program analysis problems has traditionally implied manually defining rules and heuristics, a tedious and time-consuming task for human analysts. In order to improve automation and scalability, we propose an alternative direction based on distributed representations of binary programs with applicability to a number of downstream tasks. We introduce Bin2vec, a new approach leveraging Graph Convolutional Networks (GCN) along with computational program graphs in order to learn a high-dimensional representation of binary executable programs. We demonstrate the versatility of this approach by using our representations to solve two semantically different binary analysis tasks: functional algorithm classification and vulnerability discovery. We compare the proposed approach to our own strong baseline as well as to published results, and demonstrate improvement over state-of-the-art methods for both tasks. We evaluated Bin2vec on 49,191 binaries for the functional algorithm classification task, and on 30 different CWE-IDs, each with at least 100 CVE entries, for the vulnerability discovery task. We set a new state-of-the-art result by reducing the classification error by 40% compared to the source-code-based inst2vec approach, while working on binary code. For almost every vulnerability class in our dataset, our prediction accuracy is over 80% (and over 90% for multiple classes).
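
A minimal sketch of graph-level classification in this spirit: node features propagated over a toy program graph, mean-pooled into a single graph embedding, and scored by a linear classifier; the edge list, features, and weights are placeholders, not Bin2vec's actual graph construction.

```python
# Toy whole-graph classification: one mean-aggregation layer, then a mean readout.
import numpy as np

rng = np.random.default_rng(2)

edges = [(0, 1), (1, 2), (2, 3), (3, 1), (2, 4)]       # hypothetical control-flow edges
n, d, n_classes = 5, 16, 2

A = np.eye(n)                                           # self-loops
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
D_inv = np.diag(1.0 / A.sum(axis=1))

H = rng.normal(size=(n, d))                             # placeholder per-node features
W1 = rng.normal(size=(d, d)) * 0.1
W_cls = rng.normal(size=(d, n_classes)) * 0.1

H = np.maximum(D_inv @ A @ H @ W1, 0.0)                 # one GCN-style neighbourhood average
graph_emb = H.mean(axis=0)                              # whole-program readout
logits = graph_emb @ W_cls
print("class scores:", logits)                          # e.g. vulnerable vs. benign
```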


Author(s):  
Tanaz Molapour ◽  
Cindy C Hagan ◽  
Brian Silston ◽  
Haiyan Wu ◽  
Maxwell Ramstead ◽  
...  

The social environment presents the human brain with the most complex of information-processing demands. The computations that the brain must perform occur in parallel, combine social and nonsocial cues, produce verbal and non-verbal signals, and involve multiple cognitive systems, including memory, attention, emotion, and learning. This occurs dynamically and at timescales ranging from milliseconds to years. Here, we propose that during social interactions, seven core operations interact to underwrite coherent social functioning; these operations accumulate evidence efficiently, from multiple modalities, when inferring what to do next. We deconstruct the social brain and outline the key components entailed for successful human social interaction. These include (1) social perception; (2) social inferences, such as mentalizing; (3) social learning; (4) social signaling through verbal and non-verbal cues; (5) social drives (e.g., how to increase one’s status); (6) determining the social identity of agents, including oneself; and (7) minimizing uncertainty within the current social context by integrating sensory signals and inferences. We argue that while it is important to examine these distinct aspects of social inference, to understand the true nature of the human social brain we must also explain how the brain integrates information from the social world.


2021 ◽  
Vol 376 (1821) ◽  
pp. 20190765 ◽  
Author(s):  
Giovanni Pezzulo ◽  
Joshua LaPalme ◽  
Fallon Durant ◽  
Michael Levin

Nervous systems’ computational abilities are an evolutionary innovation, specializing and speed-optimizing ancient biophysical dynamics. Bioelectric signalling originated in cells' communication with the outside world and with each other, enabling cooperation towards adaptive construction and repair of multicellular bodies. Here, we review the emerging field of developmental bioelectricity, which links the field of basal cognition to state-of-the-art questions in regenerative medicine, synthetic bioengineering and even artificial intelligence. One of the predictions of this view is that regeneration and regulative development can restore correct large-scale anatomies from diverse starting states because, like the brain, they exploit bioelectric encoding of distributed goal states—in this case, pattern memories. We propose a new interpretation of recent stochastic regenerative phenotypes in planaria, by appealing to computational models of memory representation and processing in the brain. Moreover, we discuss novel findings showing that bioelectric changes induced in planaria can be stored in tissue for over a week, thus revealing that somatic bioelectric circuits in vivo can implement a long-term, re-writable memory medium. A consideration of the mechanisms, evolution and functionality of basal cognition makes novel predictions and provides an integrative perspective on the evolution, physiology and biomedicine of information processing in vivo. This article is part of the theme issue ‘Basal cognition: multicellularity, neurons and the cognitive lens’.


2016 ◽  
Vol 371 (1688) ◽  
pp. 20150106 ◽  
Author(s):  
Margaret M. McCarthy

Studies of sex differences in the brain range from reductionistic cell and molecular analyses in animal models to functional imaging in awake human subjects, with many other levels in between. Interpretations and conclusions about the importance of particular differences often vary with differing levels of analyses and can lead to discord and dissent. In the past two decades, the range of neurobiological, psychological and psychiatric endpoints found to differ between males and females has expanded beyond reproduction into every aspect of the healthy and diseased brain, and thereby demands our attention. A greater understanding of all aspects of neural functioning will only be achieved by incorporating sex as a biological variable. The goal of this review is to highlight the current state of the art of the discipline of sex differences research with an emphasis on the brain and to contextualize the articles appearing in the accompanying special issue.


Author(s):  
Miao Cheng ◽  
Ah Chung Tsoi

As a general means of expression, audio has attracted much attention for analysis and recognition because of its wide applications in the real world. Audio emotion recognition (AER) attempts to understand the emotional state of a human from a given utterance signal, and has been studied broadly for its role in friendly human–machine interfaces. Though several state-of-the-art auditory methods have been devised for audio recognition, most of them focus on the discriminative use of acoustic features, while the efficiency demanded for responsive recognition is ignored. This limits the practical application of AER, where rapid learning of emotion patterns is desired. In order to make prediction of audio emotion practical, the speaker-dependent patterns of audio emotions are learned with multiresolution analysis, and fractal dimension (FD) features are calculated for acoustic feature extraction. This makes it possible to efficiently learn the intrinsic characteristics of auditory emotions, while the utterance features are learned from the FDs of each sub-band. Experimental results show the proposed method is able to provide comparable performance for AER.
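
A minimal sketch of one common way to extract a fractal-dimension feature per audio sub-band (Higuchi's estimator over Butterworth-filtered bands); the paper's exact multiresolution decomposition and FD estimator are not specified here, so treat this as a generic stand-in.

```python
# Per-sub-band Higuchi fractal dimension features from a placeholder utterance.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def higuchi_fd(x, k_max=8):
    """Higuchi fractal dimension of a 1-D signal."""
    n = len(x)
    lengths, ks = [], np.arange(1, k_max + 1)
    for k in ks:
        lk = []
        for m in range(k):
            idx = np.arange(m, n, k)                  # subsampled curve starting at offset m
            if len(idx) < 2:
                continue
            curve = np.abs(np.diff(x[idx])).sum()
            lk.append(curve * (n - 1) / ((len(idx) - 1) * k) / k)
        lengths.append(np.mean(lk))
    # FD is the slope of log L(k) against log(1/k)
    return np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)[0]

fs = 16000
rng = np.random.default_rng(3)
utterance = rng.normal(size=fs)                       # 1 second of placeholder audio

features = []
for lo, hi in [(80, 500), (500, 2000), (2000, 7000)]: # hypothetical sub-band edges (Hz)
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    features.append(higuchi_fd(sosfiltfilt(sos, utterance)))
print("per-band FD features:", features)
```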


Author(s):  
Pengcheng Wang ◽  
Jonathan Rowe ◽  
Wookhee Min ◽  
Bradford Mott ◽  
James Lester

Interactive narrative planning offers significant potential for creating adaptive gameplay experiences. While data-driven techniques have been devised that utilize player interaction data to induce policies for interactive narrative planners, they require enormously large gameplay datasets. A promising approach to addressing this challenge is creating simulated players whose behaviors closely approximate those of human players. In this paper, we propose a novel approach to generating high-fidelity simulated players based on deep recurrent highway networks and deep convolutional networks. Empirical results demonstrate that the proposed models significantly outperform the prior state-of-the-art in generating high-fidelity simulated player models that accurately imitate human players’ narrative interactions. Using the high-fidelity simulated player models, we show the advantage of more exploratory reinforcement learning methods for deriving generalizable narrative adaptation policies.
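
A minimal sketch of a single recurrent highway step rolled out over a sequence of interaction features, in the spirit of a simulated player; the coupled-carry-gate form, dimensions, and five-action head are assumptions, not the paper's model.

```python
# Toy recurrent highway cell rolled out over placeholder narrative-interaction features.
import torch
import torch.nn as nn

class RHNCell(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.h = nn.Linear(in_dim + hid_dim, hid_dim)   # candidate state
        self.t = nn.Linear(in_dim + hid_dim, hid_dim)   # transform gate

    def forward(self, x, s):
        z = torch.cat([x, s], dim=-1)
        h, t = torch.tanh(self.h(z)), torch.sigmoid(self.t(z))
        return h * t + s * (1 - t)                      # coupled carry gate: c = 1 - t

cell, head = RHNCell(12, 32), nn.Linear(32, 5)          # 5 hypothetical player actions
state = torch.zeros(1, 32)
for step_features in torch.randn(20, 1, 12):            # 20 interaction steps
    state = cell(step_features, state)
print("action logits:", head(state))                    # scores for the next simulated action
```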

