scholarly journals Separable convolutional neural networks for facial expressions recognition

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Andry Chowanda

AbstractSocial interactions are important for us, humans, as social creatures. Emotions play an important part in social interactions. They usually express meanings along with the spoken utterances to the interlocutors. Automatic facial expressions recognition is one technique to automatically capture, recognise, and understand emotions from the interlocutor. Many techniques proposed to increase the accuracy of emotions recognition from facial cues. Architecture such as convolutional neural networks demonstrates promising results for emotions recognition. However, most of the current models of convolutional neural networks require an enormous computational power to train and process emotional recognition. This research aims to build compact networks with depthwise separable layers while also maintaining performance. Three datasets and three other similar architectures were used to be compared with the proposed architecture. The results show that the proposed architecture performed the best among the other architectures. It achieved up to 13% better accuracy and 6–71% smaller and more compact than the other architectures. The best testing accuracy achieved by the architecture was 99.4%.

2021 ◽  
Author(s):  
Andry Chowanda

Abstract Social interactions are important for us, human, as social creatures. Emotions play an important part in social interactions. They usually express meanings along with the spoken utterances to the interlocutors. Automatic facial expressions recognition is one technique to automatically capture, recognise, and understand emotions from the interlocutor. Many techniques proposed to increase the accuracy of emotions recognition from facial cues. Architecture such as convolutional neural networks demonstrates promising results for emotions recognition. However, most of the current models of convolutional neural networks require an enormous computational power to train and process emotional recognition. This research aims to build compact networks with depthwise separable layers while also maintaining performance. Three datasets and three other similar architectures were used to be compared to the proposed architecture. The results show that the proposed architecture performed the best among the other architectures. It achieved up to 13% better accuracy and 6-71% smaller and more compact than the other architectures. The best testing accuracy achieved by the architecture was 99.4%.


Author(s):  
G. Touya ◽  
F. Brisebard ◽  
F. Quinton ◽  
A. Courtial

Abstract. Visually impaired people cannot use classical maps but can learn to use tactile relief maps. These tactile maps are crucial at school to learn geography and history as well as the other students. They are produced manually by professional transcriptors in a very long and costly process. A platform able to generate tactile maps from maps scanned from geography textbooks could be extremely useful to these transcriptors, to fasten their production. As a first step towards such a platform, this paper proposes a method to infer the scale and the content of the map from its image. We used convolutional neural networks trained with a few hundred maps from French geography textbooks, and the results show promising results to infer labels about the content of the map (e.g. ”there are roads, cities and administrative boundaries”), and to infer the extent of the map (e.g. a map of France or of Europe).


2021 ◽  
Author(s):  
Kosuke Honda ◽  
Hamido Fujita

In recent years, template-based methods such as Siamese network trackers and Correlation Filter (CF) based trackers have achieved state-of-the-art performance in several benchmarks. Recent Siamese network trackers use deep features extracted from convolutional neural networks to locate the target. However, the tracking performance of these trackers decreases when there are similar distractors to the object and the target object is deformed. On the other hand, correlation filter (CF)-based trackers that use handcrafted features (e.g., HOG features) to spatially locate the target. These two approaches have complementary characteristics due to differences in learning methods, features used, and the size of search regions. Also, we found that these trackers are complementary in terms of performance in benchmarking. Therefore, we propose the “Complementary Tracking framework using Average peak-to-correlation energy” (CTA). CTA is the generic object tracking framework that connects CF-trackers and Siamese-trackers in parallel and exploits the complementary features of these. In CTA, when a tracking failure of the Siamese tracker is detected using Average peak-to-correlation energy (APCE), which is an evaluation index of the response map matrix, the CF-trackers correct the output. In experimental on OTB100, CTA significantly improves the performance over the original tracker for several combinations of Siamese-trackers and CF-rackers.


Human feelings are mental conditions of sentiments that emerge immediately as opposed to cognitive exertion. Some of the basic feelings are happy, angry, neutral, sad and surprise. These internal feelings of a person are reflected on the face as Facial Expressions. This paper presents a novel methodology for Facial Expression Analysis which will aid to develop a facial expression recognition system. This system can be used in real time to classify five basic emotions. The recognition of facial expressions is important because of its applications in many domains such as artificial intelligence, security and robotics. Many different approaches can be used to overcome the problems of Facial Expression Recognition (FER) but the best suited technique for automated FER is Convolutional Neural Networks(CNN). Thus, a novel CNN architecture is proposed and a combination of multiple datasets such as FER2013, FER+, JAFFE and CK+ is used for training and testing. This helps to improve the accuracy and develop a robust real time system. The proposed methodology confers quite good results and the obtained accuracy may give encouragement and offer support to researchers to build better models for Automated Facial Expression Recognition systems.


2021 ◽  
Vol 11 (24) ◽  
pp. 11738
Author(s):  
Thomas Teixeira ◽  
Éric Granger ◽  
Alessandro Lameiras Koerich

Facial expressions are one of the most powerful ways to depict specific patterns in human behavior and describe the human emotional state. However, despite the impressive advances of affective computing over the last decade, automatic video-based systems for facial expression recognition still cannot correctly handle variations in facial expression among individuals as well as cross-cultural and demographic aspects. Nevertheless, recognizing facial expressions is a difficult task, even for humans. This paper investigates the suitability of state-of-the-art deep learning architectures based on convolutional neural networks (CNNs) to deal with long video sequences captured in the wild for continuous emotion recognition. For such an aim, several 2D CNN models that were designed to model spatial information are extended to allow spatiotemporal representation learning from videos, considering a complex and multi-dimensional emotion space, where continuous values of valence and arousal must be predicted. We have developed and evaluated convolutional recurrent neural networks, combining 2D CNNs and long short term-memory units and inflated 3D CNN models, which are built by inflating the weights of a pre-trained 2D CNN model during fine-tuning, using application-specific videos. Experimental results on the challenging SEWA-DB dataset have shown that these architectures can effectively be fine-tuned to encode spatiotemporal information from successive raw pixel images and achieve state-of-the-art results on such a dataset.


Jurnal INFORM ◽  
2020 ◽  
Vol 5 (2) ◽  
pp. 99
Author(s):  
Andi Sanjaya ◽  
Endang Setyati ◽  
Herman Budianto

This research was conducted to observe the use of architectural model Convolutional Neural Networks (CNN) LeNEt, which was suitable to use for Pandava mask objects. The Data processing in the research was 200 data for each class or similar with 1000 trial data. Architectural model CNN LeNET used input layer 32x32, 64x64, 128x128, 224x224 and 256x256. The trial result with the input layer 32x32 succeeded, showing a faster time compared to the other layer. The result of accuracy value and validation was not under fitted or overfit. However, when the activation of the second dense process as changed from the relu to sigmoid, the result was better in sigmoid, in the tem of time, and the possibility of overfitting was less. The research result had a mean accuracy value of 0.96.


2021 ◽  
Vol 14 (2) ◽  
pp. 236-247
Author(s):  
Michael Thiruthuvanathan ◽  
◽  
Balachandran Krishnan ◽  
Madhavi Rangaswamy ◽  
◽  
...  

Coatings ◽  
2020 ◽  
Vol 10 (2) ◽  
pp. 152 ◽  
Author(s):  
Zhun Fan ◽  
Chong Li ◽  
Ying Chen ◽  
Paola Di Mascio ◽  
Xiaopeng Chen ◽  
...  

Automated pavement crack detection and measurement are important road issues. Agencies have to guarantee the improvement of road safety. Conventional crack detection and measurement algorithms can be extremely time-consuming and low efficiency. Therefore, recently, innovative algorithms have received increased attention from researchers. In this paper, we propose an ensemble of convolutional neural networks (without a pooling layer) based on probability fusion for automated pavement crack detection and measurement. Specifically, an ensemble of convolutional neural networks was employed to identify the structure of small cracks with raw images. Secondly, outputs of the individual convolutional neural network model for the ensemble were averaged to produce the final crack probability value of each pixel, which can obtain a predicted probability map. Finally, the predicted morphological features of the cracks were measured by using the skeleton extraction algorithm. To validate the proposed method, some experiments were performed on two public crack databases (CFD and AigleRN) and the results of the different state-of-the-art methods were compared. To evaluate the efficiency of crack detection methods, three parameters were considered: precision (Pr), recall (Re) and F1 score (F1). For the two public databases of pavement images, the proposed method obtained the highest values of the three evaluation parameters: for the CFD database, Pr = 0.9552, Re = 0.9521 and F1 = 0.9533 (which reach values up to 0.5175 higher than the values obtained on the same database with the other methods), for the AigleRN database, Pr = 0.9302, Re = 0.9166 and F1 = 0.9238 (which reach values up to 0.7313 higher than the values obtained on the same database with the other methods). The experimental results show that the proposed method outperforms the other methods. For crack measurement, the crack length and width can be measure based on different crack types (complex, common, thin, and intersecting cracks.). The results show that the proposed algorithm can be effectively applied for crack measurement.


Sign in / Sign up

Export Citation Format

Share Document