Learning face similarities for face verification using hybrid convolutional neural networks

Author(s):  
Fadhlan Hafizhelmi Kamaru Zaman ◽  
Juliana Johari ◽  
Ahmad Ihsan Mohd Yassin

Face verification is the task of determining whether two face images belong to the same identity. For unconstrained faces in the wild, this is very challenging: besides significant degradation due to large variations in pose, illumination, expression, aging, and occlusion, it also suffers from the large-scale, ever-expanding data needed to perform one-to-many recognition. In this paper, we propose a face verification method that learns face similarities using a Convolutional Neural Network (ConvNet). Instead of extracting features from each face image separately, our ConvNet model jointly extracts relational visual features from the two face images under comparison. We train four hybrid ConvNet models to learn to distinguish similarities between face pairs for four different face portions and join them at the top-layer classifier level. At that level, we use a binary classifier to identify the similarity of face pairs, choosing among a conventional Multi-Layer Perceptron (MLP), Support Vector Machines (SVM), Naïve Bayes, and another ConvNet. Three face-pairing configurations are discussed in this paper. Results from experiments using the Labeled Faces in the Wild (LFW) and CelebA datasets indicate that our hybrid ConvNet increases face verification accuracy by as much as 27% compared to the individual ConvNet approach. We also found that the lateral face-pair configuration with an MLP top-layer classifier yields the best LFW test accuracy, 87.89%, under a very strict test protocol without any face alignment, which is on par with the state of the art. We further showed that our approach is more flexible when running inference with the learned models on out-of-sample data, by testing LFW and CelebA on either model.
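To make the joint relational feature extraction concrete, here is a minimal PyTorch sketch in which the two faces are stacked channel-wise so that convolutions see both images at once, with a binary same/different head on top. Layer sizes, depths, and the 64x64 input are illustrative assumptions, not the authors' actual architecture.

```python
# Minimal sketch: joint ConvNet over a stacked face pair (not the paper's model).
import torch
import torch.nn as nn

class PairConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=3, padding=1),  # 6 = two RGB faces stacked
            nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(128), nn.ReLU(),
            nn.Linear(128, 2),  # binary output: same identity / different identity
        )

    def forward(self, face_a, face_b):
        # Joint input: convolutions extract relational features across both faces,
        # rather than embedding each face separately.
        x = torch.cat([face_a, face_b], dim=1)
        return self.classifier(self.features(x))

model = PairConvNet()
logits = model(torch.rand(4, 3, 64, 64), torch.rand(4, 3, 64, 64))
print(logits.shape)  # torch.Size([4, 2])
```

In the paper's setup, four such networks (one per face portion) would feed a top-layer classifier; here only a single pair network is sketched.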


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yanfei Li ◽  
Xianying Feng ◽  
Yandong Liu ◽  
Xingchang Han

Abstract This work researched apple quality identification and classification from real images containing complicated disturbance information (backgrounds similar to the surface of the apples). This paper proposes a novel model based on convolutional neural networks (CNNs) aimed at accurate and fast grading of apple quality. Specific, complex, and useful image characteristics for detection and classification were captured by the proposed model. Compared with existing methods, the proposed model could better learn high-order features of two adjacent layers that were not in the same channel but were strongly related. The proposed model was trained and validated, with best training and validation accuracies of 99% and 98.98% at the 2590th and 3000th steps, respectively. The overall accuracy of the proposed model, tested on an independent dataset of 300 apples, was 95.33%. The results showed that the training accuracy, overall test accuracy, and training time of the proposed model were better than those of the Google Inception-v3 model and a traditional image-processing method based on merging histogram of oriented gradients (HOG) and gray-level co-occurrence matrix (GLCM) features with a support vector machine (SVM) classifier. The proposed model has great potential for apple quality detection and classification.
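For reference, the traditional baseline the paper compares against can be sketched as follows: HOG and GLCM features merged into one vector and classified with an SVM. All parameter values (HOG cells, GLCM distances/angles, the chosen texture properties) are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of the HOG + GLCM + SVM baseline (illustrative parameters).
import numpy as np
from skimage.feature import hog, graycomatrix, graycoprops
from sklearn.svm import SVC

def hog_glcm_features(gray_img):
    """gray_img: 2-D uint8 array (a grayscale apple image)."""
    hog_vec = hog(gray_img, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    glcm = graycomatrix(gray_img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    glcm_vec = np.hstack([graycoprops(glcm, p).ravel()
                          for p in ('contrast', 'homogeneity', 'energy')])
    return np.hstack([hog_vec, glcm_vec])  # merged feature vector

# Hypothetical usage: imgs is a list of grayscale images, y the grade labels.
# X = np.stack([hog_glcm_features(img) for img in imgs])
# clf = SVC(kernel='rbf').fit(X, y)
```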



Author(s):  
Antonio Greco ◽  
Alessia Saggese ◽  
Mario Vento ◽  
Vincenzo Vigilante

Abstract In the era of deep learning, methods for gender recognition from face images achieve remarkable performance on most standard datasets. However, common experimental analyses do not take into account that the face images given as input to the neural networks are often affected by strong corruptions not always represented in standard datasets. In this paper, we propose an experimental framework for gender recognition "in the wild". We produce corrupted versions of the popular LFW+ and GENDER-FERET datasets, which we call LFW+C and GENDER-FERET-C, and evaluate the accuracy of nine different network architectures in the presence of specific, suitably designed corruptions; in addition, we perform an experiment on the MIVIA-Gender dataset, recorded in real environments, to analyze the effects of the mixed image corruptions that happen in the wild. The experimental analysis demonstrates that the robustness of the considered methods can be further improved, since all of them suffer a performance drop on images collected in the wild or manually corrupted. Starting from the experimental results, we provide useful insights for choosing the best currently available architecture under specific real conditions. The proposed experimental framework, whose code is publicly available, is general enough to be applied to other datasets as well; thus, it can act as a forerunner for future investigations.
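The evaluation protocol described can be sketched as a small harness that corrupts a clean test set at a chosen severity and measures the resulting accuracy drop. The corruption function and severity scale below are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a corruption benchmark (illustrative corruption/severities).
import numpy as np

def gaussian_noise(img, severity=1):
    """img: uint8 array; severity: 1 (mild) to 5 (strong)."""
    sigma = [8, 16, 24, 32, 40][severity - 1]
    noisy = img.astype(np.float32) + np.random.normal(0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def corrupted_accuracy(predict, images, labels, corruption, severity):
    """predict: hypothetical callable mapping one image to a class label."""
    preds = [predict(corruption(img, severity)) for img in images]
    return np.mean(np.array(preds) == np.array(labels))

# Hypothetical usage, comparing clean vs. corrupted accuracy of one network:
# acc_clean = corrupted_accuracy(net, test_imgs, test_labels, lambda x, s: x, 1)
# acc_noise = corrupted_accuracy(net, test_imgs, test_labels, gaussian_noise, 3)
```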



Author(s):  
Alexis David Pascual ◽  
Kenneth McIsaac ◽  
Gordon Osinski

Autonomous image recognition has numerous potential applications in planetary science and geology. For instance, the ability to classify images of rocks would give geologists immediate feedback without having to bring samples back to the laboratory, and planetary rovers could classify rocks in remote places, even on other planets, without human intervention. Shu et al. classified 9 different types of rock images using a Support Vector Machine (SVM) with autonomously extracted image features, achieving a test accuracy of 96.71%. In this research, Convolutional Neural Networks (CNNs) were used to classify the same set of rock images. Results show that a 3-layer network obtains an average accuracy of 99.60% across 10 trials on the test set. A version of self-taught learning was also implemented to demonstrate the generalizability of the features extracted by the CNN. Finally, one model was deployed on a mobile device to demonstrate practicality and portability. The deployed model achieves perfect classification accuracy on the test set while taking only 0.068 seconds per prediction, equivalent to about 14 frames per second.
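A 3-convolutional-layer classifier of the kind described might look like the PyTorch sketch below. The 9 output classes follow Shu et al.'s dataset as stated in the abstract; filter counts, the 128x128 input size, and all other details are illustrative assumptions.

```python
# Minimal sketch: a 3-conv-layer CNN for 9 rock classes (illustrative sizes).
import torch
import torch.nn as nn

rock_net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.LazyLinear(9),  # 9 rock types, as in the dataset of Shu et al.
)

logits = rock_net(torch.rand(1, 3, 128, 128))
print(logits.shape)  # torch.Size([1, 9])
```

A network this small is also plausible for mobile deployment, consistent with the sub-0.1-second inference time reported.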



2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Zhe Xu ◽  
Xi Guo ◽  
Anfan Zhu ◽  
Xiaolin He ◽  
Xiaomin Zhao ◽  
...  

Symptoms of nutrient deficiencies in rice plants often appear on the leaves; leaf color and shape can therefore be used to diagnose nutrient deficiencies in rice. Image classification is an efficient and fast approach for this diagnosis task. Deep convolutional neural networks (DCNNs) have proven effective in image classification, but their use in identifying nutrient deficiencies in rice has received little attention. In the present study, we explore the accuracy of different DCNNs for the diagnosis of nutrient deficiencies in rice. A total of 1818 photographs of plant leaves were obtained via hydroponic experiments, covering full nutrition and 10 classes of nutrient deficiencies. The photographs were divided into training, validation, and test sets in a 3:1:1 ratio. Fine-tuning was performed to evaluate four state-of-the-art DCNNs: Inception-v3, ResNet with 50 layers, NasNet-Large, and DenseNet with 121 layers. All the DCNNs obtained validation and test accuracies of over 90%, with DenseNet-121 performing best (validation accuracy = 98.62 ± 0.57%; test accuracy = 97.44 ± 0.57%). The performance of the DCNNs was further validated by comparison with color features plus a support vector machine and histogram of oriented gradients plus a support vector machine. This study demonstrates that DCNNs provide an effective approach to diagnosing nutrient deficiencies in rice.
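Fine-tuning one of the evaluated backbones can be sketched as follows: load ImageNet-pretrained DenseNet-121 from torchvision and replace its head with an 11-way classifier (full nutrition plus the 10 deficiency classes). The optimizer and learning rate are illustrative assumptions, not the paper's hyperparameters.

```python
# Minimal sketch: fine-tuning DenseNet-121 for 11 rice-nutrition classes.
import torch
import torch.nn as nn
from torchvision import models

model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
model.classifier = nn.Linear(model.classifier.in_features, 11)  # new head

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One fine-tuning step on a batch of (N, 3, H, W) leaf photographs."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```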



2021 ◽  
Vol 40 (1) ◽  
Author(s):  
David Müller ◽  
Andreas Ehlen ◽  
Bernd Valeske

Abstract Convolutional neural networks were used for multiclass segmentation in thermal infrared face analysis. The principle is based on existing image-to-image translation approaches, where each pixel in an image is assigned a class label. We show that established network architectures can be trained for the task of multiclass face analysis in thermal infrared. The class annotations we created consist of pixel-accurate locations of the different face classes. The trained network can then segment an unknown acquired infrared face image into the defined classes. Furthermore, face classification during live image acquisition is shown, so that the relative temperature of the learned areas can be displayed in real time. This allows pixel-accurate temperature face analysis, e.g., for detecting infections such as COVID-19. At the same time, our approach offers the advantage of concentrating on the relevant areas of the face: areas irrelevant to the relative temperature calculation, and accessories such as glasses, masks, and jewelry, are not considered. A custom database was created to train the network. The results were quantitatively evaluated with the intersection-over-union (IoU) metric. The methodology shown can be transferred to similar problems in quantitative thermography, such as materials characterization or quality control in production.
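The IoU metric used for the quantitative evaluation can be computed per class over predicted and ground-truth label maps, as in the sketch below; the example class count is a hypothetical value for illustration.

```python
# Minimal sketch: per-class intersection over union for segmentation maps.
import numpy as np

def iou_per_class(pred, target, num_classes):
    """pred, target: integer label maps of identical shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append(inter / union if union else float('nan'))
    return ious  # nan marks classes absent from both maps

# Hypothetical usage with 4 face classes (e.g., skin, eyes, glasses, background):
# print(iou_per_class(pred_map, gt_map, num_classes=4))
```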



2021 ◽  
Vol 5 (2) ◽  
Author(s):  
Alexander Knyshov ◽  
Samantha Hoang ◽  
Christiane Weirauch

Abstract Automated insect identification systems have been explored for more than two decades but have only recently started to take advantage of powerful and versatile convolutional neural networks (CNNs). While typical CNN applications still require large training image datasets with hundreds of images per taxon, pretrained CNNs have recently been shown to be highly accurate while being trained on much smaller datasets. Here, we evaluate the performance of CNN-based machine learning approaches in identifying three curated species-level dorsal habitus datasets for Miridae, the plant bugs. Miridae are of economic importance, but species-level identifications are challenging and typically rely on information other than dorsal habitus (e.g., host plants, locality, genitalic structures). Each dataset contained 2–6 species and 126–246 images in total, with a mean of only 32 images per species for the most difficult dataset. We find that closely related species of plant bugs can be identified with 80–90% accuracy based on their dorsal habitus alone. The pretrained CNN performed 10–20% better than a taxon expert who had access to the same dorsal habitus images. We find that feature extraction protocols (selection and combination of blocks of CNN layers) affect identification accuracy much more than the classifying mechanism (support vector machine and deep neural network classifiers). While our network has much lower accuracy on photographs of live insects (62%), overall results confirm that a pretrained CNN can be straightforwardly adapted to collection-based images for a new taxonomic group and successfully extract relevant features to classify insect species.
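The general pipeline described, extracting features from a block of a pretrained CNN and classifying them with an SVM, can be sketched as below. The ResNet-50 backbone and the chosen cut point are illustrative assumptions; the study's central finding is precisely that this block selection matters, so other cut points should be tried.

```python
# Minimal sketch: pretrained-CNN feature extraction plus an SVM classifier.
import torch
from torchvision import models
from sklearn.svm import SVC

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
extractor = torch.nn.Sequential(*list(backbone.children())[:-2])  # drop pool + fc
extractor.eval()

@torch.no_grad()
def extract(images):                       # images: (N, 3, H, W) tensor
    fmap = extractor(images)               # (N, 2048, h, w) feature maps
    return fmap.mean(dim=(2, 3)).numpy()   # global average pooling per image

# Hypothetical usage on habitus images and species labels:
# clf = SVC(kernel='linear').fit(extract(train_imgs), train_labels)
# print(clf.score(extract(test_imgs), test_labels))
```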



Animals ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 1263
Author(s):  
Zhaojun Wang ◽  
Jiangning Wang ◽  
Congtian Lin ◽  
Yan Han ◽  
Zhaosheng Wang ◽  
...  

With the rapid development of digital technology, bird images have become an important part of ornithology research data. However, due to the rapid growth of bird image data, effectively processing such a large amount of data has become a major challenge. In recent years, deep convolutional neural networks (DCNNs) have shown great potential and effectiveness in a variety of tasks involving the automatic processing of bird images. However, no research has been conducted on the recognition of habitat elements in bird images, which would greatly help in extracting habitat information from them. Here, we demonstrate the recognition of habitat elements using four DCNN models trained end-to-end directly on images. To carry out this research, an image database called Habitat Elements of Bird Images (HEOBs-10), composed of 10 categories of habitat elements, was built, making future benchmarks and evaluations possible. Experiments showed that all the tested models obtained good results. The ResNet-152-based model yielded the best test accuracy (95.52%); the AlexNet-based model yielded the lowest (89.48%). We conclude that DCNNs can efficiently and usefully identify habitat elements from bird images automatically, and we believe that the practical application of this technology will be helpful for studying the relationships between birds and habitat elements.
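Adapting the two named backbones (ResNet-152, the best performer, and AlexNet, the weakest) to the 10 habitat-element classes of HEOBs-10 amounts to swapping the final layer, as in this sketch; the end-to-end training loop is omitted and the weight choice is an illustrative assumption.

```python
# Minimal sketch: adapting pretrained backbones to 10 habitat-element classes.
import torch.nn as nn
from torchvision import models

def habitat_model(name: str, num_classes: int = 10) -> nn.Module:
    if name == 'resnet152':
        m = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
        m.fc = nn.Linear(m.fc.in_features, num_classes)        # replace head
    elif name == 'alexnet':
        m = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
        m.classifier[6] = nn.Linear(m.classifier[6].in_features, num_classes)
    else:
        raise ValueError(f'unknown backbone: {name}')
    return m

model = habitat_model('resnet152')  # then fine-tune end-to-end on HEOBs-10
```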



Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2003 ◽  
Author(s):  
Xiaoliang Zhu ◽  
Shihao Ye ◽  
Liang Zhao ◽  
Zhicheng Dai

Improving performance on the AFEW (Acted Facial Expressions in the Wild) dataset, a sub-challenge of EmotiW (the Emotion Recognition in the Wild challenge), is a popular benchmark for emotion recognition under various constraints, including uneven illumination, head deflection, and facial posture. In this paper, we propose a convenient facial expression recognition cascade network comprising spatial feature extraction, hybrid attention, and temporal feature extraction. First, faces in each frame of a video sequence are detected, and the corresponding face ROI (region of interest) is extracted to obtain the face images; the face images in each frame are then aligned based on the positions of the facial feature points. Second, the aligned face images are input to a residual neural network to extract the spatial features of the corresponding facial expressions, and these spatial features are input to the hybrid attention module to obtain fused facial expression features. Finally, the fused features are input to a gated recurrent unit to extract the temporal features of the facial expressions, and the temporal features are input to a fully connected layer to classify and recognize the expressions. Experiments on the CK+ (Extended Cohn-Kanade), Oulu-CASIA (Institute of Automation, Chinese Academy of Sciences), and AFEW datasets achieved recognition accuracies of 98.46%, 87.31%, and 53.44%, respectively. This demonstrates that the proposed method not only achieves performance competitive with state-of-the-art methods but also delivers more than a 2% improvement on the AFEW dataset, showing markedly better facial expression recognition in natural environments.
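The cascade structure (residual CNN for per-frame spatial features, a recurrent unit for temporal aggregation, a fully connected classifier) can be sketched as below. The hybrid attention module is omitted, and the ResNet-18 backbone, hidden size, and 7 expression classes are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch: CNN spatial features + GRU temporal features for expression
# classification over a video clip (attention module omitted).
import torch
import torch.nn as nn
from torchvision import models

class ExpressionNet(nn.Module):
    def __init__(self, num_classes=7):
        super().__init__()
        resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        self.cnn = nn.Sequential(*list(resnet.children())[:-1])  # spatial features
        self.gru = nn.GRU(512, 128, batch_first=True)            # temporal features
        self.fc = nn.Linear(128, num_classes)

    def forward(self, clips):               # clips: (B, T, 3, H, W) aligned faces
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).flatten(1)  # (B*T, 512)
        out, _ = self.gru(feats.view(b, t, -1))           # (B, T, 128)
        return self.fc(out[:, -1])           # classify from the last time step

model = ExpressionNet()
print(model(torch.rand(2, 16, 3, 224, 224)).shape)  # torch.Size([2, 7])
```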



2017 ◽  
Vol 126 (2-4) ◽  
pp. 272-291 ◽  
Author(s):  
Jun-Cheng Chen ◽  
Rajeev Ranjan ◽  
Swami Sankaranarayanan ◽  
Amit Kumar ◽  
Ching-Hui Chen ◽  
...  


2019 ◽  
Author(s):  
Jon Garry ◽  
Thomas Trappenberg ◽  
Steven Beyea ◽  
Timothy Bardouille

Abstract Convolutional neural networks were used to classify and analyse a large magnetoencephalography (MEG) dataset. Networks were trained to classify between active and baseline intervals of minimally processed data recorded during cued button pressing. This study had two primary objectives: (1) develop networks that can effectively classify MEG data, and (2) identify the important data features that inform classification. Networks with a simple architecture were trained using sensor and source-localised data. Networks trained with sensor data were also trained using varying amounts of data. The important features within the data were identified via saliency and occlusion mapping. An ensemble of networks trained using sensor data performed best (average test accuracy 0.974 ± 0.001). A dataset containing on the order of hundreds of participants was required for optimal performance of this network with these data. Visualisation maps highlighted features known to occur in neuromagnetic recordings of cued button pressing.
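Occlusion mapping, one of the two visualisation techniques named, can be sketched as sliding a zeroing mask over the input and recording how much the target-class score drops; regions whose occlusion causes large drops are the ones the network relies on. The window size, stride, and `score_fn` interface below are illustrative assumptions.

```python
# Minimal sketch: occlusion mapping over a 2-D input (e.g., sensors x time).
import numpy as np

def occlusion_map(score_fn, x, window=8, stride=4):
    """score_fn: hypothetical callable returning the target-class score
    for one input; x: 2-D float array."""
    base = score_fn(x)
    h, w = x.shape
    heat = np.zeros(((h - window) // stride + 1, (w - window) // stride + 1))
    for i in range(heat.shape[0]):
        for j in range(heat.shape[1]):
            occluded = x.copy()
            occluded[i * stride:i * stride + window,
                     j * stride:j * stride + window] = 0.0
            heat[i, j] = base - score_fn(occluded)  # large drop = important region
    return heat
```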


