Face recognition for presence system by using residual networks-50 architecture

Author(s):  
Yohanssen Pratama ◽  
Lit Malem Ginting ◽  
Emma Hannisa Laurencia Nainggolan ◽  
Ade Erispra Rismanda

A presence system records individual attendance in a company, school, or institution. There are several types of presence system, including manual systems using signatures, systems using fingerprints, and systems using face recognition technology. A presence system based on face recognition applies biometrics to the process of recording attendance. In this research we used one of the convolutional neural network (CNN) architectures that won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2015, namely the Residual Networks-50 architecture (ResNet-50), for face recognition. Our contribution is to determine the effectiveness of the ResNet architecture under different hyperparameter configurations. These hyperparameters include the number of hidden layers, the number of units in the hidden layer, the batch size, and the learning rate. Because hyperparameters are selected experimentally and the value of each one affects the final accuracy, we tried 22 configurations (experiments) to find the best accuracy. The best model obtained in these experiments reached an accuracy of 99%.
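The configuration search described in this abstract can be sketched as an enumeration over the four named hyperparameters. This is a minimal illustration, not the authors' procedure: the abstract does not list the values tried, so every value in `SEARCH_SPACE` is a placeholder, and `toy_evaluate` is a hypothetical stand-in for training and validating a ResNet-50 model. A full grid over these placeholders yields 16 combinations; the 22 configurations reported in the abstract need not form a full grid.

```python
from itertools import product

# Hypothetical search space: the abstract names these four hyperparameters
# but not the values tried, so every value below is a placeholder.
SEARCH_SPACE = {
    "hidden_layers": [1, 2],
    "hidden_units": [128, 256],
    "batch_size": [16, 32],
    "learning_rate": [1e-3, 1e-4],
}


def configurations(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def best_configuration(space, evaluate):
    """Return the (config, score) pair with the highest score."""
    scored = [(cfg, evaluate(cfg)) for cfg in configurations(space)]
    return max(scored, key=lambda pair: pair[1])


def toy_evaluate(cfg):
    """Stand-in for training and validating a model; not a real accuracy."""
    return -abs(cfg["hidden_units"] - 256) - abs(cfg["batch_size"] - 32)


if __name__ == "__main__":
    cfg, score = best_configuration(SEARCH_SPACE, toy_evaluate)
    print(cfg, score)
```

In practice `evaluate` would train the network with the given configuration and return its validation accuracy, so the cost of the search grows with the number of configurations tried.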

2019 ◽  
Vol 9 (20) ◽  
pp. 4397 ◽  
Author(s):  
Soad Almabdy ◽  
Lamiaa Elrefaei

Face recognition (FR) is the process of identifying people from facial images. The technology is applied broadly in biometrics, information security, access control, law enforcement, smart cards, and surveillance. A face recognition system is built in two steps: the first extracts the facial features, and the second performs pattern classification. Deep learning, specifically the convolutional neural network (CNN), has recently made commendable progress in FR technology. This paper investigates the performance of a pre-trained CNN with a multi-class support vector machine (SVM) classifier and the performance of transfer learning using the AlexNet model to perform classification. The study considers CNN architectures that have recorded the best results in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) in past years, specifically AlexNet and ResNet-50. Recognition accuracy was used as the measure of the CNN algorithm's performance. Improved classification rates were seen in comprehensive experiments on the ORL, GTAV face, Georgia Tech face, labelled faces in the wild (LFW), frontalized labeled faces in the wild (F_LFW), YouTube face, and FEI faces datasets. The results showed that our model achieved higher accuracy than most state-of-the-art models. Accuracy ranged from 94% to 100% across all databases, with an improvement in recognition accuracy of up to 39%.


2018 ◽  
Vol 7 (3.15) ◽  
pp. 95 ◽  
Author(s):  
M Zabir ◽  
N Fazira ◽  
Zaidah Ibrahim ◽  
Nurbaity Sabri

This paper aims to evaluate the accuracy of pre-trained Convolutional Neural Network (CNN) models, namely AlexNet and GoogLeNet, alongside one custom CNN. AlexNet and GoogLeNet have proven capabilities: both were entered in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and produced relatively good results. The evaluation in this research is based on the accuracy, loss, and time taken by the training and validation processes. The dataset used is Caltech101 from the California Institute of Technology (Caltech), which contains 101 object categories. The results reveal that the custom CNN architecture produces 91.05% accuracy, whereas AlexNet and GoogLeNet achieve a similar accuracy of 99.65%. GoogLeNet becomes consistent at an early training stage and yields the lowest error of the three models.


2020 ◽  
pp. 464-465
Author(s):  
Vijayaganth V ◽  
Naveenkumar M ◽  
Mohan M

Diseases in tomato leaves affect the quality and quantity of the crop. Early diagnosis of these diseases benefits farmers. This work uses the PlantVillage dataset of 9 tomato leaves, which is fed to AlexNet and VGG16. It focuses on model accuracy under hyperparameters such as batch size, learning rate, and optimizer.


2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Irfan Nasrullah ◽  
Rila Mandala

This research addresses intent classification for Customer Relationship Management (CRM), with complaint handling as the domain to be followed up, using datasets extracted from conversations on Twitter. It compares CNN and BRNN models for intent recognition on vectorized text and reports three key findings: (1) which architecture performs better (in accuracy) depends on how important it is to semantically understand the whole sequence; (2) changing the learning rate changes performance relatively smoothly, while changing the hidden size and batch size causes large fluctuations; and (3) word vectorization is able to define the sub-domain of the complaints through word-vector classification.


Author(s):  
Oleg Kit

Fighting the COVID-19 pandemic caused by the SARS-CoV-2 virus is one of the most critical challenges facing the global health system today. The ability to identify, by non-invasive methods, the people under 50 years old who are sensitive to COVID-19 is a very promising approach for estimating the epidemiological state of the human population. Identifying the facial features of people with COVID-19 that correlate most strongly with disease severity could serve as one such approach, and that was the aim of this study. To this end, 525 photos of the faces of patients with different outcomes of COVID-19 were analyzed using the Dlib convolutional neural network pre-trained for face recognition. Face descriptor vectors were obtained from this network. Facial features were found that predict a person's sensitivity to the SARS-CoV-2 virus (disease severity), and the contribution of each feature to the risk of developing a severe form of COVID-19 was determined. Binary classification of individual COVID-19 severity using the k-nearest neighbors algorithm reached an accuracy of 84% and an AUC of 0.90 on the test dataset.
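The classification and evaluation step described in this abstract can be sketched with a small pure-Python k-nearest neighbors scorer and the two reported metrics. This is an illustration under stated assumptions, not the study's code: the value of k, the Euclidean distance, and the 0.5 decision threshold are choices made here, and the 2-D vectors are invented toy data standing in for the face descriptor vectors.

```python
import math


def knn_score(train_x, train_y, query, k=3):
    """Fraction of the k nearest training vectors labelled 1 (severe)."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),
    )[:k]
    return sum(train_y[i] for i in nearest) / k


def accuracy(labels, scores, threshold=0.5):
    """Share of samples whose thresholded score matches the label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)


def auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented toy data: label 1 = severe course, label 0 = mild course.
    train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
               (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
    train_y = [0, 0, 0, 1, 1, 1]
    test_x = [(0.05, 0.1), (1.0, 0.95), (0.15, 0.15), (0.95, 1.0)]
    test_y = [0, 1, 0, 1]

    scores = [knn_score(train_x, train_y, q) for q in test_x]
    print("accuracy:", accuracy(test_y, scores))
    print("AUC:", auc(test_y, scores))
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two figures are reported separately.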


2020 ◽  
Vol 12 (1) ◽  
pp. 1
Author(s):  
Vivian Alfionita Sutama ◽  
Suryo Adhi Wibowo ◽  
Rissa Rahmania

Artificial Intelligence is currently one of the fastest-developing technologies, particularly in Augmented Reality (AR). AR is a technology that connects the real world and the virtual world in real time, allowing the user to interact directly and displaying the result in 3D. AR has two methods: marker-based and markerless. Marker-based AR needs a high-performance object detection system as the interaction tool between the user and the device. The single shot multibox detector (SSD) is an object detection algorithm with fast learning computation and good performance. The method is affected by parameters such as the number of convolutional layers, the number of epochs, the learning rate, the batch size, and the number of training steps. Building a good system is nevertheless a long process: collecting the dataset, labelling it, training with the SSD method, and testing several models to find the best performance. In this experiment, we analyze the SSD method for AR using the Inception architecture as the pre-trained convolutional neural network (CNN), then apply transfer learning to reduce the number of training-data classes and the training time. The configuration varied is the number of training steps. The best accuracy obtained in this experiment is 70.17%. The best-performing model is then used as the object detection model for markers in AR.


2020 ◽  
Vol 2020 (10) ◽  
pp. 181-1-181-7
Author(s):  
Takahiro Kudo ◽  
Takanori Fujisawa ◽  
Takuro Yamaguchi ◽  
Masaaki Ikehara

Image deconvolution has recently become an important issue. There are two kinds of approach: non-blind and blind. Non-blind deconvolution is a classic image deblurring problem that assumes the point spread function (PSF) is known and spatially invariant. Recently, Convolutional Neural Networks (CNNs) have been used for non-blind deconvolution. Although CNNs can deal with complex changes in unknown images, some conventional CNN-based methods can only handle small PSFs and do not consider the large PSFs that occur in the real world. In this paper we propose a non-blind deconvolution framework based on a CNN that can remove large-scale ringing in a deblurred image. Our method has three key points. The first is that our network architecture is able to preserve both large and small features in the image. The second is that the training dataset is created to preserve the details. The third is that we extend the images to minimize the effects of large ringing on the image borders. In our experiments, we used three kinds of large PSFs and observed high-precision results from our method both quantitatively and qualitatively.

