Deep Learning based Image Processing for Cashier-less Self-Checkout Methodology

2020 · Vol 9 (1) · pp. 2291-2294

In recent years, shopping experiences have become increasingly advanced, from smart market shelves to the currently booming online shopping. Online shopping is more convenient but has not yet been accepted at large scale, and retail shops still draw a greater response from users, so retailers are moving towards cashier-less shopping. A major problem in retail is that customers are pressed for time and cannot afford to wait at checkout counters. Addressing this problem, we have developed a shopping workflow that saves both checkout time and the time spent in socially distanced queues. This research paper presents a stereo vision-based AI system that monitors customers while they shop and tracks the items added to or returned from a virtual cart. Customers can walk out of the store directly after shopping; the final order cost is then evaluated and charged to the customer’s account. The system ensures that no errors are made during this evaluation and that customers are never charged for products they did not take home. To achieve this, the system uses image processing, object detection and face recognition algorithms that are widely practiced at present, together with sensors such as RFID tags and pressure sensors for weight measurement and detection of products on the shelves.
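
As an illustration of the virtual-cart bookkeeping described above, here is a minimal Python sketch; the product catalogue, prices, and shelf-event interface are hypothetical, and a real system would fuse the weight deltas with the object detection and face recognition results.

```python
# Minimal sketch of the virtual-cart bookkeeping. All product names,
# prices, and the shelf-event interface are hypothetical placeholders.
from collections import Counter

CATALOGUE = {"milk_1l": 1.20, "bread": 0.90, "coffee_250g": 4.50}

class VirtualCart:
    def __init__(self):
        self.items = Counter()

    def on_shelf_event(self, product_id, delta_weight):
        """Negative delta = item taken from the shelf, positive = put back."""
        if product_id not in CATALOGUE:
            return
        if delta_weight < 0:            # pressure sensor reports a pickup
            self.items[product_id] += 1
        elif self.items[product_id]:    # item returned to the shelf
            self.items[product_id] -= 1

    def total(self):
        return sum(CATALOGUE[p] * n for p, n in self.items.items())

cart = VirtualCart()
cart.on_shelf_event("milk_1l", -1.03)   # customer picks up milk
cart.on_shelf_event("bread", -0.45)     # picks up bread
cart.on_shelf_event("bread", +0.45)     # changes their mind
print(f"charged on walk-out: {cart.total():.2f}")   # 1.20
```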

2020 · Vol 32 · pp. 03011
Author(s): Divya Kapil, Aishwarya Kamtam, Akhil Kedare, Smita Bharne

Surveillance systems are used to monitor activities directly or indirectly, and most of them rely on face recognition techniques. This work builds an automated biometric surveillance system based on deep learning that can be applied in various settings. Face prints of known persons are stored in a database along with relevant statistics, and the system performs face recognition against them. When an unknown face is detected, an alarm rings to alert security so that further action can be taken. Using deep learning, the system automatically adapts to changes in faces while detecting them and maintains high recognition accuracy. Deep learning methods, in particular Convolutional Neural Networks (CNNs), are of great significance in the area of image processing. The system is applicable to monitoring activities on housing society premises.
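
A minimal sketch of the unknown-face alert logic is shown below, assuming a pretrained CNN (e.g. a FaceNet-style network) that maps a face crop to an embedding vector. The deterministic embed() below is a stand-in, not a real network, and the distance threshold is illustrative.

```python
# Minimal sketch of the unknown-face alarm: compare a query embedding
# against an enrolled database and alert when no match is close enough.
import numpy as np

def embed(face_image):
    # Stand-in for a CNN forward pass; returns a unit-length vector.
    rng = np.random.default_rng(abs(hash(face_image)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

enrolled = {name: embed(name) for name in ["resident_A", "resident_B"]}
THRESHOLD = 0.6   # cosine distance above which a face counts as unknown

def check_face(face_image):
    query = embed(face_image)
    dists = {n: 1.0 - float(query @ e) for n, e in enrolled.items()}
    name, d = min(dists.items(), key=lambda kv: kv[1])
    if d > THRESHOLD:
        print("ALARM: unknown face detected")    # trigger security response
    else:
        print(f"recognised {name} (distance {d:.2f})")

check_face("resident_A")   # matches the enrolled print -> recognised
check_face("stranger_X")   # far from all enrolled prints -> alarm
```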


Electronics · 2021 · Vol 10 (16) · pp. 1932
Author(s): Malik Haris, Adam Glowacz

Automated driving and vehicle safety systems need object detection that is accurate overall, robust to weather and environmental conditions, and able to run in real-time; consequently, they require image processing algorithms to inspect the contents of images. This article compares the accuracy of five major object detection algorithms: Region-based Fully Convolutional Network (R-FCN), Mask Region-based Convolutional Neural Network (Mask R-CNN), Single Shot Multi-Box Detector (SSD), RetinaNet, and You Only Look Once v4 (YOLOv4). For this comparative analysis, we used the large-scale Berkeley Deep Drive (BDD100K) dataset. The strengths and limitations of each algorithm are analyzed in terms of accuracy (with and without occlusion and truncation), computation time, and the precision-recall curve. The comparison given in this article is helpful for understanding the pros and cons of standard deep learning-based detectors operating under real-time deployment restrictions. We conclude that YOLOv4 performs best at detecting difficult road targets under complex road scenarios and weather conditions in an identical testing environment.
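
The precision-recall bookkeeping behind such a comparison can be sketched as follows: detections are greedily matched to ground truth by intersection-over-union (IoU), and average precision is the area under the resulting precision-recall curve. The boxes and scores below are illustrative; real benchmarks such as BDD100K use more elaborate, per-class matching rules.

```python
# Minimal sketch of IoU matching and average precision (AP) at one
# IoU threshold. Detection scores/boxes here are illustrative only.
import numpy as np

def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def average_precision(dets, gts, iou_thr=0.5):
    """dets: [(score, box)], gts: [box]; greedy matching by score."""
    dets = sorted(dets, key=lambda d: -d[0])
    matched, tp = set(), []
    for score, box in dets:
        hits = [(iou(box, g), i) for i, g in enumerate(gts) if i not in matched]
        best_iou, best_i = max(hits, default=(0.0, -1))
        if best_iou >= iou_thr:
            matched.add(best_i); tp.append(1)
        else:
            tp.append(0)                       # false positive
    tp = np.cumsum(tp)
    recall = np.concatenate([[0.0], tp / max(len(gts), 1)])
    precision = np.concatenate([[1.0], tp / np.arange(1, len(tp) + 1)])
    # Trapezoidal area under the precision-recall curve.
    return float(np.sum(np.diff(recall) * (precision[1:] + precision[:-1]) / 2))

dets = [(0.9, [10, 10, 50, 50]), (0.8, [60, 60, 90, 90]), (0.3, [0, 0, 5, 5])]
gts = [[12, 11, 48, 52], [58, 61, 92, 88]]
print(f"AP@0.5 = {average_precision(dets, gts):.2f}")   # 1.00
```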


Electronics · 2020 · Vol 9 (10) · pp. 1664
Author(s): Yoon-Ki Kim, Yongsung Kim

Recently, as the amount of real-time video streaming data has increased, distributed parallel processing systems have rapidly evolved to process large-scale data. In addition, as the scale of computing resources constituting these systems grows, orchestration technology has become crucial for proper management of those resources, in terms of allocating them, setting up the programming environment, and deploying user applications. In this paper, we present DiPLIP, a new distributed parallel processing platform for real-time large-scale image processing based on deep learning model inference. It provides a scheme for large-scale real-time image inference using a buffer layer, and a parallel processing environment that scales with the size of the image stream. Through the distribution of virtual machine containers, it allows users to easily run trained deep learning models on real-time images in a distributed parallel processing environment at high speed.
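
The buffer-layer idea can be sketched with a bounded queue feeding parallel inference workers, as below. infer() stands in for a real deep learning model, and local threads stand in for DiPLIP's distributed virtual machine containers.

```python
# Minimal sketch of a buffer layer: stream frames are buffered in a
# bounded queue and consumed by parallel inference workers.
import queue
import threading
import time

frame_buffer = queue.Queue(maxsize=64)    # absorbs bursts in the stream

def infer(frame):
    time.sleep(0.01)                      # pretend model inference
    return f"result for {frame}"

def worker(wid):
    while True:
        frame = frame_buffer.get()
        if frame is None:                 # poison pill -> shut down
            break
        print(f"worker {wid}: {infer(frame)}")

workers = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for w in workers:
    w.start()

for t in range(10):                       # simulated real-time stream
    frame_buffer.put(f"frame-{t}")

for _ in workers:                         # one pill per worker
    frame_buffer.put(None)
for w in workers:
    w.join()
```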


Electronics · 2021 · Vol 11 (1) · pp. 25
Author(s): Jaehun Park, Kwangsu Kim

Face recognition, including emotion classification and face attribute classification, has seen tremendous progress during the last decade owing to the use of deep learning, with large-scale data collected from numerous users as the driving force of this growth. However, face images reveal the identity of their owner and can potentially cause severe privacy leakage if linked to other sensitive biometric information. The novel discrete cosine transform (DCT) coefficient cutting method (DCC) proposed in this study combines DCT and pixelization to protect the privacy of the image. However, privacy is subjective, and it is not guaranteed that a transformed image actually preserves privacy; to address this, a user study was conducted on whether DCC really preserves privacy. To this end, convolutional neural networks were trained for face recognition and face attribute classification tasks. Our survey and experiments demonstrate that a face recognition deep learning model can be trained on images that most people consider privacy-preserving, at a manageable cost in classification accuracy.
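
A minimal sketch in the spirit of DCC, combining DCT coefficient cutting with pixelization, might look as follows; the keep and block parameters are illustrative, not the paper's exact cutting rule.

```python
# Minimal sketch of DCT-based privacy filtering: keep only the
# lowest-frequency DCT coefficients, then pixelise the result.
import numpy as np
from scipy.fft import dctn, idctn

def dct_cut(img, keep=16):
    """Zero all 2-D DCT coefficients outside the top-left keep x keep block."""
    coeffs = dctn(img, norm="ortho")
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1.0
    return idctn(coeffs * mask, norm="ortho")

def pixelise(img, block=8):
    """Replace each block x block tile by its mean value."""
    h, w = img.shape
    out = img.copy()
    for y in range(0, h, block):
        for x in range(0, w, block):
            out[y:y+block, x:x+block] = img[y:y+block, x:x+block].mean()
    return out

face = np.random.rand(64, 64)             # stand-in for a grayscale face crop
protected = pixelise(dct_cut(face, keep=16), block=8)
print(protected.shape)                    # (64, 64), visibly degraded content
```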


2003

Digital Image Computing: Techniques and Applications is the premier biennial conference in Australia on image processing and image analysis. This seventh edition of the proceedings has seen an unprecedented number of submissions, in areas as diverse as image processing, face recognition, segmentation, registration, motion analysis, medical imaging, object recognition, virtual environments, graphics, stereo vision, and video analysis. These two volumes contain all 108 accepted papers and the five invited talks presented at the conference, and provide the Australian and international imaging research community with a snapshot of current theoretical and practical developments in these areas. They are of value to any engineer, computer scientist, mathematician, statistician, or student interested in these matters.


Electronics · 2019 · Vol 8 (10) · pp. 1088
Author(s): Zhao Pei, Hang Xu, Yanning Zhang, Min Guo, Yee-Hong Yang

Class attendance is an important part of managing university students, and face recognition is one of the most effective techniques for taking daily class attendance. Recently, many deep learning face recognition algorithms have achieved promising results given large-scale labeled samples. However, because of the difficulty of collecting samples, face recognition using convolutional neural networks (CNNs) for daily attendance taking remains a challenging problem. Data augmentation can enlarge the sample set and has been applied to small-sample learning. In this paper, we address this problem using data augmentation through geometric transformations, image brightness changes, and different filter operations, and we determine the best data augmentation method based on orthogonal experiments. Finally, the performance of our attendance method is demonstrated in a real class. Compared with PCA and LBPH methods with data augmentation and a VGG-16 network, our proposed method achieves an accuracy of 86.3%; after a period of collecting more data, the accuracy improves to 98.1%.
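
The three augmentation families combined in the paper can be sketched with Pillow as follows; the specific rotations, brightness factors, and filters are illustrative, since the paper selects its parameters via orthogonal experiments.

```python
# Minimal sketch of the three augmentation families: geometric
# transformation, brightness change, and filter operations.
from PIL import Image, ImageEnhance, ImageFilter

def augment(img: Image.Image):
    yield img.rotate(10, expand=True)                 # geometric: rotation
    yield img.transpose(Image.FLIP_LEFT_RIGHT)        # geometric: mirror
    yield ImageEnhance.Brightness(img).enhance(1.3)   # brightness up
    yield ImageEnhance.Brightness(img).enhance(0.7)   # brightness down
    yield img.filter(ImageFilter.GaussianBlur(1))     # filter: blur
    yield img.filter(ImageFilter.SHARPEN)             # filter: sharpen

face = Image.new("RGB", (112, 112), "gray")           # stand-in face crop
variants = list(augment(face))
print(f"1 sample -> {len(variants)} augmented samples")
```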


2020
Author(s): Andrew Shepley, Greg Falzon, Paul Meek, Paul Kwan

A time-consuming challenge faced by camera trap practitioners all over the world is the extraction of meaningful data from images to inform ecological management. The primary methods of image processing used by practitioners include manual analysis and citizen science. An increasingly popular alternative is automated image classification software. However, most automated solutions are not sufficiently robust to be deployed on a large scale. Key challenges include limited access to images for each species and lack of location invariance when transferring models between sites. This prevents optimal use of ecological data and results in significant expenditure of time and resources to annotate and retrain deep learning models.

In this study, we aimed to (a) assess the value of publicly available non-iconic FlickR images in the training of deep learning models for camera trap object detection, (b) develop an out-of-the-box, location-invariant automated camera trap image processing solution for ecologists using deep transfer learning, and (c) explore the use of small subsets of camera trap images in optimisation of a FlickR-trained deep learning model for high-precision ecological object detection.

We collected and annotated a dataset of images of “pigs” (Sus scrofa and Phacochoerus africanus) from the consumer image sharing website FlickR. These images were used for transfer learning with a RetinaNet model on the object detection task. We compared the performance of this model to that of models trained on combinations of camera trap images obtained from five different projects, each from a different geographical region. Furthermore, we explored optimisation of the FlickR model via infusion of small subsets of camera trap images to increase robustness on difficult images.

In most cases, the mean Average Precision (mAP) of the FlickR-trained model when tested on out-of-sample camera trap sites (67.21-91.92%) was significantly higher than the mAP achieved by models trained on only one geographical location (4.42-90.8%) and rivalled the mAP of models trained on mixed camera trap datasets (68.96-92.75%). The infusion of camera trap images into the FlickR training further improved AP by 5.10-22.32%, to 83.60-97.02%.

Ecology researchers can use FlickR images to train automated deep learning solutions for camera trap image processing, significantly reducing time and resource expenditure by enabling the development of location-invariant, highly robust out-of-the-box solutions. This would allow AI technologies to be deployed on a large scale in ecological applications.
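
A transfer-learning setup of this kind can be sketched with torchvision's RetinaNet implementation, as below; this is not the authors' exact pipeline, and the image, labels, and hyperparameters are placeholders standing in for the FlickR (plus infused camera trap) data.

```python
# Minimal sketch of transfer learning: start from a COCO-pretrained
# RetinaNet and run one fine-tuning step on placeholder data.
import torch
from torchvision.models.detection import retinanet_resnet50_fpn

model = retinanet_resnet50_fpn(weights="DEFAULT")   # COCO-pretrained
optimiser = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

# One illustrative training step; in practice images/targets would come
# from the annotated FlickR "pig" dataset described above.
images = [torch.rand(3, 512, 512)]
targets = [{"boxes": torch.tensor([[100., 120., 300., 340.]]),
            "labels": torch.tensor([1])}]           # placeholder class id

model.train()
losses = model(images, targets)       # dict of classification/box losses
total = sum(losses.values())
optimiser.zero_grad()
total.backward()
optimiser.step()
print({k: float(v) for k, v in losses.items()})
```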


Author(s): D. Sri Shreya

In this project, the primary aim is to convert images to grayscale, which involves converting pixels to an array, and to apply a blur effect using a Gaussian blur: an image-blurring filter that uses a Gaussian function (which also expresses the normal distribution in statistics) to calculate the transformation applied to each pixel. These two processes are applied to the input images using the most relevant Python libraries and functions: the digital image is converted to numerical data, the effects are applied, and the image is reconstructed with the effects in place. Face recognition refers to matching a face in an input image against a training or pre-saved dataset by applying deep learning. This is achieved by defining a function to read images and convert them to data, applying the Python function, and then recreating the image with the results.
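
A minimal sketch of the two preprocessing steps, assuming Pillow and NumPy as the "relevant Python libraries", might look like this:

```python
# Minimal sketch: grayscale conversion via a pixel array, then a
# Gaussian blur (whose kernel is the 2-D normal distribution).
import numpy as np
from PIL import Image, ImageFilter

img = Image.new("RGB", (128, 128), (180, 60, 60))   # stand-in input image

# 1. Convert pixels to an array and take a luminance-weighted grayscale.
rgb = np.asarray(img, dtype=np.float32)
gray = rgb @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
gray_img = Image.fromarray(gray.astype(np.uint8), mode="L")

# 2. Apply the Gaussian blur to the grayscale image.
blurred = gray_img.filter(ImageFilter.GaussianBlur(radius=2))
print(blurred.size, blurred.mode)                   # (128, 128) L
```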


Author(s): V. J Chaudhari

Face recognition and facial emotion detection belong to a new era of technology; they indirectly reflect the level of progress in intelligence, security, and the imitation of human emotional behaviour. They are mainly used in market research and testing: many companies require a good, accurate testing method that contributes to their development by providing the necessary insights and supporting accurate conclusions. Facial expression recognition technology can be developed through various methods, for example using deep learning with convolutional neural networks, or with inbuilt libraries such as deepface. The main objective is to classify each face by the emotion shown into one of seven categories: Anger, Disgust, Fear, Happiness, Sadness, Surprise, and Neutrality. In this project specifically, the objective is to read people's facial expressions and display the products that help determine their interest. Facial expression recognition can also be used in video game testing: certain users are asked to play the game for a specified period while their expressions and behaviour are monitored and analyzed, and game developers use facial expression recognition to obtain the required insights, draw conclusions, and provide feedback for the final product. In this project, a deep learning approach with convolutional neural networks (CNNs) is used. Neural networks need to be trained with large amounts of data and require high computational power [8-11], so training the model takes more time [1].
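
A minimal sketch of such a seven-class CNN classifier, using Keras with an illustrative FER-style 48x48 grayscale input (not the project's exact network), is given below.

```python
# Minimal sketch of a seven-class facial emotion CNN. Architecture and
# input size are illustrative; training requires a labelled dataset.
import tensorflow as tf

EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "sadness", "surprise", "neutrality"]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(48, 48, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(len(EMOTIONS), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_faces, train_labels, epochs=30)  # needs labelled face crops
```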

