Performance Tuning Techniques for Face Detection Algorithms on GPGPU

Face detection algorithms vary in speed and performance on GPUs. Different algorithms can report different speeds on different GPUs, and these differences are not governed by linear or near-linear approximations. This is due to many factors, such as register file size, GPU occupancy rate, memory speed, and the speed of the double-precision processors. This paper studies the most common face detection algorithms, LBP and Haar-like, and the bottlenecks associated with deploying both algorithms on different GPU architectures. The study focuses on these bottlenecks and on the techniques for resolving them, based on the specifications of the different GPUs.

Comparison of Viola-Jones And Kanade-Lucas-Tomasi Face Detection Algorithms

Oriental Journal of Computer Science and Technology ◽

10.13005/ojcst/10.01.20 ◽

2017 ◽

Vol 10 (1) ◽

pp. 151-159

Author(s):

Kamath Aashish ◽

A. Vijayalakshmi

Keyword(s):

Face Detection ◽

Mobile Phones ◽

Detection Algorithm ◽

Digital Cameras ◽

Detection Algorithms ◽

Single Face ◽

Recognition Systems ◽

CCTV Surveillance ◽

Common Face ◽

Detection Technologies

Face detection technologies are used in a large variety of applications such as advertising, entertainment, video coding, digital cameras, CCTV surveillance and even military use. Face detection is especially crucial in face recognition systems. You can’t recognise faces that you can’t detect, right? But a single face detection algorithm won’t work in the same way in every situation. It all comes down to how the algorithm works. For example, the Kanade-Lucas-Tomasi algorithm makes use of spatial common intensity transformation to direct the deep search for the position that shows the best match. It is much faster than other traditional techniques because it checks far fewer potential matches between pictures. Similarly, the Viola-Jones algorithm is the most widely used face detection algorithm. It is used in most digital cameras and mobile phones to detect faces. It uses cascades to detect edges such as those of the nose and the ears. However, if there is a group of people and their faces are close to each other, the algorithm might not work as well, because edges tend to overlap in a crowd and individual faces might not be detected. Therefore, in this work, we test both the Viola-Jones and the Kanade-Lucas-Tomasi algorithms on each image to find out which algorithm works best in which scenario.

Performance evaluation measures for face detection algorithms

Irish Signals and Systems Conference 2004 ◽

10.1049/cp:20040603 ◽

2004 ◽

Author(s):

P. Sharma

Keyword(s):

Performance Evaluation ◽

Face Detection ◽

Evaluation Measures ◽

Detection Algorithms

Comparative Evaluation of Face Detection Algorithms

2020 16th International Computer Engineering Conference (ICENCO) ◽

10.1109/icenco49778.2020.9357386 ◽

2020 ◽

Author(s):

Ahmed Yamout ◽

Ahmed Abdelmawgood ◽

Ebraam Sadick ◽

Mohamed Naguib

Keyword(s):

Face Detection ◽

Comparative Evaluation ◽

Detection Algorithms

A Fast and Lightweight Method with Feature Fusion and Multi-Context for Face Detection

Future Internet ◽

10.3390/fi10080080 ◽

2018 ◽

Vol 10 (8) ◽

pp. 80

Author(s):

Lei Zhang ◽

Xiaoli Zhi

Keyword(s):

Face Detection ◽

Graphics Processing Units ◽

High Performance ◽

Feature Fusion ◽

Local Context ◽

Data Set ◽

Global Context ◽

Detection Algorithms ◽

Multi Scale ◽

Benchmark Datasets

Convolutional neural networks (CNNs) have made great progress in face detection. They mostly use computation-intensive networks as the backbone in order to obtain high precision, and they cannot achieve a good detection speed without the support of high-performance GPUs (Graphics Processing Units). This limits CNN-based face detection algorithms in real applications, especially speed-dependent ones. To alleviate this problem, we propose a lightweight face detector in this paper, which uses a fast residual network as its backbone. Our method runs fast even on cheap, ordinary GPUs. To guarantee its detection precision, multi-scale features and multi-context are fully exploited in efficient ways. Specifically, feature fusion is first used to obtain semantically strong multi-scale features. Then multi-context, including both local and global context, is added to these multi-scale features without extra computational burden. The local context is added through an approach based on depthwise separable convolution, and the global context through simple global average pooling. Experimental results show that our method can run at about 110 fps on VGA (Video Graphics Array)-resolution images, while still maintaining competitive precision on the WIDER FACE and FDDB (Face Detection Data Set and Benchmark) datasets compared with its state-of-the-art counterparts.
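The two context operations named in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' network: the function names, the channel-first `(C, H, W)` layout, the zero padding, and the choice to add the pooled global vector back by broadcasting are all assumptions made for illustration only.

```python
import numpy as np

def depthwise_separable_conv(x, depthwise_k, pointwise_w):
    """Depthwise k x k filtering followed by a 1 x 1 pointwise mix.

    x:           (C, H, W) feature map
    depthwise_k: (C, k, k) one spatial kernel per channel (k odd, assumed)
    pointwise_w: (C_out, C) weights of the 1 x 1 convolution
    """
    c, h, w = x.shape
    k = depthwise_k.shape[1]
    pad = k // 2
    # Zero-pad the spatial dimensions so the output keeps the input size.
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    dw = np.zeros((c, h, w), dtype=np.float64)
    # Each channel is filtered only with its own kernel (no channel mixing).
    for i in range(k):
        for j in range(k):
            dw += depthwise_k[:, i, j, None, None] * xp[:, i:i + h, j:j + w]
    # The pointwise step mixes channels at every spatial position.
    return np.einsum("oc,chw->ohw", pointwise_w, dw)

def add_global_context(x):
    """Global average pooling per channel, broadcast back onto the map
    (the additive fusion is a hypothetical choice, not from the paper)."""
    g = x.mean(axis=(1, 2), keepdims=True)  # shape (C, 1, 1)
    return x + g
```

The appeal of the depthwise separable form is its parameter count: a standard 3 x 3 convolution from 64 to 64 channels uses 64 * 64 * 9 = 36864 weights, whereas the depthwise plus pointwise pair uses 64 * 9 + 64 * 64 = 4672.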

Profiling Methodology and Performance Tuning of the Met Office Unified Model for Weather and Climate Simulations

2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum ◽

10.1109/ipdps.2011.283 ◽

2011 ◽

Cited By ~ 2

Author(s):

Peter E. Strazdins ◽

Margaret Kahn ◽

Joerg Henrichs ◽

Tim Pugh ◽

Mike Rezny

Keyword(s):

Unified Model ◽

Performance Tuning ◽

Climate Simulations ◽

Weather And Climate ◽

And Performance

Programming the Linpack Benchmark for the IBM PowerXCell 8i Processor

Scientific Programming ◽

10.1155/2009/401691 ◽

2009 ◽

Vol 17 (1-2) ◽

pp. 43-57 ◽

Cited By ~ 4

Author(s):

Michael Kistler ◽

John Gunnels ◽

Daniel Brokenshire ◽

Brad Benton

Keyword(s):

High Speed ◽

Double Precision ◽

Data Movement ◽

Processing Elements ◽

Cell Broadband Engine ◽

Design And Implementation ◽

Computational Capability ◽

High Speed Data ◽

Linpack Benchmark ◽

And Performance

In this paper we present the design and implementation of the Linpack benchmark for the IBM BladeCenter QS22, which incorporates two IBM PowerXCell 8i processors. The PowerXCell 8i is a new implementation of the Cell Broadband Engine™ architecture and contains a set of special-purpose processing cores known as Synergistic Processing Elements (SPEs). The SPEs can be used as computational accelerators to augment the main PowerPC processor. The added computational capability of the SPEs results in a peak double precision floating point capability of 108.8 GFLOPS. We explain how we modified the standard open source implementation of Linpack to accelerate key computational kernels using the SPEs of the PowerXCell 8i processors. We describe in detail the implementation and performance of the computational kernels and also explain how we employed the SPEs for high-speed data movement and reformatting. The result of these modifications is a Linpack benchmark optimized for the IBM PowerXCell 8i processor that achieves 170.7 GFLOPS on a BladeCenter QS22 with 32 GB of DDR2 SDRAM memory. Our implementation of Linpack also supports clusters of QS22s, and was used to achieve a result of 11.1 TFLOPS on a cluster of 84 QS22 blades. We compare our results on a single BladeCenter QS22 with the base Linpack implementation without SPE acceleration to illustrate the benefits of our optimizations.
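The reported figures can be put in proportion with a short calculation. This assumes the 108.8 GFLOPS peak applies to one PowerXCell 8i processor, so that a two-processor QS22 peaks at twice that value; the abstract does not state this explicitly. The symbols are introduced here only for illustration.

```latex
\begin{align*}
P_{\text{blade}} &= 2 \times 108.8 = 217.6 \ \text{GFLOPS (assumed peak per QS22)} \\
\eta_{\text{blade}} &= \frac{170.7}{217.6} \approx 0.78 \\
R_{\text{cluster, per blade}} &= \frac{11.1 \times 10^{3}}{84} \approx 132.1 \ \text{GFLOPS} \\
\frac{R_{\text{cluster, per blade}}}{R_{\text{single}}} &= \frac{132.1}{170.7} \approx 0.77
\end{align*}
```

Under that assumption, a single blade sustains about 78% of its peak, and each blade in the 84-blade cluster delivers roughly 77% of the single-blade result.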

SQL*Net Diagnostics and Performance Tuning

Oracle Internals ◽

10.1201/9780203997536.ch39 ◽

2001 ◽

pp. 475-487

Author(s):

Dmitry Petrov ◽

Serg Shestakov

Keyword(s):

Performance Tuning ◽

And Performance

An Innovative Face Detection Based on YCgCr Color Space

10.31227/osf.io/x3syk ◽

2018 ◽

Author(s):

Solly Aryza

Keyword(s):

Face Detection ◽

Skin Color ◽

Color Image ◽

Color Space ◽

Gaussian Model ◽

Detection Rates ◽

Detection Algorithms ◽

Lighting Conditions ◽

Wide Range ◽

Human Faces

It is very challenging to recognize a face in an image because of the wide variety of faces and the uncertainty of face position. Research on detecting human faces in color images and video sequences has attracted more and more attention. In this paper, we propose a novel face detection method that achieves better detection rates. The new face detection algorithm is based on a skin color model in YCgCr chrominance space. First, we build a Gaussian skin model in Cg-Cr color space. Second, a correlation coefficient is calculated between the given template and the candidates. Experimental results demonstrate that our system achieves high detection rates and low false positives over a wide range of facial variations in color, position and lighting conditions.
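The first step, a Gaussian skin model in the Cg-Cr plane, can be sketched as follows. This is an illustrative sketch rather than the paper's implementation. The RGB-to-CgCr coefficients are the commonly cited YCgCr definition and are an assumption here, as are the function names. The model parameters are fitted from caller-supplied skin samples rather than taken from the paper, and the template-correlation step is not shown.

```python
import numpy as np

def rgb_to_cgcr(rgb):
    """Map RGB values in [0, 255] to (Cg, Cr) chrominance pairs.

    Coefficients follow the commonly cited YCgCr definition (assumed).
    """
    rgb = np.asarray(rgb, dtype=np.float64) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cg = 128.0 - 81.085 * r + 112.0 * g - 30.915 * b
    cr = 128.0 + 112.0 * r - 93.786 * g - 18.214 * b
    return np.stack([cg, cr], axis=-1)

def fit_skin_gaussian(skin_rgb):
    """Estimate the mean and covariance of skin samples in Cg-Cr space."""
    cgcr = rgb_to_cgcr(skin_rgb).reshape(-1, 2)
    mean = cgcr.mean(axis=0)
    cov = np.cov(cgcr, rowvar=False)
    return mean, cov

def skin_likelihood(image_rgb, mean, cov):
    """Unnormalised Gaussian likelihood per pixel, in (0, 1]."""
    diff = rgb_to_cgcr(image_rgb) - mean
    inv_cov = np.linalg.inv(cov)
    # Squared Mahalanobis distance of each pixel from the skin mean.
    d2 = np.einsum("...i,ij,...j->...", diff, inv_cov, diff)
    return np.exp(-0.5 * d2)
```

Thresholding the likelihood map gives candidate skin regions. Because luminance is discarded, a uniform brightness shift across all three channels leaves a pixel's (Cg, Cr) pair unchanged, which is one reason chrominance models are used under varying lighting.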

Emotion Classification with Reduced Feature Set SGDClassifier, Random Forest and Performance Tuning

Communications in Computer and Information Science - Computing Science, Communication and Security ◽

10.1007/978-981-15-6648-6_8 ◽

2020 ◽

pp. 95-108

Author(s):

Kaushika Pal ◽

Biraj V. Patel

Keyword(s):

Random Forest ◽

Performance Tuning ◽

Emotion Classification ◽

And Performance

A Review of Facial Feature Detection Algorithms

Advances in Face Image Analysis ◽

10.4018/978-1-61520-991-0.ch003 ◽

2010 ◽

pp. 42-61

Author(s):

Stylianos Asteriadis ◽

Nikos Nikolaidis ◽

Ioannis Pitas ◽

...

Keyword(s):

Feature Detection ◽

Performance Metrics ◽

Facial Feature ◽

Research Field ◽

Head Pose Estimation ◽

Expression Recognition ◽

Facial Feature Detection ◽

Detection Algorithms ◽

Active Research ◽

And Performance

Facial feature localization is an important task in numerous applications of face image analysis, including face recognition and verification, facial expression recognition, driver's alertness estimation, and head pose estimation. Thus, the area has been a very active research field for many years and a multitude of methods appear in the literature. Depending on the targeted application, the proposed methods have different characteristics and are designed to perform in different setups. Thus, a method of general applicability seems to be beyond the current state of the art. This chapter intends to offer an up-to-date literature review of facial feature detection algorithms. A review of the image databases and performance metrics that are used to benchmark these algorithms is also provided.
