Deep Learning Approaches to Pedestrian Detection: State of the Art

2021 ◽  
pp. 301-321
Author(s):  
Kamal Hajari ◽  
Ujwalla Gawande ◽  
Yogesh Golhar
Author(s):  
Jwalin Bhatt ◽  
Khurram Azeem Hashmi ◽  
Muhammad Zeshan Afzal ◽  
Didier Stricker

In any document, graphical elements such as tables, figures, and formulas contain essential information. Processing and interpreting this information requires specialized algorithms, because off-the-shelf OCR components cannot handle it reliably. Detecting these graphical components is therefore an essential step in document analysis pipelines: it leads to a high-level conceptual understanding of the documents that makes their digitization viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved manyfold. In this work, we outline and summarize the deep learning approaches for detecting graphical page objects in document images, discussing the most relevant approaches and the state of the art. This work provides a comprehensive understanding of the current state of the art and the related challenges. Furthermore, we discuss leading datasets along with their quantitative evaluation, and briefly discuss promising directions for further improvement.


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning them on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.


2020 ◽  
Vol 6 (10) ◽  
pp. 110
Author(s):  
Francesco Lombardi ◽  
Simone Marinai

Nowadays, deep learning methods are employed in a broad range of research fields, and the analysis and recognition of historical documents, which we survey in this work, is no exception. Our study analyzes the papers published on this topic in the last few years from different perspectives: we first provide a pragmatic definition of historical documents from the point of view of the research in the area, then we look at the various sub-tasks addressed in this research. Guided by these tasks, we go through the different input-output relations expected from the deep learning approaches used, and describe the most widely used models accordingly. We also discuss research datasets published in the field and their applications. This analysis shows that the latest research is a leap forward: rather than simply applying recently proposed algorithms to earlier problems, it now considers novel tasks and novel applications of state-of-the-art methods. Rather than just providing a conclusive picture of the current research on the topic, we lastly suggest some potential future trends that can stimulate innovative research directions.


Mathematics ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. 2075
Author(s):  
Óscar Apolinario-Arzube ◽  
José Antonio García-Díaz ◽  
José Medina-Moreira ◽  
Harry Luna-Aveiga ◽  
Rafael Valencia-García

Automatic satire identification can help to identify texts in which the intended meaning differs from the literal meaning, improving tasks such as sentiment analysis, fake news detection, or natural-language user interfaces. Typically, satire identification is performed by training a supervised classifier to find linguistic clues that can determine whether a text is satirical or not. For this, the state of the art relies on neural networks fed with word embeddings that are capable of learning interesting characteristics of the way humans communicate. However, to the best of our knowledge, there are no comprehensive studies that evaluate these techniques in Spanish in the satire identification domain. Consequently, in this work we evaluate several deep-learning architectures with Spanish pre-trained word embeddings and compare the results with strong baselines based on term-counting features. This evaluation is performed with two datasets that contain satirical and non-satirical tweets written in two Spanish variants: European Spanish and Mexican Spanish. Our experimentation revealed that term-counting features achieved results similar to those of deep-learning approaches based on word embeddings, both outperforming previous results based on linguistic features. Our results suggest that term-counting features and traditional machine learning models provide competitive results for automatic satire identification, slightly outperforming state-of-the-art models.
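
As a rough illustration of what a term-counting baseline involves, the following minimal Python sketch builds bag-of-words count vectors over a small corpus. The whitespace tokenizer, the example sentences, and the function names are assumptions made for illustration only; the abstract does not specify the authors' preprocessing, feature set, or classifier.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(texts):
    """Map every distinct token in the corpus to a column index, in sorted order."""
    tokens = sorted({tok for text in texts for tok in tokenize(text)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def count_vector(text, vocabulary):
    """Return the term-count vector of one text over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    # Dict iteration follows insertion order, which matches the column indices.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical two-document corpus, not taken from the paper's datasets.
texts = ["el gato come pescado", "el perro come y come"]
vocabulary = build_vocabulary(texts)
vectors = [count_vector(text, vocabulary) for text in texts]
```

Vectors of this kind can be passed to a traditional classifier such as logistic regression or a linear SVM, which is the general family of models the abstract compares against embedding-based networks.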


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 73
Author(s):  
Marjan Stoimchev ◽  
Marija Ivanovska ◽  
Vitomir Štruc

In the past few years, there has been a leap from traditional palmprint recognition methodologies, which use handcrafted features, to deep-learning approaches that are able to automatically learn feature representations from the input data. However, the information that is extracted from such deep-learning models typically corresponds to the global image appearance, where only the most discriminative cues from the input image are considered. This characteristic is especially problematic when data is acquired in unconstrained settings, as in the case of contactless palmprint recognition systems, where visual artifacts caused by elastic deformations of the palmar surface are typically present in spatially local parts of the captured images. In this study, we address the problem of elastic deformations by introducing a new approach to contactless palmprint recognition based on a novel CNN model, designed as a two-path architecture, where one path processes the input in a holistic manner, while the second path extracts local information from smaller image patches sampled from the input image. As elastic deformations can be assumed to most significantly affect the global appearance, while having a lesser impact on spatially local image areas, the local processing path addresses the issues related to elastic deformations, thereby supplementing the information from the global processing path. The model is trained with a learning objective that combines the Additive Angular Margin (ArcFace) Loss and the well-known center loss. By using the proposed model design, the discriminative power of the learned image representation is significantly enhanced compared to standard holistic models, which, as we show in the experimental section, leads to state-of-the-art performance for contactless palmprint recognition. Our approach is tested on two publicly available contactless palmprint datasets, namely IITD and CASIA, and is demonstrated to perform favorably against state-of-the-art methods from the literature. The source code for the proposed model is made publicly available.
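
To make the combined learning objective more concrete, here is a minimal single-sample Python sketch of an ArcFace-style loss plus a center loss. It is not the authors' implementation: the function names, the weighting factor `lam`, and the default `scale` and `margin` values are assumptions for illustration, and the sketch omits the safeguards that production implementations usually add for angles where adding the margin would exceed pi.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def arcface_loss(embedding, class_weights, label, scale=64.0, margin=0.5):
    """Additive angular margin softmax loss for one sample (assumed defaults)."""
    x = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(x, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if j == label:
            # Add the margin to the angle between the sample and its true class.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Numerically stable cross-entropy: log-sum-exp minus the true-class logit.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]


def center_loss(embedding, centers, label):
    """Half the squared distance between the embedding and its class center."""
    return 0.5 * sum((a - c) ** 2 for a, c in zip(embedding, centers[label]))


def combined_loss(embedding, class_weights, centers, label, lam=0.01):
    """ArcFace loss plus a weighted center loss; `lam` is a hypothetical weight."""
    return arcface_loss(embedding, class_weights, label) + lam * center_loss(
        embedding, centers, label
    )
```

The angular margin pushes classes apart on the unit hypersphere, while the center term pulls each embedding toward its class center; how the two terms are weighted in the paper is not stated in the abstract.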


2016 ◽  
Author(s):  
Michael P. Pound ◽  
Alexandra J. Burgess ◽  
Michael H. Wilson ◽  
Jonathan A. Atkinson ◽  
Marcus Griffiths ◽  
...  

Deep learning is an emerging field that promises unparalleled results on many data analysis problems. We show the success offered by such techniques when applied to the challenging problem of image-based plant phenotyping, and demonstrate state-of-the-art results for root and shoot feature identification and localisation. We predict a paradigm shift in image-based phenotyping thanks to deep learning approaches.


2021 ◽  
pp. 503-514
Author(s):  
Luis-Roberto Jácome-Galarza ◽  
Miguel-Andrés Realpe-Robalino ◽  
Jonathan Paillacho-Corredores ◽  
José-Leonardo Benavides-Maldonado

2021 ◽  
Vol 11 (13) ◽  
pp. 6025
Author(s):  
Han Xie ◽  
Wenqi Zheng ◽  
Hyunchul Shin

Although many deep-learning-based methods have achieved considerable detection performance for pedestrians with high visibility, their overall performance is still far from satisfactory, especially when heavily occluded instances are included. In this research, we have developed a novel pedestrian detector using a deformable attention-guided network (DAGN). Considering that pedestrians may be deformed by occlusions or appear in diverse poses, we have designed a deformable convolution with an attention module (DCAM) to sample from non-rigid locations, and obtained the attention feature map by aggregating global context information. Furthermore, the loss function was optimized to get accurate detection bounding boxes by adopting the complete-IoU loss for regression, and distance-IoU NMS was used to refine the predicted boxes. Finally, a preprocessing technique based on tone mapping was applied to cope with low-visibility cases due to poor illumination. Extensive evaluations were conducted on three popular traffic datasets. Our method could decrease the log-average miss rate (MR⁻²) by 12.44% and 7.8%, respectively, for the heavy occlusion and overall cases, when compared to the published state-of-the-art results on the Caltech pedestrian dataset. On the CityPersons and EuroCity Persons datasets, our proposed method outperformed the current best results by about 5% in MR⁻² for the heavy occlusion cases.
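
For readers unfamiliar with the complete-IoU regression loss mentioned above, the following minimal Python sketch computes it for a single pair of axis-aligned boxes, following the commonly published formulation (IoU, a normalized center-distance penalty, and an aspect-ratio consistency term). The `(x1, y1, x2, y2)` box format, the function name, and the `eps` stabilizer are assumptions; the abstract does not describe how the authors implemented the loss.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Complete-IoU loss between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from intersection and union areas.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest box enclosing both (the distance-IoU penalty).
    center_dist2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + (
        (ay1 + ay2) / 2 - (by1 + by2) / 2
    ) ** 2
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    diag2 = enc_w ** 2 + enc_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    angle_a = math.atan((ax2 - ax1) / (ay2 - ay1 + eps))
    angle_b = math.atan((bx2 - bx1) / (by2 - by1 + eps))
    v = (4 / math.pi ** 2) * (angle_b - angle_a) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + center_dist2 / diag2 + alpha * v
```

For two disjoint unit squares whose centers are two units apart horizontally, the loss is 1 (no overlap) plus 4/10 for the center-distance penalty, with no aspect-ratio penalty because the shapes match. In the published distance-IoU formulation, the same normalized center-distance term is subtracted from the IoU when deciding which overlapping boxes to suppress during NMS.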

