Neural Attention-driven Non-Maximum Suppression for Person Detection

2021 ◽  
Author(s):  
Charalampos Symeonidis ◽  
Ioannis Mademlis ◽  
Ioannis Pitas ◽  
Nikos Nikolaidis

Non-maximum suppression (NMS) is a post-processing step in almost every visual object detector. NMS aims to prune the number of overlapping detected candidate regions-of-interest (ROIs) on an image, in order to assign a single, spatially accurate detection to each object. The default NMS algorithm (GreedyNMS) is fairly simple and suffers from severe drawbacks, due to its need for manual tuning. A typical case of failure with high application relevance is pedestrian/person detection in dense human crowds, where GreedyNMS does not provide accurate results. This paper proposes an efficient deep neural architecture for NMS in the person detection scenario, which captures relations between neighbouring ROIs and aims to assign precisely one detection per person. The presented Seq2Seq-NMS architecture assumes a sequence-to-sequence formulation of the NMS problem, exploits the Multi-head Scaled Dot-Product Attention mechanism and jointly processes both geometric and visual properties of the input candidate ROIs. Thorough experimental evaluation on three public person detection datasets shows favourable results against competing methods, with acceptable inference runtime requirements and good behaviour for large numbers of raw candidate ROIs per image.
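
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard greedy NMS procedure that the abstract calls GreedyNMS. It is not the authors' Seq2Seq-NMS method. The box layout `(x1, y1, x2, y2)`, the function names and the default threshold of 0.5 are assumptions made for illustration; the threshold is the manually tuned parameter the abstract refers to.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A candidate is kept only if its IoU with every already-kept
    (higher-scoring) box does not exceed the hand-tuned threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping candidates and one distant candidate.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept_default = greedy_nms(boxes, scores)        # second box is suppressed
kept_lenient = greedy_nms(boxes, scores, 0.7)   # second box survives
```

The two calls show the sensitivity to the threshold: if the first two boxes were two different people standing close together, the default setting would discard one of them, which is the crowd failure case the abstract describes.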


2020 ◽  
Vol 8 ◽  
pp. 126-137
Author(s):  
Kieran Greer

One of the most fundamental questions in Biology or Artificial Intelligence is how the human brain performs mathematical functions. How does a neural architecture that may organise itself mostly through statistics, know what to do? One possibility is to extract the problem to something more abstract. This becomes clear when thinking about how the brain handles large numbers, for example to the power of something, when simply summing to an answer is not feasible. In this paper, the author suggests that the maths question can be answered more easily if the problem is changed into one of symbol manipulation and not just number counting. If symbols can be compared and manipulated, maybe without understanding completely what they are, then the mathematical operations become relative and some of them might even be rote learned. The proposed system may also be suggested as an alternative to the traditional computer binary system. Any of the actual maths still breaks down into binary operations, while a more symbolic level above that can manipulate the numbers and reduce the problem size, thus making the binary operations simpler. An interesting result of looking at this is the possibility of a new fractal equation resulting from division, that can be used as a measure of good fit and would help the brain decide how to solve something through self-replacement and a comparison with this good fit.


2016 ◽  
Vol 28 (5) ◽  
pp. 680-692 ◽  
Author(s):  
Daria Proklova ◽  
Daniel Kaiser ◽  
Marius V. Peelen

Objects belonging to different categories evoke reliably different fMRI activity patterns in human occipitotemporal cortex, with the most prominent distinction being that between animate and inanimate objects. An unresolved question is whether these categorical distinctions reflect category-associated visual properties of objects or whether they genuinely reflect object category. Here, we addressed this question by measuring fMRI responses to animate and inanimate objects that were closely matched for shape and low-level visual features. Univariate contrasts revealed animate- and inanimate-preferring regions in ventral and lateral temporal cortex even for individually matched object pairs (e.g., snake–rope). Using representational similarity analysis, we mapped out brain regions in which the pairwise dissimilarity of multivoxel activity patterns (neural dissimilarity) was predicted by the objects' pairwise visual dissimilarity and/or their categorical dissimilarity. Visual dissimilarity was measured as the time it took participants to find a unique target among identical distractors in three visual search experiments, where we separately quantified overall dissimilarity, outline dissimilarity, and texture dissimilarity. All three visual dissimilarity structures predicted neural dissimilarity in regions of visual cortex. Interestingly, these analyses revealed several clusters in which categorical dissimilarity predicted neural dissimilarity after regressing out visual dissimilarity. Together, these results suggest that the animate–inanimate organization of human visual cortex is not fully explained by differences in the characteristic shape or texture properties of animals and inanimate objects. Instead, representations of visual object properties and object category may coexist in more anterior parts of the visual system.
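
The phrase "after regressing out visual dissimilarity" can be illustrated with a small sketch. It assumes each dissimilarity structure has been flattened into one value per object pair, and it uses a partial correlation (visual dissimilarity regressed out of both the neural and the categorical vector) as one common way to do this; the abstract does not specify the exact procedure, and the function names and toy numbers below are hypothetical.

```python
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def residuals(y: Sequence[float], x: Sequence[float]) -> List[float]:
    """Residuals of y after least-squares regression on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


def category_beyond_visual(neural: Sequence[float],
                           visual: Sequence[float],
                           category: Sequence[float]) -> float:
    """Partial correlation of neural and categorical dissimilarity,
    controlling for visual dissimilarity."""
    return pearson(residuals(neural, visual), residuals(category, visual))


# Toy data: one entry per object pair (hypothetical values).
visual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # e.g. visual search dissimilarity
category = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]    # 1 = pair crosses the animate/inanimate boundary
neural = [2 * v + 3 * c for v, c in zip(visual, category)]

r_partial = category_beyond_visual(neural, visual, category)
```

In this constructed example the neural vector is an exact linear mix of the other two, so the category term explains all variance that visual dissimilarity leaves over. Real data would give a much weaker value that needs a significance test.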


2012 ◽  
Vol 2012 ◽  
pp. 1-14 ◽  
Author(s):  
Hongyu Hu ◽  
Zhaowei Qu ◽  
Zhihui Li ◽  
Jinhui Hu ◽  
Fulu Wei

A fast pedestrian recognition algorithm based on multisensor fusion is presented in this paper. First, potential pedestrian locations are estimated in world coordinates by laser radar scanning, and their corresponding candidate regions in the image are then located by camera calibration and the perspective mapping model. To avoid the time cost that high-dimensional feature vectors impose on training and recognition, a region of interest-based integral histograms of oriented gradients (ROI-IHOG) feature extraction method is then proposed. A support vector machine (SVM) classifier is trained for online recognition on a novel pedestrian sample dataset adapted to the urban road environment. Finally, we test the validity of the proposed approach on several video sequences from realistic urban road scenarios. The multisensor fusion method shows reliable and time-efficient performance.
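
The speed argument behind integral histograms can be sketched as follows. This is a generic integral histogram, not the paper's ROI-IHOG method: it assumes each pixel has already been assigned an orientation bin index and counts pixels per bin, whereas HOG pipelines typically weight votes by gradient magnitude. All names are hypothetical.

```python
from typing import List


def integral_histogram(bins: List[List[int]], n_bins: int) -> List[List[List[int]]]:
    """Build ih where ih[y][x][b] counts pixels of bin b in rows < y and columns < x."""
    height, width = len(bins), len(bins[0])
    ih = [[[0] * n_bins for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            cell = ih[y + 1][x + 1]
            for b in range(n_bins):
                # Standard integral-image recurrence, applied per bin.
                cell[b] = ih[y][x + 1][b] + ih[y + 1][x][b] - ih[y][x][b]
            cell[bins[y][x]] += 1
    return ih


def region_histogram(ih: List[List[List[int]]],
                     top: int, left: int, bottom: int, right: int) -> List[int]:
    """Histogram of rows top..bottom-1 and columns left..right-1.

    Uses four lookups per bin, independent of the region size.
    """
    n_bins = len(ih[0][0])
    return [ih[bottom][right][b] - ih[top][right][b]
            - ih[bottom][left][b] + ih[top][left][b]
            for b in range(n_bins)]


# Toy 3x3 map of orientation-bin indices (3 bins).
orientation_bins = [
    [0, 1, 1],
    [2, 0, 1],
    [2, 2, 0],
]
ih = integral_histogram(orientation_bins, n_bins=3)
whole_image = region_histogram(ih, 0, 0, 3, 3)
upper_right = region_histogram(ih, 0, 1, 2, 3)
```

Once the table is built, the histogram of any candidate region costs the same small number of lookups, which is why restricting feature extraction to laser-derived regions of interest can be fast.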


Sci ◽  
2020 ◽  
Vol 2 (1) ◽  
pp. 13 ◽  
Author(s):  
Marios Zachariou ◽  
Neofytos Dimitriou ◽  
Ognjen Arandjelović

In this paper, our goal is to perform a virtual restoration of an ancient coin from its image. The present work is the first one to propose this problem, and it is motivated by two key promising applications. The first of these emerges from the recently recognised dependence of automatic image-based coin type matching on the condition of the imaged coins; the algorithm introduced herein could be used as a pre-processing step, aimed at overcoming the aforementioned weakness. The second application concerns the utility both to professional and hobby numismatists of being able to visualise and study an ancient coin in a state closer to its original (minted) appearance. To address the conceptual problem at hand, we introduce a framework which comprises a deep learning-based method using Generative Adversarial Networks, capable of learning the range of appearance variation of different semantic elements artistically depicted on coins, and a complementary algorithm used to collect, correctly label, and prepare for processing the large number of images (here 100,000) of ancient coins needed to facilitate the training of the aforementioned learning method. Empirical evaluation performed on a withheld subset of the data demonstrates extremely promising performance of the proposed methodology and shows that our algorithm correctly learns the spectra of appearance variation across different semantic elements, and despite the enormous variability present reconstructs the missing (damaged) detail while matching the surrounding semantic content and artistic style.


2021 ◽  
Vol 2135 (1) ◽  
pp. 012001
Author(s):  
Angie Alonso ◽  
Andres Peña ◽  
Fredy Martínez

The rapid spread of the SARS-CoV-2 virus has highlighted many social interaction problems that favor the spread of disease, particularly airborne spread, which can be addressed by adjusting existing systems. Of particular interest are places where large numbers of people interact, as they become a focus for the spread of these diseases. This paper proposes and evaluates an autonomous identification scheme for certain surfaces considered high risk due to their continuous handling. These high-contact surfaces can be identified by an autonomous system so that specific cleaning tasks can be applied to them. We evaluate three convolutional models on a proprietary dataset with a total of 2000 images ranging from wall switches to water dispensers. The objective is to identify the ideal architecture for the system. The ResNet (Residual Neural Network), DenseNet (Dense Convolutional Network), and NASNet (Neural Architecture Search Network) models were selected due to their high performance reported in the literature. The models are evaluated with metrics specialized for non-binary classification problems, and the best scheme is selected for prototype development.


Sci ◽  
2020 ◽  
Vol 2 (3) ◽  
pp. 52
Author(s):  
Marios Zachariou ◽  
Neofytos Dimitriou ◽  
Ognjen Arandjelović

In this paper, our goal is to perform a virtual restoration of an ancient coin from its image. The present work is the first one to propose this problem, and it is motivated by two key promising applications. The first of these emerges from the recently recognised dependence of automatic image-based coin type matching on the condition of the imaged coins; the algorithm introduced herein could be used as a pre-processing step, aimed at overcoming the aforementioned weakness. The second application concerns the utility both to professional and hobby numismatists of being able to visualise and study an ancient coin in a state closer to its original (minted) appearance. To address the conceptual problem at hand, we introduce a framework which comprises a deep learning-based method using Generative Adversarial Networks, capable of learning the range of appearance variation of different semantic elements artistically depicted on coins, and a complementary algorithm used to collect, correctly label, and prepare for processing the large number of images (here 100,000) of ancient coins needed to facilitate the training of the aforementioned learning method. Empirical evaluation performed on a withheld subset of the data demonstrates extremely promising performance of the proposed methodology and shows that our algorithm correctly learns the spectra of appearance variation across different semantic elements, and despite the enormous variability present reconstructs the missing (damaged) detail while matching the surrounding semantic content and artistic style.


2002 ◽  
Vol 712 ◽  
Author(s):  
Robert H. Tykot

Geochemical fingerprinting of obsidian sources was first applied in the Mediterranean region nearly four decades ago. Since then, a number of analytical methods (e.g. INAA, XRF, SEM/Microprobe, ICP-MS) have proven successful in distinguishing the Mediterranean island sources of Giali, Lipari, Melos, Palmarola, Pantelleria, and Sardinia. Moreover, recent geoarchaeological surveys of the central Mediterranean sources have resulted in the more precise location and documentation of each obsidian flow or outcrop, and multiple chemical groups have been identified on at least three of the islands. The ability to specifically attribute artifacts to one of at least five obsidian flows in the Monte Arci region of Sardinia, for example, has enabled the study of specific patterns of source exploitation and the trade mechanisms which resulted in the distribution of obsidian hundreds of kilometers away during the Neolithic period (ca. 6000-3000 BC). Results are presented here from the chemical analysis of 186 artifacts from several sites in Sardinia and Corsica as part of the largest and most comprehensive study of obsidian sources and trade in the Mediterranean. Analyses of large numbers of artifacts demonstrate the differential use of island subsources, which may be attributed to factors such as access (e.g. topography, distance from coast), size and frequency of nodules, and mechanical and visual properties. The patterns of source exploitation revealed by this study specifically support a down-the-line model of obsidian trade during the Neolithic period. In addition, the spatial and chronological patterns of obsidian distribution may be used to address such issues as the colonization of the islands; the introduction of Neolithic economies; and the increasing social complexity of Neolithic and Bronze Age societies in the central Mediterranean.


Author(s):  
T. G. Merrill ◽  
B. J. Payne ◽  
A. J. Tousimis

Rats given SK&F 14336-D (9-[3-Dimethylamino propyl]-2-chloroacridane), a tranquilizing drug, developed an increased number of vacuolated lymphocytes as observed by light microscopy. Vacuoles in peripheral blood of rats and humans apparently are rare and are not usually reported in differential counts. Transforming agents such as phytohemagglutinin and pokeweed mitogen induce similar vacuoles in in vitro cultures of lymphocytes. These vacuoles have also been reported in some of the lipid-storage diseases of humans such as amaurotic familial idiocy, familial neurovisceral lipidosis, lipomucopolysaccharidosis and sphingomyelinosis. Electron microscopic studies of Tay-Sachs' disease and of chloroquine treated swine have demonstrated large numbers of “membranous cytoplasmic granules” in the cytoplasm of neurons, in addition to lymphocytes. The present study was undertaken with the purpose of characterizing the membranous inclusions and developing an experimental animal model which may be used for the study of lipid storage diseases.

