Learning to share visual appearance for multiclass object detection

Author(s):  
Ruslan Salakhutdinov ◽  
Antonio Torralba ◽  
Josh Tenenbaum
2021 ◽  
Vol 13 (13) ◽  
pp. 2623
Author(s):  
Yongbin Zheng ◽  
Peng Sun ◽  
Zongtan Zhou ◽  
Wanying Xu ◽  
Qiang Ren

The detection of arbitrary-oriented and multi-scale objects in satellite optical imagery is an important task in remote sensing and computer vision. Despite significant research efforts, such detection remains largely unsolved due to the diversity of patterns in orientation, scale, aspect ratio, and visual appearance; the dense distribution of objects; and extreme imbalances in categories. In this paper, we propose an adaptive dynamic refined single-stage transformer detector (ADT-Det) to address these challenges, aiming to achieve high recall and speed. Our detector realizes rotated object detection with RetinaNet as the baseline. Firstly, we propose a feature pyramid transformer (FPT) to enhance feature extraction of the rotated object detection framework through a feature interaction mechanism. This is beneficial for the detection of objects with diverse patterns in terms of scale, aspect ratio, visual appearance, and dense distributions. Secondly, we design two special post-processing steps for rotated objects with arbitrary orientations, large aspect ratios, and dense distributions. The output features of the FPT are fed into these post-processing steps. The first step performs a preliminary regression of locations and angle anchors for the refinement step. The refinement step first performs adaptive feature refinement and then produces the final object detection result. The main architecture of the refinement step is dynamic feature refinement (DFR), which adaptively adjusts the feature map and reconstructs a new feature map for arbitrary-oriented object detection to alleviate the mismatches between rotated bounding boxes and axis-aligned receptive fields. Thirdly, the focal loss is adopted to deal with the category imbalance problem. Experiments on two challenging public satellite optical imagery datasets, DOTA and HRSC2016, demonstrate that the proposed ADT-Det detector achieves state-of-the-art detection accuracy (79.95% mAP for DOTA and 93.47% mAP for HRSC2016) while running very fast (14.6 fps with a 600 × 600 input image size).
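
The loss used against category imbalance can be illustrated with a minimal sketch, assuming the abstract refers to the focal loss commonly paired with RetinaNet. The function names and the defaults `alpha=0.25` and `gamma=2.0` are illustrative assumptions and are not taken from the abstract.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Binary cross-entropy for one prediction (p = predicted P(class 1))."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(max(p_t, eps))


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    alpha and gamma are assumed defaults, not values from the abstract.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (p = 0.9) is down-weighted far more than a hard one (p = 0.1).
easy_ratio = focal_loss(0.9, 1) / cross_entropy(0.9, 1)
hard_ratio = focal_loss(0.1, 1) / cross_entropy(0.1, 1)
print(easy_ratio, hard_ratio)
```

The modulating factor is what lets many easy background examples contribute little to the total loss, so rare categories are not drowned out.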


Author(s):  
Konstantin A. Elshin ◽  
Elena I. Molchanova ◽  
Marina V. Usoltseva ◽  
Yelena V. Likhoshway

Using the TensorFlow Object Detection API, an approach to identifying and registering the Baikal diatom species Synedra acus subsp. radians was tested. A set of images was assembled and a model was trained on it. After 15,000 training iterations, the total loss reached 0.04. The classification accuracy was 95%, and the accuracy of bounding-box construction was also 95%.


2010 ◽  
Vol 130 (9) ◽  
pp. 1572-1580
Author(s):  
Dipankar Das ◽  
Yoshinori Kobayashi ◽  
Yoshinori Kuno

2020 ◽  
Vol 2020 (16) ◽  
pp. 41-1-41-7
Author(s):  
Orit Skorka ◽  
Paul J. Kane

Many of the metrics developed for informational imaging are useful in automotive imaging, since many of the tasks – for example, object detection and identification – are similar. This work discusses sensor characterization parameters for the Ideal Observer SNR model, and elaborates on the noise power spectrum. It presents cross-correlation analysis results for matched-filter detection of a tribar pattern in sets of resolution target images that were captured with three image sensors over a range of illumination levels. Lastly, the work compares the cross-correlation data to predictions made by the Ideal Observer Model and demonstrates good agreement between the two methods on relative evaluation of detection capabilities.
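
The idea of locating a tribar pattern by cross-correlation can be sketched in one dimension. This is a simplified illustration and not the authors' procedure: the normalized form of the correlation, the 1D signal, and all names and values below are assumptions.

```python
import math


def normalized_cross_correlation(signal, template):
    """Slide the template over the signal and return one score per offset.

    Each score is the zero-mean normalized correlation, in [-1, 1].
    """
    n = len(template)
    t_mean = sum(template) / n
    t_zero = [t - t_mean for t in template]
    t_norm = math.sqrt(sum(t * t for t in t_zero))
    scores = []
    for start in range(len(signal) - n + 1):
        window = signal[start:start + n]
        w_mean = sum(window) / n
        w_zero = [w - w_mean for w in window]
        w_norm = math.sqrt(sum(w * w for w in w_zero))
        if w_norm == 0.0 or t_norm == 0.0:
            scores.append(0.0)  # flat window: correlation undefined, use 0
        else:
            dot = sum(a * b for a, b in zip(w_zero, t_zero))
            scores.append(dot / (w_norm * t_norm))
    return scores


# Hypothetical tribar profile: three bright bars separated by dark gaps.
tribar = [1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
# The same pattern at lower contrast with an offset, embedded at position 5.
signal = [0.0] * 5 + [0.2 + 0.6 * v for v in tribar] + [0.0] * 5

scores = normalized_cross_correlation(signal, tribar)
best_offset = max(range(len(scores)), key=scores.__getitem__)
print(best_offset, scores[best_offset])
```

Normalizing each window makes the score insensitive to the brightness and contrast changes that come with different illumination levels; an unnormalized matched filter would instead scale with signal amplitude.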


2017 ◽  
Vol 2 (1) ◽  
pp. 80-87
Author(s):  
Puyda V. ◽  
Stoian A.

Detecting objects in a video stream is a typical problem in modern computer vision systems used in many areas. Object detection can be performed both on static images and on frames of a video stream. Essentially, object detection means finding color and intensity non-uniformities that can be treated as physical objects. In addition, the coordinates, size, and other characteristics of these non-uniformities can be computed and used to solve related computer vision problems such as object identification. In this paper, we study three algorithms that can be used to detect objects of different nature and are based on different approaches: detection of color non-uniformities, frame difference, and feature detection. As input data, we use a video stream obtained from a video camera or from an mp4 video file. Simulation and testing of the algorithms were done on a universal computer based on open-source hardware, built on the Broadcom BCM2711 quad-core Cortex-A72 (ARM v8) 64-bit SoC with a clock frequency of 1.5 GHz. The software was created in Visual Studio 2019 using OpenCV 4 on Windows 10 and on a universal computer running Linux (Raspbian Buster OS) on the open-source hardware. The methods under consideration are compared in the paper. The results can be used in research and development of modern computer vision systems for different purposes. Keywords: object detection, feature points, keypoints, ORB detector, computer vision, motion detection, HSV color model
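
Of the three approaches, frame difference is the simplest to illustrate. The following is a minimal dependency-free sketch on tiny grayscale frames stored as nested lists; it is not the paper's implementation, and the function names, the threshold of 25, and the synthetic frames are assumptions. In OpenCV, the same steps would typically use `cv2.absdiff` followed by `cv2.threshold`.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose intensity changed by more than the threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (left, top, right, bottom) of all changed pixels, or None."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(cols), min(rows), max(cols), max(rows))


def make_frame(height, width, block_left, block_top, size=2,
               background=10, foreground=200):
    """Build a synthetic frame with one bright square on a dark background."""
    frame = [[background] * width for _ in range(height)]
    for r in range(block_top, block_top + size):
        for c in range(block_left, block_left + size):
            frame[r][c] = foreground
    return frame


# A 2x2 bright block moves two pixels to the right between frames.
prev_frame = make_frame(6, 6, block_left=1, block_top=2)
curr_frame = make_frame(6, 6, block_left=3, block_top=2)

mask = frame_difference_mask(prev_frame, curr_frame)
print(bounding_box(mask))
```

The resulting box covers both the old and the new position of the block, which is a known property of plain two-frame differencing.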


Author(s):  
Ashish Dwivedi ◽  
Nirupma Tiwari

Image enhancement (IE) is very important in fields where the visual appearance of an image is the main concern. Image enhancement is the process of improving an image so that the output image is more suitable than the original for a specific task. It improves image quality so that images are clearer for human perception or for further analysis by machines. Image enhancement methods improve the quality, visual appearance, and clarity of images, remove blurring and noise, increase contrast, and reveal details. The aim of this paper is to study and determine the limitations of existing IE techniques and to provide an overview of the techniques commonly used. We applied the DWT to the original RGB image and then applied FHE (Fuzzy Histogram Equalization). After the DWT, we performed wavelet shrinkage on three bands (LH, HL, HH). We then fused the shrinkage image and the FHE image to obtain the enhanced image.
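
Wavelet shrinkage of the detail bands can be sketched in one dimension. The abstract does not name the wavelet or the thresholding rule, so the single-level Haar transform and soft thresholding below are assumptions, as are all function names; a 2D image version would apply the same idea to the LH, HL, and HH bands.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """Single-level Haar DWT of an even-length sequence."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in coeffs]


def shrink(x, t):
    """Shrink only the detail band, keep the approximation band unchanged."""
    approx, detail = haar_forward(x)
    return haar_inverse(approx, soft_threshold(detail, t))


noisy = [4.0, 6.0, 10.0, 10.0]
print(shrink(noisy, 0.0))    # threshold 0 leaves the signal unchanged
print(shrink(noisy, 100.0))  # a large threshold averages each pair
```

Leaving the approximation band untouched preserves the coarse image content, while thresholding the detail coefficients suppresses small fluctuations that are more likely to be noise.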

