Position-Aware Recalibration Module: Learning From Feature Semantics and Feature Position

Author(s):  
Xu Ma ◽  
Song Fu

We present a new method to improve the representational power of the features in Convolutional Neural Networks (CNNs). By studying traditional image processing methods and recent CNN architectures, we propose to use positional information in CNNs for effective exploration of feature dependencies. Rather than considering feature semantics alone, we incorporate spatial positions as an augmentation for feature semantics in our design. From this vantage, we present a Position-Aware Recalibration Module (PRM for short) which recalibrates features leveraging both feature semantics and position. Furthermore, inspired by multi-head attention, our module is capable of performing multiple recalibrations whose results are concatenated as the output. As PRM is efficient and easy to implement, it can be seamlessly integrated into various base networks and applied to many position-aware visual tasks. Compared to the original CNNs, our PRM introduces a negligible number of parameters and FLOPs while yielding better performance. Experimental results on the ImageNet and MS COCO benchmarks show that our approach surpasses related methods by a clear margin with less computational overhead. For example, we improve ResNet50 by an absolute 1.75% (77.65% vs. 75.90%) on the ImageNet 2012 validation dataset, and by 1.5–1.9% mAP on the MS COCO validation dataset, with almost no computational overhead. Code is publicly available.
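The abstract does not spell out the PRM formulation, so the following is only a hypothetical sketch of the idea it describes: split channels into heads, gate each head with a signal combining channel statistics (semantics) and normalized spatial coordinates (position), and concatenate the recalibrated heads. All names and the exact gating rule are assumptions, not the paper's method.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def position_aware_recalibrate(x, num_heads=2):
    """Hypothetical sketch of multi-head, position-aware recalibration.

    x: feature map of shape (C, H, W). Channels are split into
    `num_heads` groups; each group is gated by a weight built from
    its channel means (semantics) plus a normalized coordinate grid
    (position); the gated groups are concatenated back together.
    """
    c, h, w = x.shape
    assert c % num_heads == 0, "channels must divide evenly across heads"
    # Normalized coordinate grids in [0, 1] act as the positional signal.
    ys, xs = np.meshgrid(np.linspace(0, 1, h), np.linspace(0, 1, w), indexing="ij")
    pos = ys + xs                                  # (H, W)
    outputs = []
    for head in np.split(x, num_heads, axis=0):    # each (C/heads, H, W)
        sem = head.mean(axis=(1, 2), keepdims=True)  # channel semantics, (C/heads, 1, 1)
        gate = sigmoid(sem + pos)                  # broadcast to (C/heads, H, W)
        outputs.append(head * gate)
    return np.concatenate(outputs, axis=0)         # same shape as the input
```

Because the gate lies in (0, 1), the module rescales rather than redirects features, which is consistent with the abstract's claim of negligible parameter and FLOP overhead.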

2020 ◽  
pp. 15-21
Author(s):  
R. N. Kvetny ◽  
R. V. Masliy ◽  
A. M. Kyrylenko ◽  
V. V. Shcherba

The article is devoted to the study of object detection in images using neural networks. The structure of convolutional neural networks used for image processing is considered. The formation of the convolutional layer (Fig. 1), the sub-sampling layer (Fig. 2), and the fully connected layer (Fig. 3) is described in detail. An overview is given of popular high-performance convolutional neural network architectures used for object detection: R-FCN, YOLO, Faster R-CNN, SSD, and DetectNet. The basic stages of image processing by the DetectNet neural network, which is designed to detect objects in images, are discussed. NVIDIA DIGITS was used to create and train models, and several DetectNet models were trained in this environment. The parameters of the experiments (Table 1) and a comparison of the quality of the trained models (Table 2) are presented. As training and validation data, we used images from the KITTI dataset, which was created to advance self-driving systems that rely on embedded devices, one of which could be the Jetson TX2. KITTI's images feature several object classes, including cars and pedestrians. Model training and testing were performed on an NVIDIA Jetson TX2 embedded computer. Five models were trained that differed in the base learning rate parameter. The results make it possible to find a compromise value of the base learning rate that quickly yields a model with a high mAP. The best model achieves mAP = 57.8% on the KITTI validation dataset.
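The mAP scores compared in Table 2 rest on matching predicted boxes to ground-truth boxes by intersection-over-union (IoU); a detection counts as correct only if its IoU with a ground-truth box exceeds a threshold. A minimal, dependency-free IoU for axis-aligned boxes could look like this (the `(x1, y1, x2, y2)` corner convention is an assumption):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For reference, the standard KITTI protocol requires IoU ≥ 0.7 for cars and ≥ 0.5 for pedestrians, though the abstract does not state which thresholds were used here.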


Author(s):  
Y.A. Hamad ◽  
K.V. Simonov ◽  
A.S. Kents

The paper considers general approaches to image processing, analysis of visual data, and computer vision. The main methods for detecting features and edges associated with these approaches are presented. A brief description of modern edge detection and classification algorithms suitable for isolating and characterizing lung pathologies in medical images is also given.
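A classical representative of the edge detection methods the paper surveys is the Sobel operator: two 3x3 kernels estimate the horizontal and vertical intensity gradients, and their magnitude marks edges. A small illustrative sketch (not taken from the paper itself):

```python
import numpy as np

def sobel_edges(img):
    """Edge magnitude via 3x3 Sobel kernels (valid region only).

    img: 2-D grayscale array of shape (H, W). Returns the gradient
    magnitude map of shape (H-2, W-2).
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T  # the vertical-gradient kernel is the transpose
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return np.hypot(gx, gy)  # per-pixel gradient magnitude
```

In medical imaging pipelines this kind of gradient map typically feeds a thresholding or classification stage that delineates the region of interest.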


2021 ◽  
Vol 26 (1) ◽  
pp. 200-215
Author(s):  
Muhammad Alam ◽  
Jian-Feng Wang ◽  
Cong Guangpei ◽  
LV Yunrong ◽  
Yuanfang Chen

In recent years, the success of deep learning in natural scene image processing has boosted its application to the analysis of remote sensing images. In this paper, we apply Convolutional Neural Networks (CNNs) to the semantic segmentation of remote sensing images. We improve the encoder-decoder CNN structures SegNet (with index pooling) and U-Net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that the two models have their own advantages and disadvantages in segmenting different objects. In addition, we propose an integrated algorithm that combines the two models. Experimental results show that the integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieves better segmentation than either model alone.
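The abstract does not specify how the two models' outputs are integrated; one plausible scheme, shown purely as a sketch, is a per-pixel weighted average of the two networks' class-probability maps followed by an argmax (the function name and weighting are assumptions):

```python
import numpy as np

def fuse_segmentations(probs_a, probs_b, weight_a=0.5):
    """Fuse two models' per-class probability maps into one label map.

    probs_a, probs_b: arrays of shape (num_classes, H, W) holding
    softmax probabilities from the two segmentation models. The maps
    are averaged per pixel with weight `weight_a` on the first model,
    and the highest-probability class wins at each location.
    """
    fused = weight_a * probs_a + (1.0 - weight_a) * probs_b
    return fused.argmax(axis=0)  # (H, W) integer label map
```

Tuning `weight_a` per class would let the ensemble lean on whichever model segments a given object type better, matching the abstract's observation that each model has its own strengths.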


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4867
Author(s):  
Lu Chen ◽  
Hongjun Wang ◽  
Xianghao Meng

With the development of science and technology, neural networks, as an effective tool in image processing, play an increasingly important role in remote-sensing image processing. However, training neural networks requires a large sample database, so expanding datasets with limited samples has gradually become a research hotspot. The emergence of the generative adversarial network (GAN) provides new ideas for data expansion. Traditional GANs either require a large amount of input data or lack detail in the generated pictures. In this paper, we modify a shuffle attention network and introduce it into a GAN to generate higher-quality pictures from limited inputs. In addition, we improve the existing resize method and propose an equal stretch resize method to solve the problem of image distortion caused by different input sizes. In the experiments, we also embed the newly proposed coordinate attention (CA) module into the backbone network as a control test. Qualitative indexes and six quantitative evaluation indexes were used to evaluate the experimental results, which show that, compared with other GANs used for picture generation, the modified Shuffle Attention GAN proposed in this paper can generate more refined, high-quality, and diversified aircraft pictures with more detailed object features under limited datasets.
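The abstract does not define "equal stretch resize", but the distortion problem it targets (inputs of different aspect ratios squeezed into one square) is commonly solved by scaling both dimensions by the same factor and padding the remainder. The sketch below shows that letterbox-style interpretation only as an assumption, with nearest-neighbour sampling to stay dependency-free:

```python
import numpy as np

def scale_and_pad(img, size):
    """Resize with a single scale factor, then zero-pad to a square.

    Both dimensions are stretched equally (preserving aspect ratio,
    hence no distortion) so the longer side fits `size`, and the image
    is centred on a zero canvas. Nearest-neighbour index lookup stands
    in for a proper interpolation kernel.
    """
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = max(1, round(h * scale)), max(1, round(w * scale))
    # Nearest-neighbour source indices for each target row/column.
    rows = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[rows][:, cols]
    out = np.zeros((size, size) + img.shape[2:], dtype=img.dtype)
    top, left = (size - nh) // 2, (size - nw) // 2
    out[top:top + nh, left:left + nw] = resized
    return out
```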


Author(s):  
Rubina Sarki ◽  
Khandakar Ahmed ◽  
Hua Wang ◽  
Yanchun Zhang ◽  
Jiangang Ma ◽  
...  

Diabetic eye disease (DED) is a cluster of eye problems that affects diabetic patients. Identifying DED in retinal fundus images is crucial because early diagnosis and treatment can minimize the risk of visual impairment. The retinal fundus image therefore plays a significant role in early DED classification and identification. The development of an accurate diagnostic model from retinal fundus images depends highly on image quality and quantity. This paper presents a methodical study on the significance of image processing for DED classification. The proposed automated classification framework for DED proceeds in several steps: image quality enhancement, image segmentation (region of interest), image augmentation (geometric transformation), and classification. The best results were obtained using traditional image processing methods with a newly built convolutional neural network (CNN) architecture; this combination delivered the highest accuracy on the DED classification problem. The experiments conducted showed adequate accuracy, specificity, and sensitivity.
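The augmentation step in the framework applies geometric transformations to multiply the limited fundus-image data. As a sketch of what that stage might look like (the exact transforms used are not stated in the abstract and are an assumption here):

```python
import numpy as np

def geometric_augment(img):
    """Generate simple geometric variants of one image.

    Returns the original plus its horizontal flip, vertical flip, and
    90/180/270-degree rotations: a 6x expansion of the dataset. These
    particular transforms are illustrative, not the paper's recipe.
    """
    variants = [img, np.fliplr(img), np.flipud(img)]
    variants += [np.rot90(img, k) for k in (1, 2, 3)]
    return variants
```

Flips and right-angle rotations are popular for fundus images because they leave anatomical content intact while varying orientation, so the class labels stay valid.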

