Deep heterogeneous superpixel neural networks for image analysis and feature extraction

2021 ◽  
Author(s):  
◽  
Zhangwei (Alex) Yang

Deep convolutional neural networks are rapidly improving the accuracy and performance of computer vision and are pushing toward higher-level, interpretable object recognition. Superpixel-based methods have long been used in conventional computer vision research, where their efficient representation is a clear advantage. In contemporary computer vision research driven by deep neural networks, superpixel-based approaches mainly rely on oversegmentation to provide a more efficient representation of the image data, especially when pairwise similarity regularization or complex probabilistic graphical inference would be too expensive in time or memory. In this dissertation, we propose a novel superpixel-enabled deep neural network paradigm that relaxes some of the prior assumptions of conventional superpixel-based methods and explores its capabilities in the context of advanced deep convolutional neural networks. This produces novel neural network architectures that achieve higher-level object relation modeling, weakly supervised segmentation, and high explainability, and that facilitate insightful visualizations. The approach represents the visual signal efficiently and can separate relevant object components from background noise by spatially reorganizing visual features. Specifically, we have created superpixel models that combine graph neural network techniques and multiple-instance learning to achieve weakly supervised object detection and generate precise object bounds without pixel-level training labels. This dissection and the subsequent learning by the architecture promote explainable models, in which human users can see the parts of the objects that led to recognition. Most importantly, the natural result of this neural design goes beyond abstract rectangular bounds of an object occurrence (e.g., a bounding box or image chip) and instead approaches efficient parts-based segmented recognition. It has been tested successfully on commercial remote sensing satellite imagery. Additionally, we have developed highly efficient monocular indoor depth estimation based on superpixel feature extraction. Furthermore, we have demonstrated state-of-the-art weakly supervised object detection performance on two contemporary benchmark data sets, MS-COCO and VOC 2012. In the future, deep learning techniques based on superpixel-enabled image analysis can be further optimized in accuracy and computational performance; it will also be interesting to evaluate them in other research domains, such as those involving medical, infrared, or hyperspectral imagery.

2020 ◽  
Vol 226 ◽  
pp. 02020
Author(s):  
Alexey V. Stadnik ◽  
Pavel S. Sazhin ◽  
Slavomir Hnatic

The performance of neural networks is one of the most important topics in the field of computer vision. In this work, we analyze the speed of object detection using the well-known YOLOv3 neural network architecture in different frameworks and on different hardware. The results allow us to formulate preliminary qualitative conclusions about the feasibility of various hardware scenarios for solving tasks in real-time environments.
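As an illustration of how detection speed might be compared across frameworks or hardware, the following is a minimal timing sketch. It is not the authors' benchmark: `measure_fps`, the `detect` callable, and the `warmup` parameter are hypothetical names, and a real measurement would wrap an actual YOLOv3 inference call.

```python
import time


def measure_fps(detect, frames, warmup=2):
    """Return frames per second for a detector callable (hypothetical helper).

    `detect` is any function taking one frame; `warmup` frames are run first
    and excluded from timing so one-off setup costs do not skew the result.
    """
    for frame in frames[:warmup]:
        detect(frame)
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Stand-in "detector": sums the pixel values of a toy frame.
    toy_frames = [[1, 2, 3]] * 100
    print(f"{measure_fps(sum, toy_frames):.0f} frames/s")
```

Running the same harness with each framework's inference function on each machine gives directly comparable throughput figures.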


Author(s):  
Н.А. Полковникова ◽  
Е.В. Тузинкевич ◽  
А.Н. Попов

The article considers computer vision technologies based on deep convolutional neural networks. Neural networks are particularly effective for problems that are hard to formalize. A convolutional neural network architecture was developed for the recognition and classification of marine objects in images. In the course of the research, a retrospective analysis of computer vision technologies was performed and a number of problems associated with the use of neural networks were identified: the vanishing gradient, overfitting, and computational complexity. To address these problems in the network architecture, the authors propose using the ReLU activation function, training only some randomly selected neurons, and normalization in order to simplify the architecture. The ReLU, LeakyReLU, Exponential ReLU, and Softmax activation functions used in the network were compared in Matlab R2020a. A program for marine object recognition based on the convolutional neural network was implemented in the Visual C# programming language in the MS Visual Studio integrated development environment. The program is designed for automated identification of marine objects; it performs detection (i.e., locating objects in the image) and recognition of objects with a high probability of detection.
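For reference, the activation functions compared in the abstract above can be written out in a few lines. This is a minimal sketch using the standard textbook definitions, not the authors' Matlab code; the default `alpha` values are common choices and are assumptions here. Note that softmax normalizes a whole vector of scores, so it is typically used at the output layer rather than per neuron.

```python
import math


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Like ReLU, but keeps a small slope `alpha` (assumed 0.01) for x <= 0."""
    return x if x > 0 else alpha * x


def elu(x, alpha=1.0):
    """Exponential linear unit: smooth saturation toward -alpha for x <= 0."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def softmax(xs):
    """Map a list of scores to probabilities that sum to 1."""
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in xs]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, relu(x), leaky_relu(x), elu(x))
    print(softmax([1.0, 2.0, 3.0]))
```

Unlike saturating activations, ReLU and its variants have a constant gradient for positive inputs, which is why they are commonly used against the vanishing gradient problem mentioned in the abstract.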


Doklady BGUIR ◽  
2020 ◽  
Vol 18 (2) ◽  
pp. 62-70
Author(s):  
N. A. Iskra

This paper suggests an approach to semantic image analysis for application in computer vision systems. The aim of the work is to develop and study a method for automatic construction of a semantic model that formalizes the spatial relationships between objects in an image. A distinctive feature of this model is the detection of salient objects, thanks to which the construction algorithm analyzes significantly fewer relations between objects; this can greatly reduce the image processing time and the amount of resources spent on processing. Attention is paid to the selection of a neural network algorithm for object detection in an image as a preliminary stage of model construction. Experiments were conducted on test datasets provided by the Visual Genome database, developed by researchers from Stanford University to evaluate object detection algorithms, image captioning models, and other relevant image analysis tasks. When assessing the performance of the model, the accuracy of spatial relation recognition was evaluated. Further experiments on interpreting the resulting model were conducted, namely image annotation, i.e., generating a textual description of the image content. The experimental results were compared with results obtained on the same dataset with neural network algorithms by other researchers, as well as by the author of this paper earlier. An improvement of up to 60% in image captioning quality (according to the METEOR metric) compared with neural network methods has been shown. In addition, the use of this model allows partial cleansing and normalization of data for training neural network architectures, which are widely used in image analysis and elsewhere. The prospects of using this technique in situational monitoring are considered. The disadvantages of this approach are some simplifications in the construction of the model, which will be taken into account in its further development.


Author(s):  
Aleksei Aleksandrovich Rumyantsev ◽  
Farkhad Mansurovich Bikmuratov ◽  
Nikolai Pavlovich Pashin

The subject of this research is medical chest X-ray images. After fundamental pre-processing, the accumulated database of such images can be used for training deep convolutional neural networks, which have become one of the most significant innovations in recent years. The trained network carries out preliminary binary classification of the incoming images and serves as an assistant to the radiotherapist. For this purpose, the neural network must be trained to carefully minimize type I and type II errors. A possible way to improve the effectiveness of neural networks, in terms of reducing computational complexity and improving the quality of image classification, is to use auxiliary approaches: image pre-processing and preliminary calculation of the entropy of image fragments. The article provides an algorithm for X-ray image pre-processing, fragmentation, and calculation of the entropy of separate fragments. In the course of pre-processing, the region of the lungs and spine is selected, which comprises approximately 30-40% of the entire image. The image is then divided into a matrix of fragments, and the entropy of each fragment is calculated in accordance with Shannon's formula based on the analysis of individual pixels. Determining the rate of occurrence of each of the 255 colors allows the total entropy to be calculated. The use of entropy for detecting pathologies is based on the assumption that its values for separate fragments, and the overall picture of its distribution, differ between normal images and images with pathologies. The article analyzes the statistical values: standard deviation of error and dispersion. A fully connected neural network is used to determine the patterns in the distribution of entropy and its statistical characteristics over various fragments of the chest X-ray image.
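The fragment-entropy step described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' algorithm: it assumes a grayscale image given as a list of rows of integer intensities, a regular grid of equal fragments, and Shannon's formula H = -sum(p_i * log2(p_i)) over the intensity frequencies p_i within each fragment. The function names and the grid parameters are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(pixels):
    """Shannon entropy in bits of a flat sequence of pixel intensities."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def fragment_entropies(image, rows, cols):
    """Split `image` into a rows x cols grid and return each cell's entropy.

    Assumes the image height and width are divisible by `rows` and `cols`;
    any remainder pixels at the right or bottom edge are ignored.
    """
    height, width = len(image), len(image[0])
    fh, fw = height // rows, width // cols
    result = []
    for r in range(rows):
        row_values = []
        for c in range(cols):
            pixels = [image[y][x]
                      for y in range(r * fh, (r + 1) * fh)
                      for x in range(c * fw, (c + 1) * fw)]
            row_values.append(shannon_entropy(pixels))
        result.append(row_values)
    return result


if __name__ == "__main__":
    # Toy 4x4 image: uniform left half, two-valued right half.
    toy = [
        [0, 0, 0, 255],
        [0, 0, 255, 0],
        [0, 0, 0, 255],
        [0, 0, 255, 0],
    ]
    print(fragment_entropies(toy, 2, 2))
```

A uniform fragment has zero entropy, while a fragment split evenly between two intensities has an entropy of one bit; the resulting matrix of values is the kind of input a fully connected network could be trained on.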


2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images when only a limited quantity of input data is available. Using a limited set of learning data was made possible by developing a detailed scenario of the task, which strictly defined the operating conditions of the detector in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article compares the detection results of the most popular deep neural networks trained on a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The best results were obtained for two convolutional neural networks: the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, its results depend significantly less on the number of input samples than those of the other CNN models, which, in the authors' assessment, is a desirable feature in the case of a limited training set.


2021 ◽  
Author(s):  
Shima Baniadamdizaj ◽  
Mohammadreza Soheili ◽  
Azadeh Mansouri

Abstract Today, integrating information from digital and paper documents is very important for efficient knowledge management. This requires the document to be localized in the image. Several methods have been proposed to solve this problem; however, they are based on traditional image processing techniques that are not robust to extreme viewpoints and backgrounds. Deep convolutional neural networks (CNNs), on the other hand, have proven to be extremely robust to variations in background and viewing angle for object detection and classification tasks. We propose a new use of neural networks (NNs) for the document localization problem. The proposed method can even localize documents that do not have a completely rectangular shape. We also used a newly collected dataset that contains more challenging cases and is closer to what a careless user would capture. Results are reported for three different categories of images, and the proposed method achieves 83% on average. The results are compared with the most popular document localization methods and mobile applications.


2003 ◽  
Vol 15 (3) ◽  
pp. 278-285
Author(s):  
Daigo Misaki ◽  
◽  
Shigeru Aomura ◽  
Noriyuki Aoyama

We discuss effective pattern recognition for contour images by hierarchical feature extraction. When pattern recognition is performed on an unrestricted object, it is effective to view the object as a whole first and then in detail. General features are used for rough classification and local features are used for more detailed classification. D-P matching is applied to a typical contour image of each class, which contains selected points called "landmarks", and rough classification is performed. Features between these landmarks are analyzed and used as input data for neural networks for more detailed classification. To verify the proposed method, we apply it to an illustrated reference book of insects, in which much information is classified hierarchically. By introducing landmarks, a neural network can be used effectively for pattern recognition of contour images.
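To make the rough-classification step concrete, the following sketch shows dynamic-programming (D-P) matching between two one-dimensional feature sequences, for example curvature values sampled along a contour. It is a generic formulation under stated assumptions, not the authors' method: the absolute-difference local cost, the three allowed steps, and the name `dp_matching_cost` are all choices made here for illustration.

```python
def dp_matching_cost(a, b):
    """Minimum cumulative alignment cost between sequences `a` and `b`.

    Each element of one sequence may be matched to one or more consecutive
    elements of the other, so contours sampled at different densities can
    still be compared. Local cost is the absolute difference (an assumption).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: best cost of aligning the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b's element
                                     cost[i][j - 1],      # stretch a's element
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


if __name__ == "__main__":
    template = [1, 2, 3]
    print(dp_matching_cost(template, [1, 2, 2, 3]))  # same shape, denser sampling
    print(dp_matching_cost(template, [1, 2, 4]))     # differs in the last value
```

In a rough classifier of this kind, an input contour would be assigned to the class whose typical contour gives the lowest matching cost, and the matched positions of the landmarks would then delimit the segments passed to the neural network.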

