Research of Neural Network Approach of Objects Detection in the Images

2020 ◽  
pp. 15-21
Author(s):  
R. N. Kvetny ◽  
R. V. Masliy ◽  
A. M. Kyrylenko ◽  
V. V. Shcherba

The article is devoted to the study of object detection in images using neural networks. The structure of convolutional neural networks used for image processing is considered. The formation of the convolutional layer (Fig. 1), the sub-sampling layer (Fig. 2) and the fully connected layer (Fig. 3) are described in detail. An overview is given of popular high-performance convolutional neural network architectures used for object detection: R-FCN, YOLO, Faster R-CNN, SSD and DetectNet. The basic stages of image processing by the DetectNet neural network, which is designed to detect objects in images, are discussed. NVIDIA DIGITS was used to create and train models, and several DetectNet models were trained in this environment. The parameters of the experiments (Table 1) and a comparison of the quality of the trained models (Table 2) are presented. As training and validation data, we used images from the KITTI dataset, which was created to improve self-driving systems that rely on embedded devices, one of which could be the Jetson TX2. KITTI's images feature several object classes, including cars and pedestrians. Model training and testing were performed on a Jetson TX2. Five models were trained that differed in the base learning rate parameter. The results obtained make it possible to find a compromise value of the base learning rate parameter that quickly yields a model with a high mAP value. The quality of the best model obtained on the KITTI validation dataset is mAP = 57.8%.
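Since the trained models are compared by mAP, a minimal sketch of how average precision can be computed from ranked detections may be useful. This is a simplified area-under-the-precision-recall-curve variant, not the exact KITTI/DIGITS evaluation code.

```python
def average_precision(detections, n_ground_truth):
    """Simplified AP: area under the raw precision-recall curve.

    detections: list of (confidence, is_true_positive) pairs.
    n_ground_truth: number of annotated objects of this class.
    """
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    tp, ap, prev_recall = 0, 0.0, 0.0
    for rank, (_, is_tp) in enumerate(detections, start=1):
        tp += is_tp
        precision = tp / rank
        recall = tp / n_ground_truth
        ap += precision * (recall - prev_recall)  # rectangle under the curve
        prev_recall = recall
    return ap

def mean_average_precision(per_class):
    """per_class: list of (detections, n_ground_truth), one entry per class."""
    return sum(average_precision(d, n) for d, n in per_class) / len(per_class)
```

For example, two correct detections out of two ground-truth objects, with one false positive ranked between them, give AP = 5/6.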

2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
S Gao ◽  
D Stojanovski ◽  
A Parker ◽  
P Marques ◽  
S Heitner ◽  
...  

Abstract Background Correctly identifying the views acquired in a 2D echocardiographic examination is paramount to the post-processing and quantification steps performed as part of most clinical workflows. In many exams, particularly in stress echocardiography, microbubble contrast is used, which greatly affects the appearance of the cardiac views. Here we present a bespoke, fully automated convolutional neural network (CNN) which identifies apical 2-, 3-, and 4-chamber and short-axis (SAX) views acquired with and without contrast. The CNN was tested on a completely independent, external dataset acquired in a different country from that used to train the network. Methods Training data comprised 2D echocardiograms from 1014 subjects in a prospective multisite, multi-vendor UK trial, with more than 17,500 frames per view. Prior to view-classification model training, images were processed using standard techniques to ensure homogeneous and normalised inputs to the training pipeline. A bespoke CNN was built using the minimum number of convolutional layers required, with batch normalisation and dropout to reduce overfitting. The data were split into 90% for model training (211,958 frames) and 10% for validation (23,946 frames); frames from any given subject appeared in only one of the two sets. Further, a separate trial dataset of 240 studies acquired in the USA was used as an independent test dataset (39,401 frames). Results Figure 1 shows the confusion matrices for both the validation data (left) and the independent test data (right), with overall accuracies of 96% and 95% for the validation and test datasets respectively. The accuracy for the non-contrast cardiac views of >99% exceeds that reported in other works. The combined datasets included images acquired across ultrasound manufacturers and models from 12 clinical sites.
Conclusion We have developed a CNN capable of automatically and accurately identifying all relevant cardiac views used in “real world” echo exams, including views acquired with contrast. Use of the CNN in a routine clinical workflow could improve the efficiency of the quantification steps performed after image acquisition. The network was tested on an independent dataset acquired in a different country from that used for training and was found to perform similarly, indicating the generalisability of the model. Figure 1. Confusion matrices Funding Acknowledgement Type of funding source: Private company. Main funding source(s): Ultromics Ltd.
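The subject-wise separation of frames described in Methods can be sketched as follows; the 90/10 ratio follows the abstract, but the data structures and function name are hypothetical.

```python
import random

def split_by_subject(frames, train_fraction=0.9, seed=0):
    """Split (subject_id, frame) pairs so that no subject's frames
    appear in both the training and validation sets."""
    subjects = sorted({s for s, _ in frames})
    rng = random.Random(seed)
    rng.shuffle(subjects)
    n_train = int(len(subjects) * train_fraction)
    train_subjects = set(subjects[:n_train])
    train = [f for f in frames if f[0] in train_subjects]
    val = [f for f in frames if f[0] not in train_subjects]
    return train, val
```

Splitting by subject rather than by frame prevents near-duplicate frames of one patient from leaking across the train/validation boundary and inflating accuracy.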


2018 ◽  
Vol 246 ◽  
pp. 03044 ◽  
Author(s):  
Guozhao Zeng ◽  
Xiao Hu ◽  
Yueyue Chen

Convolutional Neural Networks (CNNs) are among the most widely used algorithms in deep learning. They are applied in image processing, object detection and machine translation. As the demand for CNNs continues to increase, the platforms on which they are deployed continue to expand. As an excellent low-power, high-performance embedded solution, the Digital Signal Processor (DSP) is used frequently in many key areas. This paper deploys a CNN on Texas Instruments (TI)'s TMS320C6678 multi-core DSP and optimizes the main operation (convolution) to suit the DSP's structure. The efficiency of the improved convolution operation is increased by a factor of several tens.
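A common way to restructure convolution for multiply-accumulate hardware like a DSP is to lower it to a matrix product (im2col). The pure-Python sketch below shows the idea only; it is not TI's optimized kernel.

```python
def conv2d_direct(img, ker):
    """Valid 2D cross-correlation with nested loops (reference version)."""
    H, W, kh, kw = len(img), len(img[0]), len(ker), len(ker[0])
    return [[sum(img[i + u][j + v] * ker[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(W - kw + 1)]
            for i in range(H - kh + 1)]

def conv2d_im2col(img, ker):
    """Same result, restructured as one matrix-vector product,
    which maps naturally onto a DSP's MAC units."""
    H, W, kh, kw = len(img), len(img[0]), len(ker), len(ker[0])
    # Each output pixel becomes one row holding its receptive field.
    cols = [[img[i + u][j + v] for u in range(kh) for v in range(kw)]
            for i in range(H - kh + 1) for j in range(W - kw + 1)]
    kvec = [ker[u][v] for u in range(kh) for v in range(kw)]
    flat = [sum(a * b for a, b in zip(row, kvec)) for row in cols]
    ow = W - kw + 1
    return [flat[r * ow:(r + 1) * ow] for r in range(len(flat) // ow)]
```

The im2col form trades extra memory for a single dense inner loop, which is the layout that vectorized hardware executes most efficiently.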


Author(s):  
S O Stepanenko ◽  
P Y Yakimov

Object classification using neural networks is highly relevant today. YOLO is one of the most frequently used frameworks for object classification. It produces high accuracy, but its processing speed is not high enough, especially under the limited performance of a computer. This article investigates the use of the NVIDIA TensorRT framework to optimize YOLO with the aim of increasing image processing speed. While preserving the efficiency and quality of the neural network's results, TensorRT increases processing speed through optimization of the architecture and of the calculations on a GPU.
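One of the architecture optimizations TensorRT performs is layer fusion, for example folding a batch-normalization layer into the preceding convolution's weights so one fused layer does the work of two. A minimal per-channel sketch of the folding arithmetic (not TensorRT's implementation):

```python
import math

def fold_batchnorm(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold y = gamma*(w.x + b - mean)/sqrt(var + eps) + beta
    into equivalent weights (w', b') so batch norm costs nothing at
    inference time. w: per-channel weight rows, b: per-channel biases;
    gamma/beta/mean/var are the learned batch-norm statistics."""
    w_f, b_f = [], []
    for c in range(len(w)):
        s = gamma[c] / math.sqrt(var[c] + eps)
        w_f.append([wi * s for wi in w[c]])
        b_f.append((b[c] - mean[c]) * s + beta[c])
    return w_f, b_f

def linear(w, b, x):
    """Apply per-channel linear layer: out[c] = w[c].x + b[c]."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bc
            for row, bc in zip(w, b)]
```

The fused layer produces the same output as convolution followed by batch norm, which is why fusion speeds up inference without changing results.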


2021 ◽  
Vol 26 (1) ◽  
pp. 200-215
Author(s):  
Muhammad Alam ◽  
Jian-Feng Wang ◽  
Cong Guangpei ◽  
LV Yunrong ◽  
Yuanfang Chen

Abstract In recent years, the success of deep learning in natural scene image processing has boosted its application in the analysis of remote sensing images. In this paper, we apply Convolutional Neural Networks (CNN) to the semantic segmentation of remote sensing images. We improve the encoder-decoder CNN structures SegNet (with index pooling) and U-Net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that the two models have their own advantages and disadvantages in the segmentation of different objects. In addition, we propose an integrated algorithm that combines the two models. Experimental results show that the integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieves better segmentation than either model alone.
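SegNet's index pooling stores the argmax location of each max-pooling window in the encoder and reuses it to place values during decoder upsampling. A minimal 2x2 sketch of the mechanism:

```python
def max_pool_with_indices(img):
    """2x2 max pooling that also records the argmax position per window."""
    h, w = len(img), len(img[0])
    pooled, indices = [], []
    for i in range(0, h, 2):
        prow, irow = [], []
        for j in range(0, w, 2):
            window = [(img[i + u][j + v], (i + u, j + v))
                      for u in range(2) for v in range(2)]
            val, pos = max(window)
            prow.append(val)
            irow.append(pos)
        pooled.append(prow)
        indices.append(irow)
    return pooled, indices

def max_unpool(pooled, indices, h, w):
    """Decoder upsampling: place each value back at its recorded argmax,
    zeros elsewhere, so boundary locations survive the downsampling."""
    out = [[0] * w for _ in range(h)]
    for prow, irow in zip(pooled, indices):
        for val, (i, j) in zip(prow, irow):
            out[i][j] = val
    return out
```

Reusing the indices, instead of learned upsampling as in U-Net, preserves exact spatial positions of strong activations, which is why the two decoders behave differently on different object classes.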


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1614
Author(s):  
Jonghun Jeong ◽  
Jong Sung Park ◽  
Hoeseok Yang

Recently, the necessity to run high-performance neural networks (NNs) is increasing even in resource-constrained embedded systems such as wearable devices. However, due to the high computational and memory requirements of NN applications, it is typically infeasible to execute them on a single device. Instead, it has been proposed to run a single NN application cooperatively on top of multiple devices, a so-called distributed neural network, in which the workload of a single big NN application is distributed over multiple tiny devices. While the computation overhead can effectively be alleviated by this approach, existing distributed NN techniques, such as MoDNN, still suffer from heavy traffic between the devices and vulnerability to communication failures. In order to eliminate such communication overheads, a knowledge-distillation-based distributed NN, called Network of Neural Networks (NoNN), was proposed, which partitions the filters in the final convolutional layer of the original NN into multiple independent subsets and derives a smaller NN from each subset. However, NoNN also has limitations: the partitioning result may be unbalanced, and it considerably compromises the correlation between filters in the original NN, which may result in an unacceptable accuracy degradation in case of communication failure. In this paper, in order to overcome these issues, we propose to enhance the partitioning strategy of NoNN in two aspects. First, we enhance the redundancy of the filters that are used to derive the multiple smaller NNs by means of averaging, to increase the immunity of the distributed NN to communication failure. Second, we propose a novel partitioning technique, modified from Eigenvector-based partitioning, to preserve the correlation between filters as much as possible while keeping the number of filters distributed to each device consistent.
Through extensive experiments with the CIFAR-100 (Canadian Institute For Advanced Research-100) dataset, it has been observed that the proposed approach maintains high inference accuracy (over 70% on average, a 1.53× improvement over the state-of-the-art approach) even when half of the eight devices in a distributed NN fail to deliver their partial inference results.
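To illustrate the balance issue the paper addresses, here is a greedy balanced-partition sketch; it is a simplified stand-in for the Eigenvector-based scheme, and the per-filter "importance" scores are hypothetical (the actual method derives them from activation statistics).

```python
def partition_filters(importance, n_devices):
    """Greedily assign filters to devices, balancing total importance.

    importance: hypothetical per-filter importance scores.
    Returns one list of filter indices per device, plus the per-device
    total importance (the 'load') so balance can be checked.
    """
    # Longest-processing-time heuristic: place heavy filters first,
    # always onto the currently least-loaded device.
    order = sorted(range(len(importance)),
                   key=lambda f: importance[f], reverse=True)
    parts = [[] for _ in range(n_devices)]
    loads = [0.0] * n_devices
    for f in order:
        d = min(range(n_devices), key=lambda i: loads[i])
        parts[d].append(f)
        loads[d] += importance[f]
    return parts, loads
```

A balanced partition matters here because each subset becomes one device's student network: an overloaded subset makes that device a bottleneck, and its loss under communication failure costs disproportionate accuracy.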


2018 ◽  
Vol 10 (4) ◽  
pp. 140-155 ◽  
Author(s):  
Lu Liu ◽  
Yao Zhao ◽  
Rongrong Ni ◽  
Qi Tian

This article describes how images can be forged using different techniques; the most common forgery is copy-move forgery, in which a part of an image is duplicated and placed elsewhere in the same image. The article presents a convolutional neural network (CNN)-based method, combined with color filter array (CFA) features, to accurately localize the tampered regions. The CFA interpolation algorithm introduces correlation and consistency among pixels, which are easily destroyed by most image processing operations. The proposed CNN method can effectively distinguish the traces caused by copy-move forgeries and some post-processing operations. Additionally, it can use the classification result to guide feature extraction, which enhances the robustness of the learned features. The authors test the proposed method in several experiments. The results demonstrate the efficiency of the method on different forgeries and quantify its robustness and sensitivity.
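CFA demosaicking leaves each pixel strongly correlated with its neighbours, and one crude way to expose where that correlation has been destroyed is an interpolation-residual map. The sketch below is a simplified illustration of this kind of feature, not the paper's actual extractor.

```python
def cfa_residual(channel):
    """Residual between each interior pixel and the average of its four
    neighbours. Interpolated (smooth) regions give small residuals;
    splicing, copy-move or post-processing disturbs the pattern."""
    h, w = len(channel), len(channel[0])
    res = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            pred = (channel[i - 1][j] + channel[i + 1][j]
                    + channel[i][j - 1] + channel[i][j + 1]) / 4.0
            res[i][j] = channel[i][j] - pred
    return res
```

Residual maps like this can serve as an auxiliary input channel alongside RGB, which is one way CFA features are commonly combined with a CNN.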


2000 ◽  
Vol 1719 (1) ◽  
pp. 103-111 ◽  
Author(s):  
Satish C. Sharma ◽  
Pawan Lingras ◽  
Guo X. Liu ◽  
Fei Xu

Estimation of the annual average daily traffic (AADT) for low-volume roads is investigated. Artificial neural networks are compared with the traditional factor approach for estimating AADT from short-period traffic counts. Fifty-five automatic traffic recorder (ATR) sites located on low-volume rural roads in Alberta, Canada, are used as study samples. The results of this study indicate that, when a single 48-h count is used for AADT estimation, the factor approach can yield better results than the neural networks if the ATR sites are grouped appropriately and the sample sites are correctly assigned to various ATR groups. Unfortunately, the current recommended practice offers little guidance on how to achieve the assignment accuracy that may be necessary to obtain reliable AADT estimates from a single 48-h count. The neural network approach can be particularly suitable for estimating AADT from two 48-h counts taken at different times during the counting season. In fact, the 95th percentile error values of about 25 percent as obtained in this study for the neural network models compare favorably with the values reported in the literature for low-volume roads using the traditional factor approach. The advantage of the neural network approach is that classification of ATR sites and sample site assignments to ATR groups are not required. The analysis of various groups of low-volume roads presented also leads to a conclusion that, when defining low-volume roads from a traffic monitoring point of view, it is not likely to matter much whether the AADT on the facility is less than 500 vehicles, less than 750 vehicles, or less than 1,000 vehicles.
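The factor approach the study compares against can be sketched as: average the short count to one day, then scale by the assigned ATR group's seasonal adjustment factor. The numbers and function names below are illustrative only, not values from the study.

```python
def aadt_factor_estimate(count_48h, seasonal_factor):
    """Factor approach: average the 48-h count to one day, then apply
    the ATR group's factor (AADT divided by the group's average daily
    traffic for the counting period)."""
    adt = count_48h / 2.0
    return adt * seasonal_factor

def percent_error(estimate, actual):
    """Absolute percent error, the quantity summarised by the study's
    95th-percentile error statistic."""
    return abs(estimate - actual) / actual * 100.0
```

The approach's accuracy hinges entirely on assigning the sample site to the right ATR group, which is exactly the step the study finds poorly supported in current practice.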


Author(s):  
Gerardo Schneider ◽  
Alejandro Javier Hadad ◽  
Alejandra Kemerer

Abstract In this paper we present a software implementation for determining the status of sugarcane plantations based on the analysis of multispectral aerial images. At present there are no precise techniques to objectively estimate the area of fallen or lodged cane, which causes significant losses in crop productivity and industrialization. For this work, a reference image dataset was assembled; software was implemented to compute indicators proposed as representative of the agronomic phenomenon, and the generated data were analysed. In addition, a reference classifier based on neural networks was implemented, with which the strength of these indicators was assessed and the affected area was estimated quantitatively and spatially. Keywords Sugarcane, quantification, lodging, neural network, image processing
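The abstract does not name its indicators, but a plausible starting point for indicators computed from multispectral aerial imagery is a vegetation index such as NDVI; the example below is an assumption, not the authors' method.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red: reflectance in the near-infrared and red bands. Lodged
    (fallen) cane changes canopy reflectance, so per-pixel index maps
    are a common basis for indicators of this kind."""
    denom = nir + red
    return (nir - red) / denom if denom else 0.0

def ndvi_map(nir_band, red_band):
    """Apply the index pixel-wise to two co-registered bands."""
    return [[ndvi(n, r) for n, r in zip(nrow, rrow)]
            for nrow, rrow in zip(nir_band, red_band)]
```

Index maps like this could feed both the proposed indicators and the neural-network classifier used to estimate the affected area spatially.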


Energies ◽  
2022 ◽  
Vol 15 (2) ◽  
pp. 588
Author(s):  
Felipe Leite Coelho da Silva ◽  
Kleyton da Costa ◽  
Paulo Canas Rodrigues ◽  
Rodrigo Salas ◽  
Javier Linkolk López-Gonzales

Forecasting industrial electricity consumption is essential for energy planning in a given country or region. This study therefore applies time-series forecasting models (a statistical approach and an artificial neural network approach) to the industrial electricity consumption of the Brazilian system. For the statistical approach, the Holt–Winters, SARIMA, Dynamic Linear Model, and TBATS (Trigonometric Box–Cox transform, ARMA errors, Trend, and Seasonal components) models were considered. For the artificial neural network approach, the NNAR (neural network autoregression) and MLP (multilayer perceptron) models were considered. The results indicate that the MLP model obtained the best forecasting performance for the electricity consumption of the Brazilian industry under analysis.
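Of the statistical models listed, Holt–Winters is the simplest to sketch. Below is an additive variant with one common initialization from the first two seasons; it is an illustration of the model family, not the implementation used in the study.

```python
def holt_winters_additive(series, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: level + trend + seasonal components,
    season length m, smoothing parameters alpha/beta/gamma."""
    # Initialize level/trend from the means of the first two seasons
    # and the seasonal terms from deviations within the first season.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) / m - level) / m
    season = [series[i] - level for i in range(m)]
    for t in range(m, len(series)):
        prev_level = level
        s = season[t % m]
        level = alpha * (series[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (series[t] - level) + (1 - gamma) * s
    return [level + h * trend + season[(len(series) + h - 1) % m]
            for h in range(1, horizon + 1)]
```

Monthly industrial consumption would use m = 12; the smoothing parameters are normally chosen by minimizing in-sample forecast error.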


2021 ◽  
Vol 5 (2) ◽  
pp. 312-318
Author(s):  
Rima Dias Ramadhani ◽  
Afandi Nur Aziz Thohari ◽  
Condro Kartiko ◽  
Apri Junaidi ◽  
Tri Ginanjar Laksana ◽  
...  

Waste comprises goods or materials that have no value in the scope of production; in some cases it is disposed of carelessly and can damage the environment. The Indonesian government in 2019 recorded waste reaching 66-67 million tons, higher than the previous year's 64 million tons. Waste is differentiated by type, namely organic and inorganic waste. In computer science, recognizing the type of waste can be done using a camera and the Convolutional Neural Network (CNN) method, a type of neural network that receives input in the form of images. The input is trained using a CNN architecture to produce a model that can recognize the object in the input. This study optimizes the use of the CNN method to obtain accurate results in identifying types of waste. Optimization is done by tuning several hyperparameters of the CNN architecture, which raises the accuracy to 91.2%; without this tuning, the accuracy is only 67.6%. Three hyperparameters are used to increase the model's accuracy: dropout, padding, and stride. A dropout rate of 20% is used to reduce overfitting during training, while padding and stride are used to speed up the model training process.
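The padding and stride hyperparameters mentioned above control the spatial size of each convolutional layer's output, and hence how much computation training requires; the standard output-size formula is sketched below.

```python
def conv_output_size(n, kernel, padding=0, stride=1):
    """Spatial output size of one convolution dimension:
    floor((n + 2*padding - kernel) / stride) + 1."""
    return (n + 2 * padding - kernel) // stride + 1
```

For example, a 224-pixel input with a 3x3 kernel, padding 1 and stride 2 yields a 112-pixel output: a larger stride halves each spatial dimension and quarters the work of subsequent layers, which is why stride tuning speeds up training.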

