Using high-performance deep learning platform to accelerate object detection

Author(s):  
S O Stepanenko ◽  
P Y Yakimov

Object classification using neural networks is a highly topical problem. YOLO is one of the most widely used frameworks for object classification. It achieves high accuracy, but its processing speed is insufficient, especially on computers with limited performance. This article investigates the use of the NVIDIA TensorRT framework to optimize YOLO with the aim of increasing image processing speed. While preserving the efficiency and quality of the neural network, TensorRT increases the processing speed by optimizing the network architecture and the computations on the GPU.

2021 ◽  
Vol 3 (1) ◽  
pp. 8-14
Author(s):  
D. V. Fedasyuk ◽  
T. V. Demianets

Melanoma is the deadliest skin cancer, so early diagnosis can provide a positive prognosis for treatment. Modern methods for early detection of melanoma in tumor images are considered, and their advantages and disadvantages are analyzed. The article demonstrates a prototype of a mobile application, developed for the Android operating system, that detects melanoma in an image of a mole using a convolutional neural network. The mobile application provides melanoma detection, a history of previous examinations, and a gallery of images from previous examinations grouped by the location of the lesion. The HAM10000-based training dataset has been supplemented with melanoma images from the archive of The International Skin Imaging Collaboration to eliminate class imbalance and improve network accuracy. A search for existing neural networks that provide high accuracy was conducted, and the VGG16, MobileNet, and NASNetMobile networks were selected for study. Transfer learning and fine-tuning have been applied to these networks to adapt them to the task of skin lesion classification. It is established that these techniques make it possible to obtain high accuracy for this task. The process of converting a convolutional neural network to the optimized FlatBuffer format using TensorFlow Lite for deployment on a mobile device is described. The performance of the selected neural networks on the mobile device is evaluated by classification time on the CPU and GPU, and the amount of memory occupied by each network file is compared. The neural network file size was compared before and after conversion. It has been shown that the TensorFlow Lite converter, by using an optimized format, significantly reduces the file size of the neural network without affecting its accuracy.
The results of the study indicate that the application is fast and the networks are compact on the device, and that GPU acceleration can significantly decrease the classification time for a tumor image. According to the analyzed parameters, NASNetMobile was selected as the optimal neural network for the melanoma detection mobile application.


2018 ◽  
Vol 246 ◽  
pp. 03044 ◽  
Author(s):  
Guozhao Zeng ◽  
Xiao Hu ◽  
Yueyue Chen

Convolutional Neural Networks (CNNs) have become the most advanced algorithms for deep learning. They are widely used in image processing, object detection and automatic translation. As the demand for CNNs continues to increase, the range of platforms on which they are deployed continues to expand. As an excellent low-power, high-performance embedded solution, the Digital Signal Processor (DSP) is used frequently in many key areas. This paper attempts to deploy a CNN on the Texas Instruments (TI) TMS320C6678 multi-core DSP and to optimize its main operation (convolution) to suit the DSP architecture. The efficiency of the improved convolution operation has increased by tens of times.
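For orientation, the convolution that this abstract describes optimizing can be written as a plain reference loop. The sketch below is a minimal, unoptimized illustration in Python, not the authors' DSP implementation; the function name conv2d_valid and the "valid" (no padding, stride 1, no kernel flip) convention are assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Naive 2D convolution as used in CNN layers (no padding, stride 1, no kernel flip)."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for row in range(out_h):
        for col in range(out_w):
            acc = 0.0
            # Multiply-accumulate over the kernel window; this inner loop is
            # the hot spot that hardware-specific optimizations usually target.
            for i in range(k_h):
                for j in range(k_w):
                    acc += image[row + i][col + j] * kernel[i][j]
            output[row][col] = acc
    return output
```

Each output element costs k_h × k_w multiply-accumulate operations, which is why restructuring this loop for a particular processor can change throughput so much.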


Author(s):  
Juan D Pineda-Jaramillo ◽  
Ricardo Insa ◽  
Pablo Martínez

This paper presents the training of a neural network using consumption data measured in the underground network of Valencia (Spain), with the objective of estimating the energy consumption of the systems. After calibration and validation of the neural network using part of the gathered consumption data, the results show that the neural network is capable of predicting power consumption with high accuracy. Once fully trained, the network can be used to study the energy consumption of a metro system and to test hypothetical operating scenarios.


2021 ◽  
Vol 25 (3) ◽  
pp. 31-35
Author(s):  
Piotr Więcek ◽  
Dominik Sankowski

The article presents a new algorithm for increasing the resolution of thermal images. For this purpose, the residual network was integrated with the Kernel-Sharing Atrous Convolution (KSAC) image sub-sampling module. The algorithm’s complexity was significantly reduced and its execution time shortened while maintaining high accuracy. The neural network has been implemented in the PyTorch environment. Results of the proposed method are presented for thermal images of sizes 32 × 24, 160 × 120 and 640 × 480 at scales up to 6.
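The abstract does not describe the module's internals, so the following is only a rough one-dimensional sketch of the general kernel-sharing idea: a single kernel is reused by several branches with different atrous (dilation) rates and the branch outputs are summed. The function names, the zero padding, the summation of branches and the default rates are all assumptions for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def atrous_conv1d(signal: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """1D atrous convolution with implicit zero padding (output length equals input length)."""
    n, k = len(signal), len(kernel)
    center = k // 2
    out = []
    for pos in range(n):
        acc = 0.0
        for tap in range(k):
            # Kernel taps are spaced `rate` samples apart.
            idx = pos + (tap - center) * rate
            if 0 <= idx < n:
                acc += signal[idx] * kernel[tap]
        out.append(acc)
    return out


def kernel_sharing_atrous(signal: Sequence[float], kernel: Sequence[float],
                          rates: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Sum of branches that all reuse the same kernel at different atrous rates."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [sum(values) for values in zip(*branches)]
```

Because every branch reads the same weights, the parameter count does not grow with the number of rates, which is consistent with the reduced complexity the abstract reports.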


Information ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 329
Author(s):  
Jesús Calvillo ◽  
Harm Brouwer ◽  
Matthew W. Crocker

Decades of studies trying to define the extent to which artificial neural networks can exhibit systematicity suggest that systematicity can be achieved by connectionist models but not by default. Here we present a novel connectionist model of sentence production that employs rich situation model representations originally proposed for modeling systematicity in comprehension. The high performance of our model demonstrates that such representations are also well suited to model language production. Furthermore, the model can produce multiple novel sentences for previously unseen situations, including in a different voice (active vs. passive) and with words in new syntactic roles, thus demonstrating semantic and syntactic generalization and arguably systematicity. Our results provide yet further evidence that such connectionist approaches can achieve systematicity, in production as well as comprehension. We propose our positive results to be a consequence of the regularities of the microworld from which the semantic representations are derived, which provides a sufficient structure from which the neural network can interpret novel inputs.


1996 ◽  
Author(s):  
Bernard Engel ◽  
Yael Edan ◽  
James Simon ◽  
Hanoch Pasternak ◽  
Shimon Edelman

The objectives of this project were to develop procedures and models, based on neural networks, for quality sorting of agricultural produce. Two research teams, one at Purdue University and the other in Israel, coordinated their research efforts on different aspects of each objective, using both melons and tomatoes as case studies. At Purdue: An expert system was developed to measure variances in human grading. Data were acquired from eight sensors: vision, two firmness sensors (destructive and nondestructive), chlorophyll from fluorescence, color sensor, electronic sniffer for odor detection, refractometer and a scale (mass). Data were analyzed and provided input for five classification models. Chlorophyll from fluorescence was found to give the best estimation for ripeness stage while the combination of machine vision and firmness from impact performed best for quality sorting. A new algorithm was developed to estimate and minimize training size for supervised classification. A new criterion was established to choose a training set such that a recurrent auto-associative memory neural network is stabilized. Moreover, this method provides for rapid and accurate updating of the classifier over growing seasons, production environments and cultivars. Different classification approaches (parametric and non-parametric) for grading were examined. Statistical methods were found to be as accurate as neural networks in grading. Classification models by voting did not enhance the classification significantly. A hybrid model that incorporated heuristic rules and either a numerical classifier or neural network was found to be superior in classification accuracy with half the required processing of solely the numerical classifier or neural network. In Israel: A multi-sensing approach utilizing non-destructive sensors was developed. Shape, color, stem identification, surface defects and bruises were measured using a color image processing system.
Flavor parameters (sugar, acidity, volatiles) and ripeness were measured using a near-infrared system and an electronic sniffer. Mechanical properties were measured using three sensors: drop impact, resonance frequency and cyclic deformation. Classification algorithms for quality sorting of fruit based on multi-sensory data were developed and implemented. The algorithms included a dynamic artificial neural network, a back propagation neural network and multiple linear regression. Results indicated that classification based on multiple sensors may be applied in real-time sorting and can improve overall classification. Advanced image processing algorithms were developed for shape determination, bruise and stem identification and general color and color homogeneity. An unsupervised method was developed to extract necessary vision features. The primary advantage of the algorithms developed is their ability to learn to determine the visual quality of almost any fruit or vegetable with no need for specific modification and no a-priori knowledge. Moreover, since there is no assumption as to the type of blemish to be characterized, the algorithm is capable of distinguishing between stems and bruises. This enables sorting of fruit without knowing the fruits' orientation. A new algorithm for on-line clustering of data was developed. The algorithm's adaptability is designed to overcome some of the difficulties encountered when incrementally clustering sparse data and preserves information even with memory constraints. Large quantities of data (many images) of high dimensionality (due to multiple sensors) and new information arriving incrementally (a function of the temporal dynamics of any natural process) can now be processed. Furthermore, since the learning is done on-line, it can be implemented in real-time. The methodology developed was tested to determine external quality of tomatoes based on visual information.
An improved model for color sorting which is stable and does not require recalibration for each season was developed for color determination. Excellent classification results were obtained for both color and firmness classification. Results indicated that maturity classification can be obtained using a drop-impact and a vision sensor in order to predict the storability and marketing of harvested fruits. In conclusion: We have been able to define quantitatively the critical parameters in the quality sorting and grading of both fresh market cantaloupes and tomatoes. We have been able to accomplish this using nondestructive measurements and in a manner consistent with expert human grading and in accordance with market acceptance. This research constructed and used large databases of both commodities, for comparative evaluation and optimization of expert system, statistical and/or neural network models. The models developed in this research were successfully tested, and should be applicable to a wide range of other fruits and vegetables. These findings are valuable for the development of on-line grading and sorting of agricultural produce through the incorporation of multiple measurement inputs that rapidly define quality in an automated manner, and in a manner consistent with the human graders and inspectors.


2021 ◽  
Vol 09 (07) ◽  
pp. E1136-E1144
Author(s):  
Astrid de Maissin ◽  
Remi Vallée ◽  
Mathurin Flamant ◽  
Marie Fondain-Bossiere ◽  
Catherine Le Berre ◽  
...  

Background and study aims: Computer-aided diagnostic tools using deep neural networks are efficient for detection of lesions in endoscopy but require a huge number of images. The impact of the quality of annotation has not been tested yet. Here we describe a multi-expert annotated dataset of images extracted from capsules from Crohn’s disease patients and the impact of the quality of annotations on the accuracy of a recurrent attention neural network. Methods: Capsule images were annotated by a reader first and then reviewed by three experts in inflammatory bowel disease. Concordance between experts was evaluated by Fleiss’ kappa, and all the discordant images were read again by all the endoscopists to obtain a consensus annotation. A recurrent attention neural network developed for the study was tested before and after the consensus annotation. Available neural networks (ResNet and VGGNet) were also tested under the same conditions. Results: The final dataset included 3498 images with 2124 non-pathological (60.7 %), 1360 pathological (38.9 %), and 14 (0.4 %) inconclusive. Agreement of the experts was good for distinguishing pathological and non-pathological images with a kappa of 0.79 (P < 0.0001). The accuracy of our classifier and the available neural networks increased after the consensus annotation with a precision of 93.7 %, sensitivity of 93 %, and specificity of 95 %. Conclusions: The accuracy of the neural network increased with improved annotations, suggesting that the number of images needed for the development of these systems could be diminished using a well-designed dataset.
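The agreement statistic named in the Methods can be illustrated with a short sketch. The code below computes Fleiss' kappa from a table of per-image rating counts using the standard formula and recomputes the class shares from the image counts reported in the Results. The function names are hypothetical, and no rating data from the study is reproduced here; any rating table passed in would be the caller's own.

```python
from typing import Dict, List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa; counts[i][j] is the number of raters who put item i in category j.

    Assumes every item was rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    # Share of all assignments that fell into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_categories)]
    # Observed agreement for each item.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


def class_shares(class_counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of the dataset in each class, rounded to one decimal place."""
    total = sum(class_counts.values())
    return {name: round(100 * n / total, 1) for name, n in class_counts.items()}
```

Calling class_shares with the reported counts (2124 non-pathological, 1360 pathological, 14 inconclusive) reproduces the 60.7 %, 38.9 % and 0.4 % given in the abstract.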


2020 ◽  
pp. 15-21
Author(s):  
R. N. Kvetny ◽  
R. V. Masliy ◽  
A. M. Kyrylenko ◽  
V. V. Shcherba

The article is devoted to the study of object detection in images using neural networks. The structure of convolutional neural networks used for image processing is considered. The formation of the convolutional layer (Fig. 1), the sub-sampling layer (Fig. 2) and the fully connected layer (Fig. 3) is described in detail. An overview is given of popular high-performance convolutional neural network architectures used for object detection: R-FCN, YOLO, Faster R-CNN, SSD and DetectNet. The basic stages of image processing by the DetectNet neural network, which is designed to detect objects in images, are discussed. NVIDIA DIGITS was used to create and train models, and several DetectNet models were trained using this environment. The parameters of the experiments (Table 1) and a comparison of the quality of the trained models (Table 2) are presented. As training and validation data, we used images from the KITTI database, which was created to improve self-driving systems; such systems depend on embedded devices, one of which could be the Jetson TX2. KITTI’s images feature several object classes, including cars and pedestrians. Model training and testing were performed using a Jetson TX2 supercomputer. Five models were trained that differed in the Base learning rate parameter. The results obtained make it possible to find a compromise value for the Base learning rate parameter to quickly obtain a model with a high mAP value. The quality of the best model obtained on the KITTI validation dataset is mAP = 57.8%.
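Since the models are compared by mAP, a small sketch of how such a score can be computed may help. The version below uses a simple non-interpolated average precision (the mean of the precision values at each true positive, divided by the number of ground-truth objects) and averages it over classes. This is an assumption for illustration: DIGITS and the KITTI benchmark may use a different, interpolated definition, and the matching of detections to ground truth (usually by an overlap threshold) is assumed to have been done beforehand. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

# One detection: (confidence score, whether it matched a ground-truth object).
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], n_ground_truth: int) -> float:
    """Non-interpolated AP for one class."""
    if n_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_true_positive) in enumerate(ranked, start=1):
        if is_true_positive:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / n_ground_truth


def mean_average_precision(per_class: Dict[str, Tuple[List[Detection], int]]) -> float:
    """Mean of the per-class AP values; per_class maps class name to (detections, ground-truth count)."""
    scores = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(scores) / len(scores)
```

With a score like this, the five models that differ in base learning rate can be ranked by a single number per model, as the abstract's comparison does.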


2021 ◽  
pp. 221-227
Author(s):  
Asif Mohammad ◽  
Mahruf Zaman Utso ◽  
Shifat Bin Habib ◽  
Amit Kumar Das

Neural networks in image processing are becoming a more crucial and integral part of machine learning as computational technology and hardware systems advance. Deep learning is also getting attention from the medical sector as a prominent approach to classifying diseases. There is a lot of research on predicting retinal diseases using deep learning algorithms such as the Convolutional Neural Network (CNN). Still, there are few studies on predicting conditions such as choroidal neovascularization (CNV), Diabetic Macular Edema (DME) and DRUSEN. In our research paper, a CNN classified a dataset of OCT retinal images into four types: CNV, DME, DRUSEN, and Natural Retina. We also applied several preprocessing steps to the images before passing them to the neural network. We implemented several models that differ in their hidden layers. We found that our CNN achieves 93% accuracy.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Rui Liu

In this paper, we propose a multiresidual module convolutional neural network-based method for athlete pose estimation in sports game videos. First, three residual modules are designed: an improved residual module based on the traditional residual module; a large perceptual field residual module that learns the correlation between the athlete components in the sports game video within a large perceptual field; and a multiscale residual module that addresses the inaccuracy of pose estimation caused by scale changes of the athlete components in the video. Secondly, these three residual modules are used as the building blocks of the convolutional neural network. When the resolution is high, the large perceptual field residual module and the multiscale residual module are used to capture information in a larger range as well as at each scale, and when the resolution is low, only the improved residual module is used. Finally, four multiresidual module convolutional neural networks are stacked to form the final multiresidual module stacked convolutional neural network. The proposed model achieves high accuracy of 89.5% and 88.2% on the upper arm and lower arm, respectively, so the method reduces the influence of occlusion on the athlete’s posture estimation to a certain extent. The experiments show that the proposed method further improves the accuracy of athlete pose estimation in sports game videos.

