Optimizing Convolution Neural Network on the TI C6678 multicore DSP

2018 ◽  
Vol 246 ◽  
pp. 03044 ◽  
Author(s):  
Guozhao Zeng ◽  
Xiao Hu ◽  
Yueyue Chen

Convolutional Neural Networks (CNNs) have become the most advanced algorithms for deep learning. They are widely used in image processing, object detection and automatic translation. As the demand for CNNs continues to increase, the range of platforms on which they are deployed continues to expand. As an excellent low-power, high-performance embedded solution, the Digital Signal Processor (DSP) is used frequently in many key areas. This paper attempts to deploy a CNN on the Texas Instruments (TI) TMS320C6678 multi-core DSP and to optimize the main operation (convolution) to suit the DSP architecture. The efficiency of the optimized convolution operation increased by tens of times.
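For orientation, the convolution operation that the abstract names as the main cost can be sketched as a plain loop nest. This is only a minimal baseline under stated assumptions, not the paper's DSP-optimized kernel: the function name `conv2d_valid`, the single-channel input, the stride of 1 and the absence of padding are all illustrative choices that do not come from the source.

```python
def conv2d_valid(image, kernel):
    """Direct 2D convolution without padding, stride 1.

    Follows the CNN convention (no kernel flip, i.e. cross-correlation).
    `image` and `kernel` are lists of rows; both names are hypothetical.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for r in range(oh):                # output row
        for c in range(ow):            # output column
            acc = 0.0
            for i in range(kh):        # kernel row
                for j in range(kw):    # kernel column
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_valid(image, kernel)
print(result)  # [[-4.0, -4.0], [-4.0, -4.0]]
```

The four nested loops are the part that hardware-specific optimizations usually restructure, for example by splitting output rows across cores or vectorizing the innermost multiply-accumulate; which of these the paper applies is not stated in the abstract.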

2021 ◽  
Vol 26 (1) ◽  
pp. 200-215
Author(s):  
Muhammad Alam ◽  
Jian-Feng Wang ◽  
Cong Guangpei ◽  
LV Yunrong ◽  
Yuanfang Chen

In recent years, the success of deep learning in natural scene image processing has boosted its application in the analysis of remote sensing images. In this paper, we apply Convolutional Neural Networks (CNNs) to the semantic segmentation of remote sensing images. We improve the Encoder-Decoder CNN structure SegNet with index pooling and U-net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that these two models have their own advantages and disadvantages in the segmentation of different objects. In addition, we propose an integrated algorithm that combines these two models. Experimental results show that the presented integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieve better segmentation than either model alone.


2020 ◽  
Vol 11 (28) ◽  
pp. 7335-7348 ◽  
Author(s):  
Timothy E. H. Allen ◽  
Andrew J. Wedlake ◽  
Elena Gelžinytė ◽  
Charles Gong ◽  
Jonathan M. Goodman ◽  
...  

Deep learning neural networks, constructed for the prediction of chemical binding at 79 pharmacologically important human biological targets, show extremely high performance on test data (accuracy 92.2 ± 4.2%, MCC 0.814 ± 0.093, ROC-AUC 0.96 ± 0.04).
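The accuracy and MCC (Matthews correlation coefficient) figures quoted above are both derived from a binary confusion matrix. The sketch below shows the standard definitions only; the function names and the example counts are hypothetical and are not taken from the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts, chosen only to illustrate the calculation.
tp, tn, fp, fn = 90, 85, 15, 10
print(accuracy(tp, tn, fp, fn))
print(mcc(tp, tn, fp, fn))
```

MCC ranges from -1 to 1 and stays informative when the active and inactive classes are imbalanced, which is one reason it is often reported alongside accuracy.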


Author(s):  
Rasmita Lenka ◽  
Koustav Dutta ◽  
Ashimananda Khandual ◽  
Soumya Ranjan Nayak

The chapter focuses on the application of digital image processing and deep learning to detecting malaria from medical reports. This approach helps identify the disease quickly from the preliminary tests carried out on a person affected by malaria. Incorporating deep learning makes the process considerably more capable, as a convolutional neural network can extract deeper insights from the person's medical images. Because traditional methods cannot detect malaria properly and quickly, convolutional neural networks make early detection of malaria possible, and thus this process will open a new door in the world of medical science.


Author(s):  
S O Stepanenko ◽  
P Y Yakimov

Object classification using neural networks is a highly topical problem today. YOLO is one of the most frequently used frameworks for object classification. It achieves high accuracy, but its processing speed is not high enough, especially when computing performance is limited. This article investigates the use of the NVIDIA TensorRT framework to optimize YOLO with the aim of increasing image processing speed. While preserving the efficiency and quality of the neural network, TensorRT increases the processing speed by optimizing the architecture and the calculations on a GPU.


2019 ◽  
Vol 8 (3) ◽  
pp. 6873-6880

Palm leaf manuscripts are one of the ancient writing media, but their content needs to be inscribed on a new set of leaves. This study provides a solution for preserving the contents of palm leaf manuscripts by recognizing the handwritten Tamil characters in the manuscripts and storing them digitally. Character recognition is one of the most essential fields of pattern recognition and image processing. Generally, optical character recognition is the electronic translation of typewritten text or handwritten images into machine-editable text. Handwritten Tamil character recognition has been one of the challenging and active areas of research in pattern recognition and image processing. In this study, an attempt was made to identify Tamil handwritten characters without feature extraction using convolutional neural networks. The study uses convolutional neural networks to recognize and classify Tamil palm leaf manuscript characters from separated character images. The convolutional neural network is a deep learning approach that does not require feature extraction and is also a fast approach for character recognition. In the proposed system, every character is resized to the required number of pixels. The resized characters have a predetermined number of pixels, and these pixels are used as the features for neural network training. The trained network is employed for recognition and classification. The convolutional network model comprises a convolution layer, a ReLU layer, a pooling layer and a fully connected layer. A dataset of ancient Tamil characters with 60 classes has been created. The results show that the proposed approach achieves better recognition rates than schemes based on feature extraction for handwritten character recognition. The accuracy of the proposed approach is 97%, which shows that it is effective in recognizing ancient characters.


2021 ◽  
Author(s):  
Bo Wang ◽  
Eric R Gamazon

Alzheimer's Disease (AD) is a debilitating form of dementia with a high prevalence in the global population and a large burden on the community and health care systems. AD's complex pathobiology consists of extracellular β-amyloid deposition and intracellular hyperphosphorylated tau. Comprehensive mutational analyses can generate a wealth of knowledge about protein properties and enable crucial insights into molecular mechanisms of disease. Deep Mutational Scanning (DMS) has enabled multiplexed measurement of mutational effects on protein properties, including kinematics and self-organization, with unprecedented resolution. However, potential bottlenecks of DMS characterization include experimental design, data quality, and the depth of mutational coverage. Here, we apply Deep Learning to comprehensively model the mutational effect of the AD-associated peptide Aβ42 on aggregation-related biochemical traits from DMS measurements. Among tested neural network architectures, Convolutional Neural Networks (ConvNets) and Recurrent Neural Networks (RNN) are found to be the most cost-effective models with robust high performance even under insufficiently-sampled DMS studies. While sequence features are essential for satisfactory prediction from neural networks, geometric-structural features further enhance the prediction performance. Notably, we demonstrate how mechanistic insights into phenotype may be extracted from the neural networks themselves when they are suitably designed. This methodological benefit is particularly relevant for biochemical systems displaying a strong coupling between structure and phenotype such as the conformation of Aβ42 aggregate and nucleation, as shown here using a Graph Convolutional Neural Network (GCN) developed from the protein atomic structure input.
In addition to accurate imputation of missing values (which ranged up to 55% of all phenotype values at key residues), the mutationally-defined nucleation phenotype generated from a GCN shows improved resolution for identifying known disease-causing mutations relative to the original DMS phenotype. Our study suggests that neural network derived sequence-phenotype mapping can be exploited not only to provide direct support for protein engineering or genome editing but also to facilitate therapeutic design with the gained perspectives from biological modeling.


2020 ◽  
pp. 15-21
Author(s):  
R. N. Kvetny ◽  
R. V. Masliy ◽  
A. M. Kyrylenko ◽  
V. V. Shcherba

The article is devoted to the study of object detection in images using neural networks. The structure of convolutional neural networks used for image processing is considered. The formation of the convolutional layer (Fig. 1), the sub-sampling layer (Fig. 2) and the fully connected layer (Fig. 3) is described in detail. An overview is given of popular high-performance convolutional neural network architectures used to detect objects: R-FCN, YOLO, Faster R-CNN, SSD and DetectNet. The basic stages of image processing by the DetectNet neural network, which is designed to detect objects in images, are discussed. NVIDIA DIGITS was used to create and train models, and several DetectNet models were trained using this environment. The parameters of the experiments (Table 1) and a comparison of the quality of the trained models (Table 2) are presented. As training and validation data, we used images from the KITTI database, which was created to improve self-driving systems; such systems depend on embedded devices, one of which could be the Jetson TX2. KITTI’s images feature several object classes, including cars and pedestrians. Model training and testing were performed on a Jetson TX2 supercomputer. Five models were trained that differed in the Base learning rate parameter. The results obtained make it possible to find a compromise value for the Base learning rate parameter to quickly obtain a model with a high mAP value. The quality of the best model obtained on the KITTI validation dataset is mAP = 57.8%.


2014 ◽  
Vol 886 ◽  
pp. 556-559 ◽  
Author(s):  
Su Hua Chen ◽  
Zhi Meng Shu ◽  
Xu Fang

To improve the performance and reduce the power consumption of embedded image processing systems, this paper presents a high-efficiency embedded image processing system based on a field programmable gate array (FPGA) and a high-speed digital signal processor (DSP). Within the system, a novel data transmission structure built on a dual-port RAM divided into two halves buffers the high-speed real-time image data using the ping-pong technique. Because all work in the system is divided between the FPGA and the DSP in a pipelined fashion, its performance is 25% higher than that of a processing system based on a single DSP.
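The ping-pong technique mentioned above can be illustrated with a small software model. This is a sketch under assumptions, not the paper's hardware design: the class `PingPongBuffer`, the function `process_stream` and the block sizes are hypothetical, and the producer and consumer run one after the other here, whereas in a dual-port RAM the FPGA would write one half while the DSP reads the other at the same time.

```python
class PingPongBuffer:
    """Software model of a memory split into two equal halves."""

    def __init__(self, half_size):
        self.half_size = half_size
        self.halves = [[0] * half_size, [0] * half_size]
        self.write_half = 0  # index of the half the producer fills next

    def write_block(self, block):
        """Producer side: fill the current write half."""
        if len(block) != self.half_size:
            raise ValueError("block must fill exactly one half")
        self.halves[self.write_half][:] = block

    def read_block(self):
        """Consumer side: read the half that is not being written."""
        return list(self.halves[self.write_half ^ 1])

    def swap(self):
        """Exchange the roles of the two halves."""
        self.write_half ^= 1


def process_stream(blocks, half_size, process):
    """Feed blocks through the buffer, processing block n-1 while block n arrives."""
    blocks = list(blocks)
    if not blocks:
        return []
    buf = PingPongBuffer(half_size)
    results = []
    for n, block in enumerate(blocks):
        buf.write_block(block)                         # producer fills one half
        if n > 0:
            results.append(process(buf.read_block()))  # consumer drains the other
        buf.swap()
    results.append(process(buf.read_block()))          # drain the final block
    return results


print(process_stream([[1, 2], [3, 4], [5, 6]], 2, sum))  # [3, 7, 11]
```

Because each half is owned by only one side at any moment, neither side has to wait for the other to finish a block, which is what lets the two processors work as pipeline stages.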


2021 ◽  
Vol 17 (2) ◽  
pp. 1-23
Author(s):  
Saman Biookaghazadeh ◽  
Pravin Kumar Ravi ◽  
Ming Zhao

High-throughput and low-latency Convolutional Neural Network (CNN) inference is increasingly important for many cloud- and edge-computing applications. FPGA-based acceleration of CNN inference has demonstrated various benefits compared to other high-performance devices such as GPGPUs. Current FPGA CNN-acceleration solutions are based on a single FPGA design, which are limited by the available resources on an FPGA. In addition, they can only accelerate conventional 2D neural networks. To address these limitations, we present a generic multi-FPGA solution, written in OpenCL, which can accelerate more complex CNNs (e.g., C3D CNN) and achieve a near linear speedup with respect to the available single-FPGA solutions. The design is built upon the Intel Deep Learning Accelerator architecture, with three extensions. First, it includes updates for better area efficiency (up to 25%) and higher performance (up to 24%). Second, it supports 3D convolutions for more challenging applications such as video learning. Third, it supports multi-FPGA communication for higher inference throughput. The results show that utilizing multiple FPGAs can linearly increase the overall bandwidth while maintaining the same end-to-end latency. In addition, the design can outperform other FPGA 2D accelerators by up to 8.4 times and 3D accelerators by up to 1.7 times.


2003 ◽  
Vol 12 (04) ◽  
pp. 505-518 ◽  
Author(s):  
NOBUAKI TAKAHASHI ◽  
TSUYOSHI OTAKE ◽  
MAMORU TANAKA

Recently, discrete-time cellular neural networks (DT-CNNs) have been applied to many image processing applications such as compression and reconstruction, recognition and so on. Conventional image processing techniques such as the discrete cosine transform (DCT) and wavelet transforms work as simple filters and do not make good use of the interpolative dynamics provided by the feedback A template, which is one of the significant characteristics of a cellular neural network (CNN). If a CNN is used as a filter with only a feedforward B template, one has to build a model consisting of digital filters using high-speed signal processing modules such as a high-speed digital signal processor. This paper describes the nonlinear interpolative effect of the feedback A template by evaluating image compression and reconstruction.

