DWM: A Decomposable Winograd Method for Convolution Acceleration

Winograd's minimal filtering algorithm has been widely used in Convolutional Neural Networks (CNNs) to reduce the number of multiplications for faster processing. However, it is only effective on convolutions with a 3x3 kernel and a stride of 1: for kernels larger than 3x3 it suffers from significantly increased FLOPs and numerical accuracy problems, and it fails on convolutions with a stride larger than 1. In this paper, we propose a novel Decomposable Winograd Method (DWM), which breaks through the limitations of the original Winograd minimal filtering algorithm and extends it to general convolutions. DWM decomposes kernels with a large size or a large stride into several small kernels with a stride of 1, to which the Winograd method is then applied, so that DWM can reduce the number of multiplications while preserving numerical accuracy. It enables fast exploration of larger kernel sizes and larger stride values in CNNs for high performance and accuracy, and may even open the way to new CNNs. Compared with the original Winograd algorithm, the proposed DWM supports all kinds of convolutions with a speedup of about 2x, without affecting numerical accuracy.
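The decomposition idea in this abstract can be illustrated in one dimension. The following is a minimal NumPy sketch, not the authors' implementation: the function names (`corr1d`, `large_kernel_by_parts`, `strided_by_phases`) are hypothetical, and the splitting scheme shown (contiguous chunks for large kernels, interleaved phases for large strides) is an assumption about how such a decomposition could work. Each sub-problem is a small stride-1 correlation, which is the form a Winograd kernel could then handle.

```python
import numpy as np

def corr1d(x, w, stride=1):
    """Direct 'valid' 1D cross-correlation (the CNN convention)."""
    n_out = (len(x) - len(w)) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + len(w)], w)
                     for i in range(n_out)])

def large_kernel_by_parts(x, w, part=3):
    """Assumed scheme: split w into chunks of at most `part` taps and
    sum the stride-1 results, each computed on a shifted input view."""
    n_out = len(x) - len(w) + 1
    y = np.zeros(n_out)
    for start in range(0, len(w), part):
        chunk = w[start:start + part]
        # The chunk starting at tap `start` sees the input shifted by `start`.
        y += corr1d(x[start:start + n_out + len(chunk) - 1], chunk)
    return y

def strided_by_phases(x, w, stride=2):
    """Assumed scheme: split input and kernel into `stride` interleaved
    phases; each phase pair is an ordinary stride-1 correlation."""
    n_out = (len(x) - len(w)) // stride + 1
    y = np.zeros(n_out)
    for p in range(min(stride, len(w))):
        # Keep only the first n_out outputs; a phase may yield one extra.
        y += corr1d(x[p::stride], w[p::stride])[:n_out]
    return y
```

Both helpers reproduce the direct result exactly up to floating-point rounding, because they only regroup the terms of the same sum.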

High Performance Implementation of 3D Convolutional Neural Networks on a GPU

Computational Intelligence and Neuroscience ◽

10.1155/2017/8348671 ◽

2017 ◽

Vol 2017 ◽

pp. 1-8 ◽

Cited By ~ 8

Author(s):

Qiang Lan ◽

Zelong Wang ◽

Mei Wen ◽

Chunyuan Zhang ◽

Yijie Wang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Networks ◽

High Performance ◽

The Other ◽

Memory Requirement ◽

Video Classification ◽

Filtering Algorithm ◽

Speed Up ◽

The Cost

Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural networks to video classification, which involves 3D input and requires far more memory and computation. FFT-based methods can reduce the amount of computation, but this generally comes at the cost of an increased memory requirement. The Winograd Minimal Filtering Algorithm (WMFA), on the other hand, can reduce the number of operations required and thus speed up the computation without increasing the required memory. This strategy was shown to be successful for 2D neural networks. We implement the algorithm for 3D convolutional neural networks, apply it to a popular 3D convolutional neural network used for video classification, and compare it to cuDNN. For our highly optimized implementation of the algorithm, we observe a twofold speedup for most of the 3D convolution layers of our test network compared to the cuDNN version.
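To make the operation saving concrete, here is a minimal NumPy sketch of the standard 1D Winograd building block F(2,3), which produces two outputs of a 3-tap filter with 4 elementwise multiplications instead of 6. It is not the GPU implementation described above; the function names are hypothetical, and the 2D and 3D variants are assumed to be built by nesting this transform along each axis.

```python
import numpy as np

# Standard F(2,3) transform matrices.
BT = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs from a 4-sample tile d and a 3-tap filter g."""
    return AT @ ((G @ g) * (BT @ d))

def corr1d_winograd(x, g):
    """'Valid' cross-correlation built from overlapping F(2,3) tiles.
    Assumes len(x) - 2 is even so the tiles cover the output exactly."""
    n_out = len(x) - 2
    assert n_out % 2 == 0, "sketch handles even output lengths only"
    u = G @ g  # filter transform, computed once and reused per tile
    y = np.empty(n_out)
    for i in range(0, n_out, 2):
        y[i:i + 2] = AT @ (u * (BT @ x[i:i + 4]))
    return y
```

The filter transform `G @ g` is shared across all tiles, so the per-tile cost is dominated by the 4 multiplications in the elementwise product.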

USE OF CONVOLUTIONAL NEURAL NETWORKS FOR X-RAY IMAGE ORIENTATION DETERMINATION

10.46793/iccbi21.263bs ◽

2021 ◽

Author(s):

Sandi Baressi Šegota ◽

Simon Lysdahlgaard ◽

Søren Hess ◽

Ronald Antulov

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

High Performance ◽

Data Augmentation ◽

Classification Model ◽

Orientation Sensitivity ◽

X Ray ◽

Artificial Neural Network Ann ◽

Image Orientation ◽

Single Orientation

It has been shown many times that Artificial Intelligence (AI) based algorithms achieve high performance on image classification tasks. Still, certain issues exist with the application of machine learning (ML) algorithms based on artificial neural networks (ANNs). The best known is the need for a large amount of statistically varied data, which can be addressed with expanded collection or data augmentation, but other issues are also present. Convolutional neural networks (CNNs) show extremely high performance on image-shaped data, yet they have a major weakness: sensitivity to image orientation. Previous research shows that varying the orientation of images may greatly lower the performance of a trained CNN. This is especially problematic in certain applications, such as X-ray radiography, an example of which is presented here. Previous research shows that the performance of CNNs is higher when used on images in a single orientation (left or right), as opposed to a combination of both. This means that the data needs to be differentiated before it enters the classification model. In this paper, a CNN-based model for differentiating between left-oriented and right-oriented images is presented. Multiple CNNs are trained and tested, with the highest performing being the VGG16 architecture, which achieved an accuracy of 0.99 (+/- 0.01) and an AUC of 0.98 (+/- 0.01). These results show that CNNs can be used to address the issue of orientation sensitivity by splitting the data before it is used in classification models.

Analysis and Design of High Performance Deep Learning Algorithm: Convolutional Neural Networks

International Journal of Engineering Trends and Technology ◽

10.14445/22315381/ijett-v69i6p231 ◽

2021 ◽

Vol 69 (6) ◽

pp. 216-224

Author(s):

Sunil Pandey ◽

Naresh Kumar Nagwani ◽

Shrish Verma

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

High Performance ◽

Learning Algorithm ◽

Analysis And Design ◽

Deep Learning Algorithm

A High-Performance Reconfigurable Accelerator for Convolutional Neural Networks

Proceedings of the 3rd International Conference on Multimedia Systems and Signal Processing - ICMSSP '18 ◽

10.1145/3220162.3220178 ◽

2018 ◽

Author(s):

Boya Zhao ◽

Jingqun Li ◽

Hongli Pan ◽

Mingjiang Wang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

High Performance

Computing Solution for the Recognition of Basic Actions of Violence in Real Time, from the use of Convolutional Neural Networks, Video Sequences and High Performance Computing

2019 XLV Latin American Computing Conference (CLEI) ◽

10.1109/clei47609.2019.235100 ◽

2019 ◽

Author(s):

Almendra Prisila Laureano Lumba ◽

Roy Roger Rios Nunez ◽

Isaac Ocampo Yahuarcani ◽

Rodolfo Cardenas Vigo ◽

Carlos Alberto Garcia Cortegano ◽

...

Keyword(s):

Neural Networks ◽

High Performance Computing ◽

Real Time ◽

Convolutional Neural Networks ◽

High Performance ◽

Video Sequences ◽

Performance Computing

A high performance FPGA-based accelerator for large-scale convolutional neural networks

2016 26th International Conference on Field Programmable Logic and Applications (FPL) ◽

10.1109/fpl.2016.7577308 ◽

2016 ◽

Cited By ~ 9

Author(s):

Huimin Li ◽

Xitian Fan ◽

Li Jiao ◽

Wei Cao ◽

Xuegong Zhou ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

High Performance ◽

Large Scale

A Case Study of Quantizing Convolutional Neural Networks for Fast Disease Diagnosis on Portable Medical Devices

Sensors ◽

10.3390/s22010219 ◽

2021 ◽

Vol 22 (1) ◽

pp. 219

Author(s):

Mukhammed Garifulla ◽

Juncheol Shin ◽

Chanho Kim ◽

Won Hwa Kim ◽

Hye Jung Kim ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Medical Devices ◽

Energy Efficient ◽

High Performance ◽

Medical Image Analysis ◽

Disease Diagnosis ◽

Mobile System ◽

Performance Computing

Recently, attention to convolutional neural networks (CNNs) in medical image analysis has increased rapidly, since they can analyze and classify images faster and more accurately than humans. As a result, CNNs are becoming more popular and serve as a supplementary assistant for healthcare professionals. Running a CNN on portable medical devices can enable convenient and accurate disease diagnosis. However, CNNs require high-performance computing resources, as they involve a significant amount of computation to process big data. Their use on portable medical devices with limited computing resources is therefore restricted. This paper discusses network quantization techniques that reduce the size of CNN models and enable fast CNN inference with an energy-efficient CNN accelerator integrated into recent mobile processors. With extensive experiments, we show that the quantization technique reduces inference time by 97% on a mobile system integrating a CNN acceleration engine.
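The abstract does not say which quantization scheme was used. As a rough illustration of how quantization shrinks a model, the sketch below assumes symmetric per-tensor 8-bit quantization of a weight array; the function names `quantize_int8` and `dequantize` are hypothetical and are not taken from the paper or from any particular framework.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization (assumed scheme):
    map floats to int8 using a single scale factor."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float64) * scale
```

Storing `q` takes one byte per weight instead of four for float32, and the reconstruction error per weight is bounded by half the scale, since values are rounded to the nearest integer code and the largest magnitude maps exactly to 127.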

A review of the use of convolutional neural networks in agriculture

The Journal of Agricultural Science ◽

10.1017/s0021859618000436 ◽

2018 ◽

Vol 156 (3) ◽

pp. 312-322 ◽

Cited By ~ 37

Author(s):

A. Kamilaris ◽

F. X. Prenafeta-Boldú

Keyword(s):

Neural Networks ◽

Image Processing ◽

Convolutional Neural Networks ◽

High Performance ◽

Performance Metrics ◽

Specific Class ◽

Data Set ◽

Advantages And Disadvantages ◽

Future Potential ◽

Processing Techniques

Deep learning (DL) constitutes a modern technique for image processing, with large potential. Having been successfully applied in various areas, it has recently also entered the domain of agriculture. In the current paper, a survey was conducted of research efforts that employ convolutional neural networks (CNN), which constitute a specific class of DL, applied to various agricultural and food production challenges. The paper examines agricultural problems under study, models employed, sources of data used and the overall precision achieved according to the performance metrics used by the authors. Convolutional neural networks are compared with other existing techniques, and the advantages and disadvantages of using CNN in agriculture are listed. Moreover, the future potential of this technique is discussed, together with the authors’ personal experiences after employing CNN to approximate a problem of identifying missing vegetation from a sugar cane plantation in Costa Rica. The overall findings indicate that CNN constitutes a promising technique with high performance in terms of precision and classification accuracy, outperforming existing commonly used image-processing techniques. However, the success of each CNN model is highly dependent on the quality of the data set used.

Direct micro-seismic event location and characterization from passive seismic data using convolutional neural networks

Geophysics ◽

10.1190/geo2020-0636.1 ◽

2021 ◽

pp. 1-77

Author(s):

Hanchen Wang ◽

Tariq Alkhalifah

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Networks ◽

Event Detection ◽

Seismic Data ◽

High Performance ◽

Input Data ◽

Waveform Inversion ◽

Computational Cost ◽

Seismic Events

The ample size of time-lapse data often requires significant event detection and source location efforts, especially in areas like shale gas exploration regions where a large number of micro-seismic events are often recorded. In many cases, the real-time monitoring and locating of these events are essential to production decisions. Conventional methods face considerable drawbacks. For example, traveltime-based methods require traveltime picking of often noisy data, while migration and waveform inversion methods require expensive wavefield solutions and event detection. Both tasks require some human intervention, and this becomes a big problem when too many sources need to be located, which is common in micro-seismic monitoring. Machine learning has recently been used to identify micro-seismic events or locate their sources once they are identified and picked. We propose to use a novel artificial neural network framework to directly map seismic data, without any event picking or detection, to their potential source locations. We train two convolutional neural networks on labeled synthetic acoustic data containing simulated micro-seismic events to fulfill such requirements. One convolutional neural network, which has a global average pooling layer to reduce the computational cost while maintaining high-performance levels, aims to classify the number of events in the data. The other network predicts the source locations and other source features such as the source peak frequencies and amplitudes. To reduce the size of the input data to the network, we correlate the recorded traces with a central reference trace to allow the network to focus on the curvature of the input data near the zero-lag region. We train the networks to handle single-event, multi-event, and no-event segments extracted from the data. Tests on a simple vertically varying model and a more realistic Otway field model demonstrate the approach's versatility and potential.
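The input-reduction step described above (correlating each trace with a central reference trace and keeping the region near zero lag) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' preprocessing code: the function name `correlate_with_reference`, the choice of the middle trace as reference, and the `half_window` parameter are all hypothetical.

```python
import numpy as np

def correlate_with_reference(traces, half_window):
    """Cross-correlate every trace with the central trace and keep
    only the lags in [-half_window, +half_window] around zero lag.

    traces: array of shape (n_traces, n_samples)
    returns: array of shape (n_traces, 2 * half_window + 1)
    """
    n_traces, n_samples = traces.shape
    ref = traces[n_traces // 2]       # assumed: middle trace is the reference
    zero_lag = n_samples - 1          # index of lag 0 in 'full' mode output
    out = np.empty((n_traces, 2 * half_window + 1))
    for i, trace in enumerate(traces):
        full = np.correlate(trace, ref, mode="full")
        out[i] = full[zero_lag - half_window:zero_lag + half_window + 1]
    return out
```

After this step, the peak position in each row encodes the arrival-time difference relative to the reference trace, so the moveout curvature is concentrated in a narrow window regardless of the absolute event time.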

Quality Assessment of Tire Shearography Images via Ensemble Hybrid Faster Region-Based ConvNets

Electronics ◽

10.3390/electronics9010045 ◽

2019 ◽

Vol 9 (1) ◽

pp. 45 ◽

Cited By ~ 2

Author(s):

Chuan-Yu Chang ◽

Kathiravan Srinivasan ◽

Wei-Chun Wang ◽

Ganapathy Pattukandan Ganapathy ◽

Durai Raj Vincent ◽

...

Keyword(s):

Neural Networks ◽

Quality Assessment ◽

Convolutional Neural Networks ◽

False Positive ◽

High Performance ◽

Learning Approaches ◽

Positive Ratio ◽

False Positive Ratio ◽

Proposed Model ◽

Human Eyes

In recent times, enabling technologies such as digital shearography have been combined with deep learning approaches for the smart quality assessment of tires, leading to intelligent tire manufacturing practices with automated defect detection. Digital shearography is a prominent approach for identifying defects in tires that are usually not visible to human eyes. In this research, the bubble defects in tire shearography images are detected using an ensemble hybrid that combines convolutional neural networks (ConvNets) with high-performance Faster Region-based convolutional neural networks. Region-proposal generation and object detection are both accomplished using the ConvNets. First, sliding-window-based ConvNets are used in the proposed model to divide the input shearography images into regions in order to identify the bubble defects. Next, the Faster Region-based ConvNets are applied to identify the bubble defects in the tire shearography images, which also helps to minimize the false-positive ratio (sometimes referred to as the false alarm ratio). The experimental results show that the proposed hybrid model achieves 100 percent detection of bubble defects in the tire shearography images, and that the false-positive ratio is reduced to 18 percent.
