LdsConv: Learned Depthwise Separable Convolutions by Group Pruning

Sensors ◽  
2020 ◽  
Vol 20 (15) ◽  
pp. 4349
Author(s):  
Wenxiang Lin ◽  
Yan Ding ◽  
Hua-Liang Wei ◽  
Xinglin Pan ◽  
Yutong Zhang

Standard convolutional filters usually capture unnecessary overlap of features, which wastes computational cost. In this paper, we aim to solve this problem by proposing a novel Learned Depthwise Separable Convolution (LdsConv) operation that is compact yet retains a strong capacity for learning. It integrates the pruning technique into the design of convolutional filters and is formulated as a generic convolutional unit that can directly replace standard convolutions without any adjustment of the architecture. To show the effectiveness of the proposed method, experiments are carried out on state-of-the-art convolutional neural networks (CNNs), including ResNet, DenseNet, SE-ResNet and MobileNet. The results show that simply replacing the original convolutions with LdsConv in these CNNs achieves significantly improved accuracy while reducing computational cost. For ResNet50, the FLOPs can be reduced by 40.9% while the accuracy on ImageNet increases.
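
LdsConv is described as a drop-in replacement for standard convolutions built on the depthwise separable factorization. The minimal PyTorch sketch below shows only that underlying factorization (a depthwise convolution followed by a 1x1 pointwise convolution); the learned group-pruning step that distinguishes LdsConv is omitted, and the class name and parameters are illustrative, not the authors' implementation.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Plain depthwise separable convolution: a depthwise conv followed by a
    1x1 pointwise conv. LdsConv additionally *learns* which filters to keep
    via group pruning; that learned-pruning step is omitted in this sketch."""

    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1):
        super().__init__()
        padding = kernel_size // 2
        # One spatial filter per input channel (groups == in_channels).
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   stride=stride, padding=padding,
                                   groups=in_channels, bias=False)
        # 1x1 conv mixes information across channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# Drop-in replacement for a standard 3x3 convolution inside an existing block:
# conv = DepthwiseSeparableConv(256, 256)  # instead of nn.Conv2d(256, 256, 3, padding=1)
```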

Sensors ◽  
2021 ◽  
Vol 21 (20) ◽  
pp. 6808
Author(s):  
Jianqiang Xiao ◽  
Dianbo Ma ◽  
Satoshi Yamane

Although recent stereo matching algorithms achieve significant results on public benchmarks, their heavy computational cost remains an unsolved problem. Most works focus on designing an architecture to reduce the computational complexity, whereas we aim at optimizing the 3D convolution kernels of the Pyramid Stereo Matching Network (PSMNet). In this paper, we design a series of comparative experiments exploring the performance of well-known convolution kernels on PSMNet. Our model reduces the computational complexity from 256.66G MAdd (multiply-add operations) to 69.03G MAdd (from 198.47G MAdd to 10.84G MAdd when considering only the 3D convolutional layers) without losing accuracy. On the Scene Flow and KITTI 2015 datasets, our model achieves results comparable to the state of the art at a low computational cost.
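
The abstract reports swapping the full 3D convolution kernels of PSMNet for cheaper, well-known kernels. As one illustrative possibility (not necessarily the configuration the paper settles on), the PyTorch sketch below contrasts a full 3x3x3 kernel with a (2+1)D-style separable factorization and notes the per-voxel multiply-add counts.

```python
import torch.nn as nn

def standard_3d(channels):
    # Full 3x3x3 kernel: 27 * C_in * C_out multiply-adds per output voxel.
    return nn.Conv3d(channels, channels, kernel_size=3, padding=1, bias=False)

def separable_3d(channels):
    # (2+1)D-style factorization: a 1x3x3 spatial kernel followed by a
    # 3x1x1 kernel along the disparity axis -> (9 + 3) * C_in * C_out MAdd,
    # roughly 2.25x fewer than the full 3x3x3 kernel.
    return nn.Sequential(
        nn.Conv3d(channels, channels, kernel_size=(1, 3, 3),
                  padding=(0, 1, 1), bias=False),
        nn.Conv3d(channels, channels, kernel_size=(3, 1, 1),
                  padding=(1, 0, 0), bias=False),
    )
```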


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Md Zahangir Alom ◽  
Paheding Sidike ◽  
Mahmudul Hasan ◽  
Tarek M. Taha ◽  
Vijayan K. Asari

In spite of advances in object recognition technology, handwritten Bangla character recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwriting. Even many advanced existing methods do not achieve satisfactory performance on HBCR in practice. In this paper, a set of state-of-the-art deep convolutional neural networks (DCNNs) is discussed and their performance on HBCR is systematically evaluated. The main advantage of DCNN approaches is that they can extract discriminative features from raw data and represent them with a high degree of invariance to object distortions. The experimental results show the superior performance of DCNN models compared with other popular object recognition approaches, which suggests that DCNNs can be good candidates for building an automatic HBCR system for practical applications.
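
For readers unfamiliar with the kind of model being compared, the sketch below shows a generic small DCNN character classifier; the 32x32 grayscale input and 50 output classes are illustrative assumptions, not the configuration of any network evaluated in the paper.

```python
import torch.nn as nn

# A generic small DCNN for handwritten-character classification (illustrative only).
hbcr_cnn = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # -> 32 x 16 x 16
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # -> 64 x 8 x 8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 256), nn.ReLU(),
    nn.Linear(256, 50),  # one logit per character class (assumed class count)
)
```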


2022 ◽  
pp. 1-10
Author(s):  
Daniel Trevino-Sanchez ◽  
Vicente Alarcon-Aquino

The need to detect and classify objects correctly is a constant challenge; recognizing them at different scales and in different scenarios, sometimes cropped or badly lit, is not an easy task. Convolutional neural networks (CNNs) have become a widely applied technique since they are completely trainable and well suited to feature extraction. However, the growing number of CNN applications constantly demands further accuracy improvements. Initially, those improvements involved the use of large datasets, augmentation techniques, and complex algorithms, which may carry a high computational cost. Nevertheless, feature extraction is known to be the heart of the problem. As a result, other approaches combine different technologies to extract better features and improve accuracy without the need for more powerful hardware. In this paper, we propose a hybrid pooling method that incorporates multiresolution analysis within the CNN layers to reduce the feature map size without losing details. To prevent relevant information from being lost during the downsampling process, an existing pooling method is combined with the wavelet transform, keeping those details "alive" and enriching later stages of the CNN. Better-quality features improve CNN accuracy. To validate this study, ten pooling methods, including the proposed model, are tested on four benchmark datasets. The results are compared with four of the evaluated methods, which are also considered the state of the art.
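
The following PyTorch sketch illustrates the general idea of such a hybrid pooling layer: a single-level Haar decomposition preserves low-frequency content and detail energy, and the result is fused with ordinary max pooling. The fusion rule and normalization here are assumptions made for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridWaveletPool(nn.Module):
    """Illustrative hybrid pooling: a single-level 2D Haar transform keeps the
    low-pass band plus detail energy, and the result is added to a conventional
    max-pooled map so fine details stay "alive" after downsampling."""

    def forward(self, x):                 # x: (N, C, H, W) with even H, W
        # Split each 2x2 block into its four corners.
        a = x[..., 0::2, 0::2]
        b = x[..., 0::2, 1::2]
        c = x[..., 1::2, 0::2]
        d = x[..., 1::2, 1::2]
        ll = (a + b + c + d) / 2          # Haar approximation (low-pass) band
        detail = (torch.abs(a - b) + torch.abs(a - c) + torch.abs(a - d)) / 2
        pooled = F.max_pool2d(x, 2)       # conventional max pooling
        # Fuse the branches so the downsampled map retains detail information.
        return pooled + ll + detail

# Example: halves a 1x16x32x32 feature map to 1x16x16x16.
# y = HybridWaveletPool()(torch.randn(1, 16, 32, 32))
```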


Author(s):  
Yang Yi ◽  
Feng Ni ◽  
Yuexin Ma ◽  
Xinge Zhu ◽  
Yuankai Qi ◽  
...  

State-of-the-art hand gesture recognition methods have investigated spatiotemporal features based on 3D convolutional neural networks (3DCNNs) or convolutional long short-term memory (ConvLSTM). However, they often suffer from inefficiency due to the high computational complexity of their network structures. In this paper, we focus instead on 1D convolutional neural networks and propose a simple and efficient architectural unit, the Multi-Kernel Temporal Block (MKTB), that models multi-scale temporal responses by explicitly applying different temporal kernels. We then present a Global Refinement Block (GRB), an attention module that shapes the global temporal features based on cross-channel similarity. By incorporating the MKTB and GRB, our architecture can effectively explore spatiotemporal features within a tolerable computational cost. Extensive experiments on public datasets demonstrate that our proposed model achieves state-of-the-art accuracy with higher efficiency. Moreover, the proposed MKTB and GRB are plug-and-play modules, and experiments on other tasks, such as video understanding and video-based person re-identification, also demonstrate their efficiency and generalization capability.
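
As an illustration of the multi-kernel idea, the PyTorch sketch below runs parallel 1D convolutions with different temporal kernel sizes and concatenates their responses; the kernel sizes and the concatenation-based fusion are assumptions, not the paper's exact MKTB configuration.

```python
import torch
import torch.nn as nn

class MultiKernelTemporalBlock(nn.Module):
    """Sketch of a multi-kernel temporal block: parallel 1D convolutions with
    different temporal kernel sizes capture multi-scale responses, and their
    outputs are concatenated along the channel axis."""

    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        branch_out = channels // len(kernel_sizes)
        self.branches = nn.ModuleList(
            nn.Conv1d(channels, branch_out, k, padding=k // 2, bias=False)
            for k in kernel_sizes
        )

    def forward(self, x):          # x: (batch, channels, time)
        return torch.cat([branch(x) for branch in self.branches], dim=1)

# y = MultiKernelTemporalBlock(96)(torch.randn(8, 96, 64))  # -> (8, 96, 64)
```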


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Vinicius Luiz Pacheco ◽  
Lucimara Bragagnolo ◽  
Antonio Thomé

Purpose: The purpose of this article is to analyze the state of the art in a systematic way, identifying the main research groups and their related topics. The types of studies found are fundamental for understanding the application of artificial neural networks (ANNs) to cemented soils and the potential of the technique, as well as the feasibility of extrapolation to new geotechnical or civil and environmental engineering segments.

Design/methodology/approach: This work is a bibliometric and systematic review with an exploratory perspective on the state of the art. It includes qualitative and quantitative analysis of data on cemented soil improvement and on biocemented or microbially induced calcite precipitation (MICP) soil improvement predicted or modeled by ANNs. The study compiled the state of the art of the topic, which makes a critical view of the theme possible. Two main databases were analyzed: Scopus and Web of Science. Systematic review techniques, as well as bibliometric indicators, were applied.

Findings: This paper maps the network between research achievements and illustrates the main applications of ANNs in soil improvement prediction, specifically for cement-based soils and biocemented soils (e.g. the MICP technique). As a bibliometric and systematic review, it also identifies the key gaps in soil-ANN research and the lack of exploratory studies to be addressed in the near future.

Research limitations/implications: Given the research topic, the article suggests other applications of ANNs in geotechnical engineering, such as tests beyond those related to geomechanical resistance like the unconfined compression test and the triaxial test.

Practical implications: This article systematically and critically points out interesting directions for future research, such as the still unexplored use of ANNs in biocementation processes such as MICP.

Social implications: Regarding the social environment, the paper discusses methods that reduce the computational effort, or the elements necessary for geotechnical improvement of the soil, thereby optimizing the process.

Originality/value: Neural networks have been studied in engineering for a long time, but current computational power has increased their implementation in several engineering applications. In addition, soil cementation is a widespread technique whose prediction models often demand heavy computation; these demands can be mitigated with ANNs, because this form of artificial intelligence learns from the data set, reducing computational cost and increasing accuracy.


Author(s):  
Byungmin Ahn ◽  
Taewhan Kim

A new algorithm for extracting common kernels and convolutions to maximally eliminate the redundant operations among the convolutions in binary- and ternary-weight convolutional neural networks is presented. Precisely, we propose (1) a new algorithm for common kernel extraction that overcomes the local and limited exploration of common kernel candidates by the existing method, and subsequently apply (2) a new concept of common convolution extraction to maximally eliminate the redundancy in the convolution operations. In addition, our algorithm can (3) be tuned to minimize the number of resulting kernels for the convolutions, thereby saving the total memory access latency for kernels. Experimental results on ternary-weight VGG-16 demonstrate that our convolution optimization algorithm is very effective, reducing the total number of operations for all convolutions by [Formula: see text], and thereby reducing the total number of execution cycles on the hardware platform by 22.4% while using [Formula: see text] fewer kernels than the convolutions utilizing the common kernels extracted by the state-of-the-art algorithm.
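
The toy NumPy example below shows why extracting a common kernel removes redundant operations: the partial sum for the shared ternary sub-kernel is computed once and reused by both output filters. The two kernels and the agreement-based extraction rule are purely illustrative and do not reproduce the proposed algorithm.

```python
import numpy as np

# Two ternary 3x3 kernels (weights in {-1, 0, +1}) that overlap heavily.
k1 = np.array([[ 1, 0, -1],
               [ 1, 0, -1],
               [ 1, 0, -1]])
k2 = np.array([[ 1, 0, -1],
               [ 1, 0, -1],
               [ 0, 1, -1]])

# Common kernel: positions where both kernels agree; residuals hold the rest.
common = np.where(k1 == k2, k1, 0)
r1, r2 = k1 - common, k2 - common

def conv_at(patch, kernel):
    """Dot product of a 3x3 input patch with a (sparse) ternary kernel."""
    return float(np.sum(patch * kernel))

patch = np.random.randn(3, 3)
shared = conv_at(patch, common)          # computed once, reused twice
out1 = shared + conv_at(patch, r1)       # equals conv_at(patch, k1)
out2 = shared + conv_at(patch, r2)       # equals conv_at(patch, k2)
assert np.isclose(out1, conv_at(patch, k1)) and np.isclose(out2, conv_at(patch, k2))
```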


Author(s):  
Dariush Salami ◽  
Saeedeh Momtazi

Deep neural networks have been widely used in various language processing tasks. Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are two common types of neural networks with a successful history of capturing the temporal and spatial features of texts. Using an RNN, we can encode the input text into a lower-dimensional space of semantic features while considering the sequential behavior of words. Using a CNN, we can transfer the representation of the input text to a flat structure to be used for classifying the text. In this article, we propose a novel recurrent CNN model that captures not only the temporal but also the spatial features of the input poem/verse for poet identification. Considering the shortcomings of plain RNNs, we try both long short-term memory (LSTM) and gated recurrent unit (GRU) cells in the proposed architecture and apply them to the poet identification task. There are a large number of poems in the history of literature whose poets are unknown. Given the importance of the task in the information processing field, a great variety of methods, from traditional learning models such as support vector machines and logistic regression to deep neural network models such as CNNs, have been proposed to address this problem. Our experiments show that the proposed model significantly outperforms the state-of-the-art models for poet identification when receiving either a poem or a single verse as input. Compared with the state-of-the-art CNN model, we achieve 9% and 4% improvements in F-measure for the poem- and verse-based tasks, respectively.
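
A minimal PyTorch sketch of a recurrent-CNN text classifier of this general shape is given below: a GRU encodes the token sequence, a 1D convolution extracts local patterns over the recurrent states, and max-over-time pooling feeds a linear poet classifier. The dimensions and layer choices are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class RecurrentCNNClassifier(nn.Module):
    """Sketch of a recurrent-CNN classifier: GRU -> 1D conv over hidden states
    -> max-over-time pooling -> linear layer producing poet logits."""

    def __init__(self, vocab_size, num_poets, emb_dim=128, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.conv = nn.Conv1d(2 * hidden, 256, kernel_size=3, padding=1)
        self.classifier = nn.Linear(256, num_poets)

    def forward(self, tokens):                  # tokens: (batch, seq_len)
        h, _ = self.gru(self.embed(tokens))     # (batch, seq_len, 2*hidden)
        feats = torch.relu(self.conv(h.transpose(1, 2)))  # (batch, 256, seq_len)
        pooled = feats.max(dim=2).values        # max over time
        return self.classifier(pooled)          # poet logits
```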


Author(s):  
Yang He ◽  
Guoliang Kang ◽  
Xuanyi Dong ◽  
Yanwei Fu ◽  
Yi Yang

This paper proposes a Soft Filter Pruning (SFP) method to accelerate the inference procedure of deep Convolutional Neural Networks (CNNs). Specifically, the proposed SFP enables the pruned filters to be updated when training the model after pruning. SFP has two advantages over previous works. (1) Larger model capacity: updating previously pruned filters provides our approach with a larger optimization space than fixing the filters to zero, so the network trained by our method has a larger capacity to learn from the training data. (2) Less dependence on the pre-trained model: the larger capacity enables SFP to train from scratch and prune the model simultaneously, whereas previous filter pruning methods have to be conducted on the basis of a pre-trained model to guarantee their performance. Empirically, SFP from scratch outperforms the previous filter pruning methods. Moreover, our approach has been demonstrated to be effective for many advanced CNN architectures. Notably, on ILSVRC-2012, SFP reduces more than 42% of the FLOPs of ResNet-101 with a 0.2% top-5 accuracy improvement, which advances the state of the art. Code is publicly available on GitHub: https://github.com/he-y/softfilter-pruning
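
The sketch below illustrates the soft-pruning step described here: after each training epoch, the weakest filters in every convolutional layer are set to zero but remain trainable, so they can be updated in later epochs. Ranking filters by their L2 norm and the 30% pruning ratio are assumptions of this sketch rather than the paper's exact settings.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def soft_prune(model, prune_ratio=0.3):
    """Zero the lowest-norm filters in every Conv2d layer. The zeroed filters
    stay in the computation graph and keep receiving gradient updates, so they
    may recover in later epochs (the "soft" part of soft filter pruning)."""
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight                      # (out_ch, in_ch, k, k)
            norms = w.flatten(1).norm(p=2, dim=1)  # one norm per filter
            n_prune = int(prune_ratio * w.size(0))
            if n_prune > 0:
                idx = norms.argsort()[:n_prune]    # weakest filters
                w[idx] = 0.0                       # soft: weights remain trainable

# Typical loop: train one epoch, call soft_prune(model), then continue training.
```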


Author(s):  
Anish Mankotia ◽  
Meenu Garg

In this paper, we propose a novel semantic segmentation approach based on the BodyPix module of TensorFlow.js, which can keep up with the accuracy of state-of-the-art approaches while running in real time. The solution follows a convolutional neural network pipeline, with each step in the workflow enhanced by additional information from semantic segmentation. Therefore, we introduce several improvements to computation, aggregation, and optimization by adapting existing techniques to integrate the additional surface information given by each semantic class. The BodyPix model is trained with a CNN, ResNet-50, a network family that can work with more than 150 layers while mitigating the vanishing-gradient problem. Using this network, our BodyPix module creates a more accurate and well-defined segmentation and also supports multi-person segmentation.
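
For illustration only, the sketch below shows per-pixel person segmentation with a ResNet-50 backbone using PyTorch/torchvision (DeepLabV3); this is a swapped-in stand-in for the TensorFlow.js BodyPix API used in the paper, and the two-class setup and input size are assumptions.

```python
import torch
from torchvision.models.segmentation import deeplabv3_resnet50

# ResNet-50-backbone per-pixel person/background classification (illustrative,
# not the TensorFlow.js BodyPix implementation described in the paper).
model = deeplabv3_resnet50(num_classes=2)   # classes: background, person
model.eval()

image = torch.randn(1, 3, 480, 640)         # stand-in for a normalized RGB frame
with torch.no_grad():
    logits = model(image)["out"]            # (1, 2, 480, 640) per-pixel scores
mask = logits.argmax(dim=1)                 # 1 where a person is predicted
```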

