Filter Pruning via Measuring Feature Map Information

Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6601
Author(s):  
Linsong Shao ◽  
Haorui Zuo ◽  
Jianlin Zhang ◽  
Zhiyong Xu ◽  
Jinzhen Yao ◽  
...  

Neural network pruning, an important method to reduce the computational complexity of deep models, can be well applied to devices with limited resources. However, most current methods focus on some kind of information about the filter itself to prune the network and rarely explore the relationship between the feature maps and the filters. In this paper, two novel pruning methods are proposed. First, a new pruning method is proposed that reflects the importance of filters by exploring the information in the feature maps. Based on the premise that the more information a feature map contains, the more important it is, the information entropy of the feature maps is used to measure their information content, which in turn evaluates the importance of each filter in the current layer. Further, normalization is used to enable cross-layer comparison. As a result, the network structure is efficiently pruned while its performance is well preserved. Second, we propose a parallel pruning method that combines the method above with the slimming pruning method and achieves better results in terms of computational cost. Our methods perform better in terms of accuracy, parameters, and FLOPs than most advanced methods. On ImageNet, ResNet50 achieves 72.02% top-1 accuracy with merely 11.41 M parameters and 1.12 B FLOPs. For DenseNet40, we obtain 94.04% accuracy with only 0.38 M parameters and 110.72 M FLOPs on CIFAR10, and our parallel pruning method reduces the parameters and FLOPs to just 0.37 M and 100.12 M, respectively, with little loss of accuracy.
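For illustration, here is a minimal sketch of how per-filter importance could be scored from the Shannon entropy of each feature map and then min-max normalized for cross-layer comparison. The function names, the histogram binning, and the normalization choice are assumptions for this sketch, not the authors' exact implementation.

```python
import torch

def filter_entropy_scores(feature_maps: torch.Tensor, num_bins: int = 256) -> torch.Tensor:
    """Score each filter by the Shannon entropy of its feature map.

    feature_maps: activations of one layer, shape (N, C, H, W).
    Returns a tensor of C scores (higher = more informative filter).
    """
    n, c, h, w = feature_maps.shape
    scores = torch.zeros(c)
    for ch in range(c):
        values = feature_maps[:, ch].reshape(-1)
        hist = torch.histc(values, bins=num_bins)        # empirical distribution of activations
        probs = hist / hist.sum().clamp(min=1e-12)
        probs = probs[probs > 0]
        scores[ch] = -(probs * probs.log()).sum()        # Shannon entropy
    return scores

def normalize_per_layer(scores: torch.Tensor) -> torch.Tensor:
    """Min-max normalize scores so they can be compared across layers."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)
```

Filters whose normalized score falls below a global threshold would then be pruned across all layers at once.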

Author(s):  
Hang Li ◽  
Chen Ma ◽  
Wei Xu ◽  
Xue Liu

Building compact convolutional neural networks (CNNs) with reliable performance is a critical but challenging task, especially when deploying them in real-world applications. As a common approach to reducing the size of CNNs, pruning methods delete part of the CNN filters according to metrics such as the l1-norm. However, previous methods hardly leverage the information variance within a single feature map or the similarity characteristics among feature maps. In this paper, we propose a novel filter pruning method that incorporates two kinds of feature map selection: diversity-aware feature selection (DFS) and similarity-aware feature selection (SFS). DFS aims to discover features with low information diversity, while SFS removes features that have high similarities with others. We conduct extensive empirical experiments with various CNN architectures on publicly available datasets. The experimental results demonstrate that our model obtains up to a 91.6% parameter decrease and an 83.7% FLOPs reduction with almost no accuracy loss.
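As a rough sketch of the two selection criteria, the following uses per-channel variance as a stand-in for information diversity and pairwise cosine similarity of averaged feature maps for redundancy; the proxies, thresholds, and function names are assumptions, not the paper's exact definitions.

```python
import torch
import torch.nn.functional as F

def low_diversity_filters(feature_maps: torch.Tensor, keep_ratio: float = 0.9) -> torch.Tensor:
    """Return indices of filters whose feature maps show the least diversity.

    feature_maps: (N, C, H, W); diversity proxy = per-channel activation variance.
    """
    diversity = feature_maps.var(dim=(0, 2, 3))
    num_prune = int((1 - keep_ratio) * diversity.numel())
    return torch.argsort(diversity)[:num_prune]              # least diverse first

def highly_similar_filters(feature_maps: torch.Tensor, threshold: float = 0.95) -> torch.Tensor:
    """Flag filters whose averaged feature maps nearly duplicate an earlier filter."""
    flat = feature_maps.mean(dim=0).reshape(feature_maps.shape[1], -1)
    sim = F.cosine_similarity(flat.unsqueeze(1), flat.unsqueeze(0), dim=-1)  # (C, C)
    prune = []
    for i in range(sim.shape[0]):
        for j in range(i):
            if sim[i, j] > threshold and j not in prune:
                prune.append(i)                              # keep j, drop its near-duplicate i
                break
    return torch.tensor(prune, dtype=torch.long)
```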


2020 ◽  
Vol 39 (5) ◽  
pp. 7403-7410
Author(s):  
Yangke Huang ◽  
Zhiming Wang

Network pruning has been widely used to reduce the high computational cost of deep convolutional neural networks (CNNs). The dominant pruning method, channel pruning, removes filters in layers based on their importance or sparsity training. However, these methods often give a limited acceleration ratio and encounter difficulties when pruning CNNs with skip connections. Block pruning methods take a sequence of consecutive layers (e.g., Conv-BN-ReLU) as a block and remove an entire block each time. However, previous methods usually introduce new parameters to help pruning, which leads to additional parameters and extra computation. This work proposes a novel multi-granularity pruning approach that combines block pruning with channel pruning (BPCP). The block pruning (BP) module removes blocks by directly searching for redundant blocks with gradient descent and leaves no extra parameters in the final model, which is friendly to hardware optimization. The channel pruning (CP) module removes redundant channels based on importance criteria and handles CNNs with skip connections properly, which further improves the overall compression ratio. As a result, on CIFAR10, BPCP reduces the number of parameters and MACs of a ResNet56 model by up to 78.9% and 80.3%, respectively, with a <3% accuracy drop. In terms of speed, it gives a 3.17× acceleration ratio. Our code has been made available at https://github.com/Pokemon-Huang/BPCP.
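One common way to realize gradient-based block search is to attach a learnable scalar gate to each residual block and drop blocks whose gate collapses toward zero; the gate is then discarded, so the pruned model carries no extra parameters. The sketch below shows that generic pattern under assumed names and a hypothetical threshold, not BPCP's exact search procedure.

```python
import torch
import torch.nn as nn

class GatedBlock(nn.Module):
    """Residual block wrapped with a scalar gate learned by gradient descent."""

    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block
        self.gate = nn.Parameter(torch.ones(1))    # trained jointly with the network

    def forward(self, x):
        return x + self.gate * self.block(x)        # skip connection kept intact

def prune_blocks(gated_blocks, threshold: float = 0.05):
    """Keep only the inner blocks whose gate survived training; gates are dropped."""
    return [g.block for g in gated_blocks if g.gate.abs().item() > threshold]
```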


Author(s):  
Min Xia ◽  
Wenzhu Song ◽  
Xudong Sun ◽  
Jia Liu ◽  
Tao Ye ◽  
...  

A weighted densely connected convolutional network (W-DenseNet) is proposed for reinforcement learning in this work. W-DenseNet maximizes the information flow between all layers in the network through cross-layer connections, which reduces gradient vanishing and degradation and greatly improves the speed of training convergence. With the weight coefficients introduced in W-DenseNet, the current layer receives all the previous layers' feature maps with different initial weights, so feature information from different layers can be extracted more effectively according to the task. According to the weights adjusted by learning, cross-layer connections with smaller weights are pruned, so as to reduce the number of cross-layer connections. In this work, the GridWorld and FlappyBird games are used for simulation. The simulation results of deep reinforcement learning based on W-DenseNet are compared with the traditional deep reinforcement learning algorithm and a reinforcement learning algorithm based on DenseNet. The simulation results show that the proposed W-DenseNet method converges better, reduces training time, and obtains more stable results.
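A minimal sketch of the weighted cross-layer connection idea is given below: each layer aggregates all earlier feature maps scaled by learnable weights, and connections whose learned weight falls below a threshold are skipped. The aggregation by weighted sum, the layer shapes, and the class name are assumptions of this sketch, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class WeightedDenseBlock(nn.Module):
    """Dense block where every cross-layer connection carries a learnable weight."""

    def __init__(self, num_layers: int, channels: int):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(num_layers)
        )
        # w[i][j]: weight of the connection from layer j's output to layer i
        self.w = nn.ParameterList(
            nn.Parameter(torch.ones(i + 1)) for i in range(num_layers)
        )

    def forward(self, x, weight_threshold: float = 0.0):
        outputs = [x]
        for i, layer in enumerate(self.layers):
            # weighted aggregation of all previous feature maps; connections whose
            # learned weight has shrunk to the threshold are treated as pruned
            agg = sum(
                self.w[i][j] * outputs[j]
                for j in range(len(outputs))
                if self.w[i][j].abs() > weight_threshold
            )
            outputs.append(torch.relu(layer(agg)))
        return outputs[-1]
```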


2019 ◽  
Vol 118 (8) ◽  
pp. 347-355
Author(s):  
Hye-Rim Park ◽  
Yen-Yoo You

Unlike non-profit organizations, social enterprises must be sustainable through profit-making activities in order to pursue social purposes. However, among their scarce resources, human resources are the most important, and for the efficient use of human resources, empowerment should be given to members. This study tests whether job engagement mediates the effect of psychological empowerment on sustainability when such empowerment is given to employees in social enterprises.


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 348
Author(s):  
Choongsang Cho ◽  
Young Han Lee ◽  
Jongyoul Park ◽  
Sangkeun Lee

Semantic image segmentation has a wide range of applications. In medical image segmentation, accuracy is even more important than in other areas because the results provide information directly applicable to disease diagnosis, surgical planning, and history monitoring. The state-of-the-art models in medical image segmentation are variants of the encoder-decoder architecture known as U-Net. To effectively reflect the spatial features of the feature maps in the encoder-decoder architecture, we propose a spatially adaptive weighting scheme for medical image segmentation. Specifically, the spatial feature is estimated from the feature maps, and the learned weighting parameters are obtained from the computed map, since segmentation results are predicted from the feature map through a convolutional layer. In the proposed networks, the convolutional block for extracting the feature map is replaced with widely used convolutional frameworks: VGG, ResNet, and bottleneck ResNet structures. In addition, a bilinear up-sampling method replaces the up-convolutional layer to increase the resolution of the feature map. For the performance evaluation of the proposed architecture, we used three data sets covering different medical imaging modalities. Experimental results show that the network with the proposed self-spatially adaptive weighting block based on the ResNet framework gave the highest IoU and DICE scores in the three tasks compared to other methods. In particular, the segmentation network combining the proposed self-spatially adaptive block and the ResNet framework recorded the highest improvements of 3.01% and 2.89% in IoU and DICE scores, respectively, on the Nerve data set. Therefore, we believe that the proposed scheme can be a useful tool for image segmentation tasks based on the encoder-decoder architecture.
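As a rough sketch of a spatially adaptive weighting block, the module below estimates a per-pixel weight map from the decoder features and multiplies it back onto the feature map before the prediction convolution. The specific layers, the sigmoid gating, and the class name are assumptions of this sketch; the paper's block may differ.

```python
import torch
import torch.nn as nn

class SpatialWeightingBlock(nn.Module):
    """Re-weight decoder features with a learned per-pixel weight map."""

    def __init__(self, channels: int):
        super().__init__()
        self.weight_head = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),                              # per-pixel weights in (0, 1)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        spatial_weights = self.weight_head(features)   # (N, 1, H, W)
        return features * spatial_weights              # spatially re-weighted features
```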


2021 ◽  
Author(s):  
Jiří Mihola

The monograph develops the theory of production functions and their systematic typology. It views the relationship between inputs and outputs as a universal relationship that is used not only in economics but also in other disciplines. In addition to the static production function, special attention is paid to the dynamization of individual quantities and to expressing the effect of changes in these quantities on the change in production. It is explained why the aggregate production function, expressed through the aggregate factor input and aggregate factor productivity, must use a multiplicative relationship, why the multiplicative link is also suitable for the total input factor, and why the weights of labor and capital should be the same. The use of the production function is demonstrated on the development of the economies of the USA, China, and India, on the ten largest economies of the world in terms of absolute GDP, on cryptocurrencies, and on the so-called farming role. In addition to a comprehensive overview of production functions, the monograph also contributes new ideas that arose during long-term computational and analytical work in economics and business. Particularly innovative is the generalization of the production function to any system with variable inputs and outputs; the production function can thus be recognized in many identities. The original intention of the research was to examine the intensity of economic development, but it turned out that this is closely related to production functions. The impetus for this research comes from Prof. Ing. František Brabec, DrSc., a brilliant mathematician, designer, economist, and manager, former general director of Škoda in Pilsen and later rector of ČVUT. The presented typology of production functions is not limited to one area of economics but goes beyond it. The monograph respects the definition of the static production function as the maximum amount of production that can be produced with a given quantity of production factors. On this function, which can be effectively displayed using polynomial functions of different orders, significant points can be systematically defined, i.e., the inflection point, the point of maximum efficiency, the point of maximum profit, and the point of maximum production. The purpose is to optimize the quantity of production factors employed. The text prefers the point with the greatest efficiency. If this quantity does not correspond, for example, to demand, it is possible to choose another technology, which is reflected in a shift of the static production function. At the same time, the significant points of these functions describe a trajectory, which has the nature of a dynamic production function. For a dynamic production function, the crucial question is how the change in individual factors contributes to the overall change in output. If the production function is expressed through inputs and their efficiency, dynamic parameters of extensity and intensity can be defined, which exactly express the effect of changes in inputs and the effect of changes in efficiency on changes in outputs for all possible situations. Special attention is paid to the aggregate production function.
It explains why it should be expressed as the product of the aggregate input factor (TIF) and aggregate factor productivity (TFP), and why the TIF term should be expressed as a weighted product of labor and capital, in which the weights of labor and capital could be identical. The monograph here goes beyond the traditional additive view of the multi-factor production function by proposing a multiplicative link, which also allows the derivation of growth accounting, but with a new interpretation of the weights α and (1 − α), which do not need to be calculated for each subject and each year. The time production function is used to forecast the GDP development of the US, Chinese, and Indian economies until 2030 and 2050, respectively. An increase in the absolute GDP of Indonesia, a stable position for Russia, and the loss of the elite positions of Japan and Germany are also predicted. The monograph also addresses the hitherto unresolved question of whether, in certain circumstances, economics must take into account the phenomenon called quantization in physics. It turns out that quantization is a common phenomenon in economics, which is documented on specific forms of production functions that respect quantization. The monograph also deals with the relationship between the efficiency of an individual actor, given the use of a certain point on a specific static production function, and the common efficiency of all actors together. These examples assume limited resources. The sum of the outputs of all actors depends on how the actors share these limited resources. It can be expected that there will be at least one method of distribution that brings the highest sum of outputs (products, crops) of all actors. This result, however, also depends on the shape of the production functions. This is investigated using EDM, i.e., elementary distribution models. EDMs for polynomial production functions of the 2nd to 5th order have not yet been published in summary form; among the new findings, they are the most interesting. When using two polynomial production functions, the EDM boundary becomes linear if the inflection point is used for both production functions. If we are above the inflection point, the EDM is properly concave. It turned out that the "bending" of the production function in the region of the inflection point can be modeled using a quantity of the order of the respective polynomial: the higher the order of the polynomial, the greater the deflection that can be achieved. This proved to be a very important finding in modeling specific production functions. This effect cannot be achieved by combining other parameters.
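A minimal formal sketch of the multiplicative relationship described above, using standard growth-accounting notation (the symbols Y, L, K and the weight α are assumed here, not quoted from the monograph):

```latex
% Output as the product of the aggregate input factor and aggregate factor productivity
Y = \mathrm{TIF}\cdot\mathrm{TFP}

% TIF as a weighted (geometric) product of labour L and capital K; with identical
% weights, \alpha = 1/2 and the aggregate input factor reduces to a geometric mean
\mathrm{TIF} = L^{\alpha} K^{\,1-\alpha},
\qquad \alpha = \tfrac{1}{2} \;\Rightarrow\; \mathrm{TIF} = \sqrt{L\,K}

% Dynamized form: the growth index of output decomposes multiplicatively into an
% extensive (input) term and an intensive (productivity) term
\frac{Y_t}{Y_{t-1}}
  = \frac{\mathrm{TIF}_t}{\mathrm{TIF}_{t-1}}
    \cdot
    \frac{\mathrm{TFP}_t}{\mathrm{TFP}_{t-1}}
```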


Author(s):  
Yanteng Zhang ◽  
Qizhi Teng ◽  
Linbo Qing ◽  
Yan Liu ◽  
Xiaohai He

Alzheimer’s disease (AD) is a degenerative brain disease and the most common cause of dementia. In recent years, with the widespread application of artificial intelligence in the medical field, various deep learning-based methods have been applied to AD detection using sMRI images. Many of these networks achieve AD vs. HC (Healthy Control) classification accuracy of up to 90%, but with a large number of parameters and floating point operations (FLOPs). In this paper, we adopt a novel ghost module, which uses a series of cheap linear transformations to generate more feature maps, embedded into our designed ResNet architecture for the task of AD vs. HC classification. According to experiments on the OASIS dataset, our lightweight network achieves an optimistic accuracy of 97.92%, and its total number of parameters is dozens of times smaller than those of state-of-the-art deep learning networks. Our proposed AD classification network achieves better performance while significantly reducing the computational cost.
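For reference, a simplified sketch of the ghost-module idea follows: an ordinary convolution produces half of the output channels, a cheap depthwise transformation generates the remaining "ghost" channels from them, and the two halves are concatenated. The ratio of 2, kernel sizes, and class name are assumptions of this sketch, not the exact configuration embedded in the authors' ResNet.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Produce feature maps cheaply: primary conv + depthwise 'linear' ghost features."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        primary = out_channels // 2
        self.primary_conv = nn.Sequential(
            nn.Conv2d(in_channels, primary, kernel_size=1, bias=False),
            nn.BatchNorm2d(primary),
            nn.ReLU(inplace=True),
        )
        self.cheap_op = nn.Sequential(                 # cheap depthwise transformation
            nn.Conv2d(primary, primary, kernel_size=3, padding=1,
                      groups=primary, bias=False),
            nn.BatchNorm2d(primary),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        primary = self.primary_conv(x)
        ghost = self.cheap_op(primary)                 # ghost feature maps
        return torch.cat([primary, ghost], dim=1)
```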


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2547 ◽  
Author(s):  
Wenxin Dai ◽  
Yuqing Mao ◽  
Rongao Yuan ◽  
Yijing Liu ◽  
Xuemei Pu ◽  
...  

Convolutional neural network (CNN)-based detectors have shown great performance on ship detection in synthetic aperture radar (SAR) images. However, the performance of current models has not been satisfactory enough for detecting multiscale ships and small-size ones in front of complex backgrounds. To address the problem, we propose a novel CNN-based SAR ship detector, which consists of three subnetworks: the Fusion Feature Extractor Network (FFEN), the Region Proposal Network (RPN), and the Refine Detection Network (RDN). Instead of using a single feature map, we fuse feature maps in bottom-up and top-down ways and generate proposals from each fused feature map in FFEN. Furthermore, we merge the features generated by the region-of-interest (RoI) pooling layer in RDN. Based on this feature representation strategy, the constructed CNN framework can significantly enhance the location and semantic information for multiscale ships, in particular for small ships. On the other hand, residual blocks are introduced to increase the network depth, through which the detection precision can be further improved. The public SAR ship dataset (SSDD) and China Gaofen-3 satellite SAR images are used to validate the proposed method. Our method shows excellent performance for detecting multiscale and small-size ships compared with some competitive models and exhibits high potential in practical applications.
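A generic sketch of top-down feature fusion over multi-scale backbone maps (FPN-style lateral 1x1 connections plus upsampled coarse features) is shown below; the class name, channel counts, and fusion by addition are assumptions of this sketch, not the exact FFEN design.

```python
from typing import Sequence

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusion(nn.Module):
    """Fuse multi-scale feature maps with lateral 1x1 convs and a top-down pass."""

    def __init__(self, in_channels: Sequence[int], out_channels: int = 256):
        super().__init__()
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )

    def forward(self, features):
        # features: backbone maps ordered from high resolution to low resolution
        laterals = [conv(f) for conv, f in zip(self.lateral, features)]
        fused = [laterals[-1]]                          # start from the coarsest map
        for lat in reversed(laterals[:-1]):
            up = F.interpolate(fused[-1], size=lat.shape[-2:], mode="nearest")
            fused.append(lat + up)                      # top-down fusion with lateral features
        return list(reversed(fused))                    # back to fine-to-coarse order
```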


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Bangtong Huang ◽  
Hongquan Zhang ◽  
Zihong Chen ◽  
Lingling Li ◽  
Lihua Shi

Deep learning algorithms face limitations in virtual reality applications due to memory cost, computation cost, and real-time computation requirements. Models with strong performance may suffer from enormous parameter counts and large-scale structures, and it would be hard to port them onto embedded devices. In this paper, inspired by GhostNet, we propose an efficient structure, ShuffleGhost, to make use of the redundancy in feature maps to alleviate the cost of computation, as well as to tackle some drawbacks of GhostNet. GhostNet suffers from the high computational cost of the convolutions in the Ghost module and the shortcut, and its downsampling restrictions make it difficult to apply the Ghost module and Ghost bottleneck to other backbones. This paper proposes three new kinds of ShuffleGhost structures to tackle these drawbacks. The ShuffleGhost module and ShuffleGhost bottlenecks use the shuffle layer and group convolution from ShuffleNet, and they are designed to redistribute the feature maps concatenated from the Ghost feature maps and the primary feature maps, eliminate the gap between them, and extract features. Then, an SENet layer is adopted to reduce the computational cost of the group convolution, as well as to evaluate the importance of the feature maps concatenated from the Ghost feature maps and primary feature maps and to assign proper weights to them. This paper conducts experiments and shows that ShuffleGhostV3 has fewer trainable parameters and FLOPs while preserving accuracy. With proper design, it can be more efficient on both the GPU and CPU sides.
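The shuffle layer referred to above is the ShuffleNet-style channel shuffle, which interleaves channels across groups so that maps concatenated from different sources get mixed before the next group convolution. Below is a generic sketch of that operation (not the exact ShuffleGhost module); the usage comment with primary and ghost maps is a hypothetical example.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups (ShuffleNet-style channel shuffle)."""
    n, c, h, w = x.shape
    assert c % groups == 0
    # (N, groups, C/groups, H, W) -> swap group and channel dims -> flatten back
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.reshape(n, c, h, w)

# Usage sketch: mix primary and ghost feature maps before a group convolution
# primary, ghost: tensors of shape (N, C, H, W) each
# mixed = channel_shuffle(torch.cat([primary, ghost], dim=1), groups=2)
```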


2018 ◽  
Vol 23 (4) ◽  
pp. 426-447 ◽  
Author(s):  
Megan Wainwright

Technologies for medicinal oxygen delivery at home are increasingly part of the global health technology landscape in the face of rising rates of chronic lung and heart diseases. From the mere notion of harvesting and privatizing oxygen from the atmosphere to its status as both dangerous and therapeutic, and finally to its capacity to both extend and limit life, oxygen as therapy materializes its status as an ambivalent object in global health. This analysis of ethnographic material from Uruguay and South Africa on the experience of home oxygen therapy is guided by philosopher Don Ihde’s postphenomenology – a pragmatic philosophical approach for analysing the relationships between humans and technologies. Participants related to their oxygen devices as limiting-enablers, as markers of illness and measures of recovery, and as precious and limited resources. Oxygen was materialized in many forms, each with their own characteristics shaping the ‘amplification/reduction’ character of the relationship as well as the degree to which the devices became ‘transparent’ to their users. Ihde’s four types of human–technology relations – embodiment, hermeneutic, alterity and background relations – are at play in the multistability of oxygen. Importantly, the lack of technological ‘transparency’, in Ihde’s sense of the term, reflects not only the materiality of oxygen but inequality too. While postphenomenology adds a productive material and technological flavour to phenomenology, the author argues that a critical postphenomenology is needed to engage with the political-economy of human–oxygen technology relations.

