Pruning by leveraging training dynamics

AI Communications ◽

10.3233/aic-210127 ◽

2021 ◽

pp. 1-21

Author(s):

Andrei C. Apostol ◽

Maarten C. Stol ◽

Patrick Forré

Keyword(s):

State Of The Art ◽

Object Classification ◽

Compression Technique ◽

Pruning Method ◽

Model Compression ◽

Art Performance

We propose a novel pruning method which uses the oscillations around 0, i.e. sign flips, that a weight has undergone during training in order to determine its saliency. Our method can perform pruning before the network has converged, requires little tuning effort due to having good default values for its hyperparameters, and can directly target the level of sparsity desired by the user. Our experiments, performed on a variety of object classification architectures, show that it is competitive with existing methods and achieves state-of-the-art performance for levels of sparsity of 99.6 % and above for 2 out of 3 of the architectures tested. Moreover, we demonstrate that our method is compatible with quantization, another model compression technique. For reproducibility, we release our code at https://github.com/AndreiXYZ/flipout.

Download Full-text

PyConvU-Net: a lightweight and multiscale network for biomedical image segmentation

BMC Bioinformatics ◽

10.1186/s12859-020-03943-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Changyong Li ◽

Yongxian Fan ◽

Xiaodong Cai

Keyword(s):

Image Segmentation ◽

Deep Learning ◽

State Of The Art ◽

Experimental Results ◽

Actual Situation ◽

Controlled Experiments ◽

Biomedical Image ◽

Segmentation Methods ◽

Art Performance

Abstract Background With the development of deep learning (DL), more and more methods based on deep learning are proposed and achieve state-of-the-art performance in biomedical image segmentation. However, these methods are usually complex and require the support of powerful computing resources. According to the actual situation, it is impractical that we use huge computing resources in clinical situations. Thus, it is significant to develop accurate DL based biomedical image segmentation methods which depend on resources-constraint computing. Results A lightweight and multiscale network called PyConvU-Net is proposed to potentially work with low-resources computing. Through strictly controlled experiments, PyConvU-Net predictions have a good performance on three biomedical image segmentation tasks with the fewest parameters. Conclusions Our experimental results preliminarily demonstrate the potential of proposed PyConvU-Net in biomedical image segmentation with resources-constraint computing.

Download Full-text

Contour Detection for Fibre of Preserved Szechuan Pickle Based on Dilated Convolution

Applied Sciences ◽

10.3390/app9132684 ◽

2019 ◽

Vol 9 (13) ◽

pp. 2684 ◽

Cited By ~ 1

Author(s):

Hongyang Li ◽

Lizhuang Liu ◽

Zhenqi Han ◽

Dan Zhao

Keyword(s):

Edge Detection ◽

State Of The Art ◽

Contour Detection ◽

Experimental Results ◽

Contour Method ◽

Class Differences ◽

Dilated Convolution ◽

The Mean ◽

Art Performance

Peeling fibre is an indispensable process in the production of preserved Szechuan pickle, the accuracy of which can significantly influence the quality of the products, and thus the contour method of fibre detection, as a core algorithm of the automatic peeling device, is studied. The fibre contour is a kind of non-salient contour, characterized by big intra-class differences and small inter-class differences, meaning that the feature of the contour is not discriminative. The method called dilated-holistically-nested edge detection (Dilated-HED) is proposed to detect the fibre contour, which is built based on the HED network and dilated convolution. The experimental results for our dataset show that the Pixel Accuracy (PA) is 99.52% and the Mean Intersection over Union (MIoU) is 49.99%, achieving state-of-the-art performance.

Download Full-text

Speedup of deep learning ensembles for semantic segmentation using a model compression technique

Computer Vision and Image Understanding ◽

10.1016/j.cviu.2017.05.004 ◽

2017 ◽

Vol 164 ◽

pp. 16-26 ◽

Cited By ~ 8

Author(s):

Andrew Holliday ◽

Mohammadamin Barekatain ◽

Johannes Laurmaa ◽

Chetak Kandaswamy ◽

Helmut Prendinger

Keyword(s):

Deep Learning ◽

Semantic Segmentation ◽

Compression Technique ◽

Model Compression ◽

Learning Ensembles

Download Full-text

A Graph-based Evolutionary Algorithm for Automated Machine Learning

10.37686/ser.v1i2.77 ◽

2020 ◽

Author(s):

Fei Qi ◽

Zhaohui Xia ◽

Gaoyang Tang ◽

Hang Yang ◽

Yu Song ◽

...

Keyword(s):

Machine Learning ◽

Evolutionary Algorithm ◽

Parameter Optimization ◽

State Of The Art ◽

The State ◽

Complex Structures ◽

Architecture Evolution ◽

Automated Machine Learning ◽

Art Performance

As an emerging field, Automated Machine Learning (AutoML) aims to reduce or eliminate manual operations that require expertise in machine learning. In this paper, a graph-based architecture is employed to represent flexible combinations of ML models, which provides a large searching space compared to tree-based and stacking-based architectures. Based on this, an evolutionary algorithm is proposed to search for the best architecture, where the mutation and heredity operators are the key for architecture evolution. With Bayesian hyper-parameter optimization, the proposed approach can automate the workflow of machine learning. On the PMLB dataset, the proposed approach shows the state-of-the-art performance compared with TPOT, Autostacker, and auto-sklearn. Some of the optimized models are with complex structures which are difficult to obtain in manual design.

Download Full-text

Optimization Learning: Perspective, Method, and Applications

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/728 ◽

2020 ◽

Author(s):

Risheng Liu

Keyword(s):

Inverse Problems ◽

Theoretical Analysis ◽

Iterative Methods ◽

State Of The Art ◽

Learning Paradigm ◽

Rigorous Analysis ◽

The Core ◽

Globally Convergent ◽

Ill Posed ◽

Art Performance

Numerous tasks at the core of statistics, learning, and vision areas are specific cases of ill-posed inverse problems. Recently, learning-based (e.g., deep) iterative methods have been empirically shown to be useful for these problems. Nevertheless, integrating learnable structures into iterations is still a laborious process, which can only be guided by intuitions or empirical insights. Moreover, there is a lack of rigorous analysis of the convergence behaviors of these reimplemented iterations, and thus the significance of such methods is a little bit vague. We move beyond these limits and propose a theoretically guaranteed optimization learning paradigm, a generic and provable paradigm for nonconvex inverse problems, and develop a series of convergent deep models. Our theoretical analysis reveals that the proposed optimization learning paradigm allows us to generate globally convergent trajectories for learning-based iterative methods. Thanks to the superiority of our framework, we achieve state-of-the-art performance on different real applications.

Download Full-text

Specializing Word Embeddings (for Parsing) by Information Bottleneck (Extended Abstract)

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/658 ◽

2020 ◽

Author(s):

Xiang Lisa Li ◽

Jason Eisner

Keyword(s):

Dimensionality Reduction ◽

Semantic Information ◽

State Of The Art ◽

Word Embedding ◽

Discrete Version ◽

Word Embeddings ◽

Continuous Version ◽

Continuous Vector ◽

Information Bottleneck ◽

Art Performance

Pre-trained word embeddings like ELMo and BERT contain rich syntactic and semantic information, resulting in state-of-the-art performance on various tasks. We propose a very fast variational information bottleneck (VIB) method to nonlinearly compress these embeddings, keeping only the information that helps a discriminative parser. We compress each word embedding to either a discrete tag or a continuous vector. In the discrete version, our automatically compressed tags form an alternative tag set: we show experimentally that our tags capture most of the information in traditional POS tag annotations, but our tag sequences can be parsed more accurately at the same level of tag granularity. In the continuous version, we show experimentally that moderately compressing the word embeddings by our method yields a more accurate parser in 8 of 9 languages, unlike simple dimensionality reduction.

Download Full-text

Disentangled Feature Learning Network for Vehicle Re-Identification

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/66 ◽

2020 ◽

Author(s):

Yan Bai ◽

Yihang Lou ◽

Yongxing Dai ◽

Jun Liu ◽

Ziqian Chen ◽

...

Keyword(s):

State Of The Art ◽

Feature Learning ◽

Feature Representation ◽

Public Security ◽

The Public ◽

Common Features ◽

Learning Network ◽

Single Feature ◽

Art Performance

Vehicle Re-Identification (ReID) has attracted lots of research efforts due to its great significance to the public security. In vehicle ReID, we aim to learn features that are powerful in discriminating subtle differences between vehicles which are visually similar, and also robust against different orientations of the same vehicle. However, these two characteristics are hard to be encapsulated into a single feature representation simultaneously with unified supervision. Here we propose a Disentangled Feature Learning Network (DFLNet) to learn orientation specific and common features concurrently, which are discriminative at details and invariant to orientations, respectively. Moreover, to effectively use these two types of features for ReID, we further design a feature metric alignment scheme to ensure the consistency of the metric scales. The experiments show the effectiveness of our method that achieves state-of-the-art performance on three challenging datasets.

Download Full-text

Lossless text compression using GPT-2 language model and Huffman coding

SHS Web of Conferences ◽

10.1051/shsconf/202110204013 ◽

2021 ◽

Vol 102 ◽

pp. 04013

Author(s):

Md. Atiqur Rahman ◽

Mohamed Hamada

Keyword(s):

Data Compression ◽

State Of The Art ◽

Language Model ◽

Huffman Coding ◽

Original Text ◽

Text Compression ◽

Compression Technique ◽

Daily Life Activities ◽

Burrows Wheeler Transform ◽

Compressed Data

Modern daily life activities produced lots of information for the advancement of telecommunication. It is a challenging issue to store them on a digital device or transmit it over the Internet, leading to the necessity for data compression. Thus, research on data compression to solve the issue has become a topic of great interest to researchers. Moreover, the size of compressed data is generally smaller than its original. As a result, data compression saves storage and increases transmission speed. In this article, we propose a text compression technique using GPT-2 language model and Huffman coding. In this proposed method, Burrows-Wheeler transform and a list of keys are used to reduce the original text file’s length. Finally, we apply GPT-2 language mode and then Huffman coding for encoding. This proposed method is compared with the state-of-the-art techniques used for text compression. Finally, we show that the proposed method demonstrates a gain in compression ratio compared to the other state-of-the-art methods.

Download Full-text

An Experimental Analysis of Model Compression Techniques for Object Detection

10.5753/kdmile.2020.11958 ◽

2020 ◽

Author(s):

Andrey De Aguiar Salvi ◽

Rodrigo Coelho Barros

Keyword(s):

Object Detection ◽

Experimental Analysis ◽

State Of The Art ◽

Neural Architecture ◽

Model Compression ◽

Processing Power ◽

Benchmark Datasets ◽

The Difference ◽

And Performance ◽

Consumption Constraints

Recent research on Convolutional Neural Networks focuses on how to create models with a reduced number of parameters and a smaller storage size while keeping the model’s ability to perform its task, allowing the use of the best CNN for automating tasks in limited devices, with reduced processing power, memory, or energy consumption constraints. There are many different approaches in the literature: removing parameters, reduction of the floating-point precision, creating smaller models that mimic larger models, neural architecture search (NAS), etc. With all those possibilities, it is challenging to say which approach provides a better trade-off between model reduction and performance, due to the difference between the approaches, their respective models, the benchmark datasets, or variations in training details. Therefore, this article contributes to the literature by comparing three state-of-the-art model compression approaches to reduce a well-known convolutional approach for object detection, namely YOLOv3. Our experimental analysis shows that it is possible to create a reduced version of YOLOv3 with 90% fewer parameters and still outperform the original model by pruning parameters. We also create models that require only 0.43% of the original model’s inference effort.

Download Full-text

A new correlation-based approach for ensemble selection in random forests

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-10-2020-0147 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Mostafa El Habib Daho ◽

Nesma Settouti ◽

Mohammed El Amine Bechar ◽

Amina Boublenza ◽

Mohammed Amine Chikh

Keyword(s):

State Of The Art ◽

Ensemble Methods ◽

The State ◽

Ensemble Classifiers ◽

Content Type ◽

Pruning Method ◽

Ensemble Selection ◽

Small Ensemble ◽

Short Time ◽

Pruning Techniques

PurposeEnsemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of hypotheses and that contain redundant classifiers in most cases. Several works proposed in the state of the art attempt to reduce all hypotheses without affecting performance.Design/methodology/approachIn this work, the authors are proposing a pruning method that takes into consideration the correlation between classifiers/classes and each classifier with the rest of the set. The authors have used the random forest algorithm as trees-based ensemble classifiers and the pruning was made by a technique inspired by the CFS (correlation feature selection) algorithm.FindingsThe proposed method CES (correlation-based Ensemble Selection) was evaluated on ten datasets from the UCI machine learning repository, and the performances were compared to six ensemble pruning techniques. The results showed that our proposed pruning method selects a small ensemble in a smaller amount of time while improving classification rates compared to the state-of-the-art methods.Originality/valueCES is a new ordering-based method that uses the CFS algorithm. CES selects, in a short time, a small sub-ensemble that outperforms results obtained from the whole forest and the other state-of-the-art techniques used in this study.

Download Full-text