Study on a visual coder acceleration algorithm for image classification applying dynamic scaling training techniques
Abstract Image classification is now widely used in autonomous vehicles, where Convolutional Neural Networks (CNNs) have been applied extensively. To compress the network size and improve model accuracy, Vision Transformer (ViT) networks are used in place of deep convolutional networks. Because ViT needs a large training dataset to reach sufficient accuracy, this paper uses a variant of ViT, Data-Efficient Image Transformers (DEIT). In addition, to greatly reduce memory use and computation time in practical deployment, the network is flexibly scaled in size and training speed through both adaptive width and adaptive depth. We introduce DEIT, the width-adaptive technique, and the depth-adaptive technique, and combine them for image classification. Experiments on the CIFAR-100 dataset demonstrate the superiority of the algorithm in image classification scenarios.
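As a rough illustration of what scaling a transformer by width and depth can mean, the sketch below selects which attention heads and which layers remain active for a given pair of multipliers. The function names, the importance scores, and both selection rules (keep the highest-scoring heads, keep evenly spaced layers) are assumptions made for this example and are not taken from the paper.

```python
from typing import List


def kept_heads(importance: List[float], width_mult: float) -> List[int]:
    """Return indices of attention heads kept at a given width multiplier.

    Hypothetical rule: keep the heads with the highest importance scores.
    """
    n_keep = max(1, round(len(importance) * width_mult))
    # Rank head indices from most to least important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:n_keep])


def kept_layers(num_layers: int, depth_mult: float) -> List[int]:
    """Return indices of layers kept at a given depth multiplier.

    Hypothetical rule: keep layers spaced evenly across the stack.
    """
    n_keep = max(1, round(num_layers * depth_mult))
    return [(i * num_layers) // n_keep for i in range(n_keep)]


if __name__ == "__main__":
    # A 12-layer network at half depth, and 4 heads at half width.
    print(kept_layers(12, 0.5))
    print(kept_heads([0.1, 0.9, 0.5, 0.3], 0.5))
```

With both multipliers set to 1.0 the full network is retained; smaller values trade accuracy for lower memory use and computation time.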