Memory Optimization Techniques in Neural Networks: A Review

Deep neural networks have been continuously evolving towards larger and more complex models to solve challenging problems in the field of AI. The primary bottleneck that restricts new network architectures is memory consumption. Running or training DNNs heavily relies on the hardware (CPUs, GPUs, or FPGA) which are either inadequate in terms of memory or hard-to-extend. This would further make it difficult to scale. In this paper, we review some of the latest memory footprint reduction techniques which would enable faster low model complexity. Additionally, it improves accuracy by increasing the batch size and developing wider and deeper neural networks with the same set of hardware resources. The paper emphasizes on memory optimization methods specific to CNN and RNN training.

Download Full-text

Deploying and scaling distributed parallel deep neural networks on the Tianhe-3 prototype system

Scientific Reports ◽

10.1038/s41598-021-98794-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Jia Wei ◽

Xingjun Zhang ◽

Zeyu Ji ◽

Jingbo Li ◽

Zheng Wei

Keyword(s):

Neural Networks ◽

Large Scale ◽

Model Complexity ◽

Prototype System ◽

Communication Optimization ◽

Optimization Strategy ◽

Computing Power ◽

Synchronization Process ◽

Complex Models ◽

Model Training

AbstractDue to the increase in computing power, it is possible to improve the feature extraction and data fitting capabilities of DNN networks by increasing their depth and model complexity. However, the big data and complex models greatly increase the training overhead of DNN, so accelerating their training process becomes a key task. The Tianhe-3 peak speed is designed to target E-class, and the huge computing power provides a potential opportunity for DNN training. We implement and extend LeNet, AlexNet, VGG, and ResNet model training for a single MT-2000+ and FT-2000+ compute nodes, as well as extended multi-node clusters, and propose an improved gradient synchronization process for Dynamic Allreduce communication optimization strategy for the gradient synchronization process base on the ARM architecture features of the Tianhe-3 prototype, providing experimental data and theoretical basis for further enhancing and improving the performance of the Tianhe-3 prototype in large-scale distributed training of neural networks.

Download Full-text

Binarized Neural Architecture Search

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6624 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10526-10533 ◽

Cited By ~ 1

Author(s):

Hanlin Chen ◽

Li'an Zhuo ◽

Baochang Zhang ◽

Xiawu Zheng ◽

Jianzhuang Liu ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

State Of The Art ◽

Optimization Methods ◽

Search Space ◽

Network Architectures ◽

Neural Architecture ◽

Space Reduction ◽

The Cost ◽

A Performance

Neural architecture search (NAS) can have a significant impact in computer vision by automatically designing optimal neural network architectures for various tasks. A variant, binarized neural architecture search (BNAS), with a search space of binarized convolutions, can produce extremely compressed models. Unfortunately, this area remains largely unexplored. BNAS is more challenging than NAS due to the learning inefficiency caused by optimization requirements and the huge architecture space. To address these issues, we introduce channel sampling and operation space reduction into a differentiable NAS to significantly reduce the cost of searching. This is accomplished through a performance-based strategy used to abandon less potential operations. Two optimization methods for binarized neural networks are used to validate the effectiveness of our BNAS. Extensive experiments demonstrate that the proposed BNAS achieves a performance comparable to NAS on both CIFAR and ImageNet databases. An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model, and a 40% faster search than the state-of-the-art PC-DARTS.

Download Full-text

Hydrological modelling using artificial neural networks

Progress in Physical Geography Earth and Environment ◽

10.1177/030913330102500104 ◽

2001 ◽

Vol 25 (1) ◽

pp. 80-108 ◽

Cited By ~ 438

Author(s):

C. W. Dawson ◽

R. L. Wilby

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Model Performance ◽

Model Complexity ◽

Network Architectures ◽

Training Algorithms ◽

Rainfall Runoff ◽

Ann Model ◽

Validation Data ◽

Artificial Neural

This review considers the application of artificial neural networks (ANNs) to rainfall-runoff modelling and flood forecasting. This is an emerging field of research, characterized by a wide variety of techniques, a diversity of geographical contexts, a general absence of intermodel comparisons, and inconsistent reporting of model skill. This article begins by outlining the basic principles of ANN modelling, common network architectures and training algorithms. The discussion then addresses related themes of the division and preprocessing of data for model calibration/validation; data standardization techniques; and methods of evaluating ANN model performance. A literature survey underlines the need for clear guidance in current modelling practice, as well as the comparison of ANN methods with more conventional statistical models. Accordingly, a template is proposed in order to assist the construction of future ANN rainfall-runoff models. Finally, it is suggested that research might focus on the extraction of hydrological ‘rules’ from ANN weights, and on the development of standard performance measures that penalize unnecessary model complexity.

Download Full-text

Evolutionary-Fuzzy-Integral-Based Convolutional Neural Networks for Facial Image Classification

Electronics ◽

10.3390/electronics8090997 ◽

2019 ◽

Vol 8 (9) ◽

pp. 997 ◽

Cited By ~ 4

Author(s):

Lin ◽

Sun ◽

Wang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Network Architecture ◽

Optimization Methods ◽

Optimization Method ◽

Fuzzy Integral ◽

Network Architectures ◽

Age And Gender ◽

Density Values ◽

And Gender

Various optimization methods and network architectures are used by convolutional neural networks (CNNs). Each optimization method and network architecture style have their own advantages and representation abilities. To make the most of these advantages, evolutionary-fuzzy-integral-based convolutional neural networks (EFI-CNNs) are proposed in this paper. The proposed EFI-CNNs were verified by way of face classification of age and gender. The trained CNNs’ outputs were set as inputs of a fuzzy integral. The classification results were operated using either Sugeno or Choquet output rules. The conventional fuzzy density values of the fuzzy integral were decided by heuristic experiments. In this paper, particle swarm optimization (PSO) was used to adaptively find optimal fuzzy density values. To combine the advantages of each CNN type, the evaluation of each CNN type in EFI-CNNs is necessary. Three CNN structures, AlexNet, very deep convolutional neural network (VGG16), and GoogLeNet, and three databases, computational intelligence application laboratory (CIA), Morph, and cross-age celebrity dataset (CACD2000), were used in experiments to classify age and gender. The experimental results show that the proposed method achieved 5.95% and 3.1% higher accuracy, respectively, in classifying age and gender.

Download Full-text

Color Space Transformation using Neural Networks

Color and Imaging Conference ◽

10.2352/issn.2169-2629.2019.27.29 ◽

2019 ◽

Vol 2019 (1) ◽

pp. 153-158

Author(s):

Lindsay MacDonald

Keyword(s):

Neural Network ◽

Neural Networks ◽

Color Space ◽

Reflectance Spectra ◽

Network Architectures ◽

Color Spaces ◽

Natural Materials ◽

Space Transformation ◽

Color Space Transformation

We investigated how well a multilayer neural network could implement the mapping between two trichromatic color spaces, specifically from camera R,G,B to tristimulus X,Y,Z. For training the network, a set of 800,000 synthetic reflectance spectra was generated. For testing the network, a set of 8,714 real reflectance spectra was collated from instrumental measurements on textiles, paints and natural materials. Various network architectures were tested, with both linear and sigmoidal activations. Results show that over 85% of all test samples had color errors of less than 1.0 ΔE2000 units, much more accurate than could be achieved by regression.

Download Full-text

TLBO-FLN: Teaching-Learning Based Optimization of Functional Link Neural Networks for Stock Closing Price Prediction

International Journal of Sensors Wireless Communications and Control ◽

10.2174/2210327909666191202113015 ◽

2020 ◽

Vol 10 (4) ◽

pp. 522-532 ◽

Cited By ~ 1

Author(s):

Sarat Chandra Nayak ◽

Subhranginee Das ◽

Mohammad Dilsad Ansari

Keyword(s):

Neural Networks ◽

Computational Cost ◽

Optimization Techniques ◽

Fine Tuning ◽

Functional Link ◽

Price Prediction ◽

Closing Price ◽

Teaching Learning Based Optimization ◽

Artificial Neural ◽

Teaching Learning

Background and Objective: Stock closing price prediction is enormously complicated. Artificial Neural Networks (ANN) are excellent approximation algorithms applied to this area. Several nature-inspired evolutionary optimization techniques are proposed and used in the literature to search the optimum parameters of ANN based forecasting models. However, most of them need fine-tuning of several control parameters as well as algorithm specific parameters to achieve optimal performance. Improper tuning of such parameters either leads toward additional computational cost or local optima. Methods: Teaching Learning Based Optimization (TLBO) is a newly proposed algorithm which does not necessitate any parameters specific to it. The intrinsic capability of Functional Link Artificial Neural Network (FLANN) to recognize the multifaceted nonlinear relationship present in the historical stock data made it popular and got wide applications in the stock market prediction. This article presents a hybrid model termed as Teaching Learning Based Optimization of Functional Neural Networks (TLBO-FLN) by combining the advantages of both TLBO and FLANN. Results and Conclusion: The model is evaluated by predicting the short, medium, and long-term closing prices of four emerging stock markets. The performance of the TLBO-FLN model is measured through Mean Absolute Percentage of Error (MAPE), Average Relative Variance (ARV), and coefficient of determination (R2); compared with that of few other state-of-the-art models similarly trained and found superior.

Download Full-text

Kinetic mixing, dark photons and extra dimensions. Part III. Brane localized dark matter

Journal of High Energy Physics ◽

10.1007/jhep03(2021)173 ◽

2021 ◽

Vol 2021 (3) ◽

Author(s):

Thomas G. Rizzo ◽

George N. Wojcik

Keyword(s):

Dark Matter ◽

Extra Dimensions ◽

Model Building ◽

New Physics ◽

Model Complexity ◽

Complex Scalar ◽

Dark Photon ◽

Complex Models ◽

The Cost ◽

D Model

Abstract Extra dimensions have proven to be a very useful tool in constructing new physics models. In earlier work, we began investigating toy models for the 5-D analog of the kinetic mixing/vector portal scenario where the interactions of dark matter, taken to be, e.g., a complex scalar, with the brane-localized fields of the Standard Model (SM) are mediated by a massive U(1)D dark photon living in the bulk. These models were shown to have many novel features differentiating them from their 4-D analogs and which, in several cases, avoided some well-known 4-D model building constraints. However, these gains were obtained at the cost of the introduction of a fair amount of model complexity, e.g., dark matter Kaluza-Klein excitations. In the present paper, we consider an alternative setup wherein the dark matter and the dark Higgs, responsible for U(1)D breaking, are both localized to the ‘dark’ brane at the opposite end of the 5-D interval from where the SM fields are located with only the dark photon now being a 5-D field. The phenomenology of such a setup is explored for both flat and warped extra dimensions and compared to the previous more complex models.

Download Full-text

Recurrent Neural Networks for Edge Intelligence

ACM Computing Surveys ◽

10.1145/3448974 ◽

2021 ◽

Vol 54 (4) ◽

pp. 1-38

Author(s):

Varsha S. Lalapura ◽

J. Amudha ◽

Hariramn Selvamuruga Satheesh

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Model Performance ◽

Network Architectures ◽

Training Procedure ◽

Resource Constrained ◽

Compression Technique ◽

Creative Art ◽

Comprehensive Survey ◽

Intelligent Models

Recurrent Neural Networks are ubiquitous and pervasive in many artificial intelligence applications such as speech recognition, predictive healthcare, creative art, and so on. Although they provide accurate superior solutions, they pose a massive challenge “training havoc.” Current expansion of IoT demands intelligent models to be deployed at the edge. This is precisely to handle increasing model sizes and complex network architectures. Design efforts to meet these for greater performance have had inverse effects on portability on edge devices with real-time constraints of memory, latency, and energy. This article provides a detailed insight into various compression techniques widely disseminated in the deep learning regime. They have become key in mapping powerful RNNs onto resource-constrained devices. While compression of RNNs is the main focus of the survey, it also highlights challenges encountered while training. The training procedure directly influences model performance and compression alongside. Recent advancements to overcome the training challenges with their strengths and drawbacks are discussed. In short, the survey covers the three-step process, namely, architecture selection, efficient training process, and suitable compression technique applicable to a resource-constrained environment. It is thus one of the comprehensive survey guides a developer can adapt for a time-series problem context and an RNN solution for the edge.

Download Full-text