Toward accurate platform-aware performance modeling for deep neural networks

2021
Vol 21 (1)
pp. 50-61
Author(s):
Chuan-Chi Wang
Ying-Chiao Liao
Ming-Chang Kao
Wen-Yew Liang
Shih-Hao Hung

In this paper, we provide a fine-grained machine learning-based method, PerfNetV2, which improves the accuracy of our previous work for modeling neural network performance on a variety of GPU accelerators. Given an application, the proposed method can be used to predict the inference time and training time of the convolutional neural networks used in the application, which enables the system developer to optimize the performance by choosing the neural networks and/or incorporating hardware accelerators to deliver satisfactory results in time. Furthermore, the proposed method is capable of predicting the performance of an unseen or non-existent device, e.g. a new GPU which has a higher operating frequency, fewer processor cores, but more memory capacity. This allows a system developer to quickly search the hardware design space and/or fine-tune the system configuration. Compared to previous works, PerfNetV2 delivers more accurate results by modeling detailed host-accelerator interactions in executing the full neural networks and by improving the architecture of the machine learning model used in the predictor. Our case studies show that PerfNetV2 yields a mean absolute percentage error within 13.1% on LeNet, AlexNet, and VGG16 on the NVIDIA GTX-1080Ti, while the error rate of a previous work published at ICBD 2018 could be as large as 200%.
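
To make the idea concrete, the sketch below shows what a layer-wise, machine-learning-based latency predictor of this general kind can look like; it uses a scikit-learn regressor with made-up layer features and timings, and is not the actual PerfNetV2 model or data.

```python
# Minimal sketch of a layer-wise performance predictor, not the actual PerfNetV2 model.
# Each convolutional layer is described by hand-picked features; a regressor maps
# features to measured per-layer latency, and whole-network time is the sum of layers.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical training data: [batch, in_ch, out_ch, kernel, feature_map, gpu_freq_mhz, gpu_cores]
X_train = np.array([
    [32,   3,  64, 3, 224, 1582, 3584],
    [32,  64, 128, 3, 112, 1582, 3584],
    [64, 128, 256, 3,  56, 1480, 2560],
    [64, 256, 512, 3,  28, 1480, 2560],
])
y_train = np.array([4.1, 6.8, 9.5, 12.2])  # measured per-layer latency (ms), made up here

model = GradientBoostingRegressor(n_estimators=200).fit(X_train, y_train)

def predict_network_time(layers):
    """Predict total forward time (ms) as the sum of predicted per-layer latencies."""
    return float(np.sum(model.predict(np.array(layers))))

# Predict on an unseen configuration (e.g. higher clock, fewer cores).
vgg_like = [[32, 3, 64, 3, 224, 1700, 2048], [32, 64, 128, 3, 112, 1700, 2048]]
print(predict_network_time(vgg_like))
```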

Author(s):
Migran N. Gevorkyan
Anastasia V. Demidova
Tatiana S. Demidova
Anton A. Sobolev

The article is an overview. We compare current machine learning libraries that can be used for neural network development. The first part of the article gives a brief description of the TensorFlow, PyTorch, Theano, Keras, and SciKit Learn libraries and the SciPy library stack. An overview of the scope of these libraries and their main technical characteristics, such as performance, supported programming languages, and the current state of development, is given. In the second part of the article, the five libraries are compared on the example of a multilayer perceptron applied to the problem of handwritten digit recognition. This problem is well known and well suited for testing different types of neural networks. Training time and classifier accuracy are compared as functions of the number of epochs. The results of the comparison are presented as graphs of training time and accuracy versus the number of epochs, and in tabular form.
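
As an illustration of the kind of benchmark the article describes, a minimal Keras multilayer perceptron for handwritten digit recognition on MNIST might look as follows; the layer sizes and epoch count are assumptions, not the exact configuration used in the article.

```python
# Illustrative sketch: a small multilayer perceptron trained on the MNIST handwritten
# digits with Keras. Hyperparameters are assumptions, not the article's configuration.
import tensorflow as tf
from tensorflow.keras import layers, models

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# history.history holds per-epoch accuracy, which can be plotted against the number
# of epochs to reproduce the kind of comparison curves described in the article.
history = model.fit(x_train, y_train, epochs=10,
                    validation_data=(x_test, y_test))
```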


Electronics
2018
Vol 7 (8)
pp. 130
Author(s):
Yuhwan Ro
Eojin Lee
Jung Ahn

Following trends that emphasize neural networks for machine learning, many studies regarding computing systems have focused on accelerating deep neural networks. These studies often propose utilizing accelerators specialized for neural networks and cluster architectures composed of interconnected accelerator chips. We observed that inter-accelerator communication within a cluster has a significant impact on the training time of the neural network. In this paper, we show the advantages of optical interconnects for multi-chip machine-learning architectures by demonstrating performance improvements obtained by replacing electrical interconnects with optical ones in an existing multi-chip system. We propose a highly practical optical interconnect implementation and devise an arithmetic performance model to fairly assess the impact of optical interconnects on a machine-learning accelerator platform. In our evaluation of nine convolutional neural networks with various input sizes, 100 and 400 Gbps optical interconnects reduce the training time by an average of 20.6% and 35.6%, respectively, compared to the baseline system with 25.6 Gbps electrical ones.
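
A minimal sketch of an arithmetic performance model of this general flavor is shown below; the formula and the workload numbers are simplified assumptions for illustration and do not reproduce the model or the results of the paper.

```python
# Minimal sketch of an arithmetic performance model for a multi-chip accelerator,
# illustrating why interconnect bandwidth affects training time. The formulas and
# numbers are simplified assumptions, not the model or results from the paper.

def step_time_s(flops_per_chip, peak_flops, bytes_exchanged, link_gbps, link_latency_s=1e-6):
    """Per-step time = compute time + inter-chip communication time."""
    compute_s = flops_per_chip / peak_flops
    comm_s = link_latency_s + (bytes_exchanged * 8) / (link_gbps * 1e9)
    return compute_s + comm_s

# Hypothetical workload: 2 TFLOPs of work per chip, 256 MB of gradients exchanged per step.
work = dict(flops_per_chip=2e12, peak_flops=100e12, bytes_exchanged=256e6)

electrical = step_time_s(**work, link_gbps=25.6)
optical_100 = step_time_s(**work, link_gbps=100)
optical_400 = step_time_s(**work, link_gbps=400)

for name, t in [("25.6 Gbps electrical", electrical),
                ("100 Gbps optical", optical_100),
                ("400 Gbps optical", optical_400)]:
    print(f"{name}: {t*1e3:.2f} ms/step, "
          f"{100 * (1 - t / electrical):.1f}% faster than baseline")
```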


2021
Vol 0 (0)
Author(s):
Idris Kharroubi
Thomas Lim
Xavier Warin

Abstract We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at the times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh of the grid goes to zero. We then focus on the approximation of the discretely constrained BSDE. For that, we adopt a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks under constraints on the neural network and its derivative. We then derive an algorithm converging to the discretely constrained BSDE as the number of neurons goes to infinity. We end with numerical experiments.
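
The sketch below illustrates, in a highly simplified form, the generic idea of approximating a function with a neural network under penalized constraints on the network output and its derivative; it is not the authors' facelift or BSDE scheme, and the target function and penalty weights are arbitrary placeholders.

```python
# Highly simplified sketch: fit a neural network to a target while penalizing
# violations of constraints on the output and on its derivative. This is NOT the
# authors' facelift/BSDE scheme; target, constraints and weights are illustrative.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def target(x):
    return torch.sin(3 * x)  # placeholder for the quantity to approximate

for step in range(2000):
    x = torch.empty(256, 1).uniform_(-1, 1).requires_grad_(True)
    y = net(x)
    # derivative of the network output with respect to its input
    dy_dx = torch.autograd.grad(y.sum(), x, create_graph=True)[0]

    fit_loss = ((y - target(x)) ** 2).mean()
    # example constraints: output bounded below by -0.5, derivative bounded by 2
    output_pen = torch.relu(-0.5 - y).pow(2).mean()
    deriv_pen = torch.relu(dy_dx.abs() - 2.0).pow(2).mean()

    loss = fit_loss + 10.0 * (output_pen + deriv_pen)
    opt.zero_grad()
    loss.backward()
    opt.step()
```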


Entropy
2021
Vol 23 (4)
pp. 460
Author(s):
Samuel Yen-Chi Chen
Shinjae Yoo

Distributed training across several quantum computers could significantly improve the training time, and, if the learned model rather than the data were shared, it could potentially improve data privacy, as the training would happen where the data are located. One potential scheme to achieve this property is federated learning (FL), in which several clients or local nodes learn on their own data and a central node aggregates the models collected from those local nodes. However, to the best of our knowledge, no work has yet been done on quantum machine learning (QML) in a federated setting. In this work, we present federated training of hybrid quantum-classical machine learning models, although our framework could be generalized to pure quantum machine learning models. Specifically, we consider a quantum neural network (QNN) coupled with a classical pre-trained convolutional model. Our distributed federated learning scheme achieved almost the same trained-model accuracy while training significantly faster. This demonstrates a promising future research direction for scaling and privacy.
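
The core aggregation step behind such federated schemes is federated averaging: each client trains locally and only model parameters are shared. The sketch below is a purely classical illustration of that step and does not reproduce the hybrid quantum-classical model of the paper.

```python
# Minimal sketch of federated averaging: each client trains locally, and only the
# model parameters (never the data) are sent to a central node and averaged.
import copy
import torch
import torch.nn as nn

def local_update(model, data_loader, epochs=1, lr=0.01):
    """Train a copy of the global model on one client's private data."""
    local = copy.deepcopy(model)
    opt = torch.optim.SGD(local.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(local(x), y).backward()
            opt.step()
    return local.state_dict()

def federated_average(state_dicts):
    """Central node: average the parameters collected from the local nodes."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# One communication round (client_loaders is a list of per-client DataLoaders):
# global_model.load_state_dict(
#     federated_average([local_update(global_model, dl) for dl in client_loaders]))
```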


2020
Vol 6
Author(s):
Jaime de Miguel Rodríguez
Maria Eugenia Villafañe
Luka Piškorec
Fernando Sancho Caparrini

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a 'connectivity map' that is especially suited to expressing the wireframe objects that compose it. Additionally, the input samples are generated through 'parametric augmentation', a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features of a given building type. In the experiments described in this paper, more than 150k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
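
For readers unfamiliar with VAEs, the sketch below shows a minimal variational autoencoder together with latent-space interpolation between two samples; the input encoding and layer sizes are placeholders, not the 'connectivity map' representation used in this work.

```python
# Minimal sketch of a variational autoencoder and latent-space interpolation, the two
# ingredients used to generate hybrid geometries. Dimensions are placeholders only.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=1024, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, input_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    recon_loss = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl

def interpolate(model, x_a, x_b, steps=10):
    """Decode evenly spaced points between the latent codes of two samples."""
    mu_a = model.to_mu(model.encoder(x_a))
    mu_b = model.to_mu(model.encoder(x_b))
    alphas = torch.linspace(0, 1, steps).unsqueeze(1)
    return model.decoder((1 - alphas) * mu_a + alphas * mu_b)
```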


2019
Vol 2019 (02)
pp. 89-98
Author(s):
Vijayakumar T

Predicting the category of a tumor and the type of cancer at an early stage remains an essential process for identifying the severity of the disease and the treatments available for it. Neural networks, which function similarly to the human nervous system, are widely utilized in tumor investigation and cancer prediction. The paper presents an analysis of the performance of neural networks such as the FNN (feed-forward neural network), the RNN (recurrent neural network), and the CNN (convolutional neural network) in investigating tumors and predicting cancer. The results obtained by evaluating the neural networks on the original Wisconsin breast cancer data set show that the CNN provides 43% better prediction than the FNN and 25% better prediction than the RNN.
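
As a simple illustration of evaluating a neural classifier on this data set, the sketch below trains a small feed-forward network on the Wisconsin breast cancer data shipped with scikit-learn; the architecture and split are assumptions, and the paper's FNN/RNN/CNN comparison is not reproduced here.

```python
# Illustrative sketch: a simple feed-forward classifier on the Wisconsin breast cancer
# data set (bundled with scikit-learn). Architecture and split are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

scaler = StandardScaler().fit(X_train)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
clf.fit(scaler.transform(X_train), y_train)

pred = clf.predict(scaler.transform(X_test))
print(f"feed-forward network accuracy: {accuracy_score(y_test, pred):.3f}")
```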


2007
Vol 19 (8)
pp. 2149-2182
Author(s):
Zhigang Zeng
Jun Wang

In this letter, some sufficient conditions are obtained to guarantee that recurrent neural networks with linear saturation activation functions and time-varying delays have multiequilibria located in the saturation region and on the boundaries of the saturation region. These results on pattern characterization are used to analyze and design autoassociative memories, which are directly based on the parameters of the neural networks. Moreover, a formula for the number of spurious equilibria is also derived. Four design procedures for recurrent neural networks with linear saturation activation functions and time-varying delays are developed based on stability results. Two of these procedures allow the neural network to be capable of learning and forgetting. Finally, simulation results demonstrate the validity and characteristics of the proposed approach.
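
The linear saturation activation referred to here is the standard piecewise-linear function that is the identity on [-1, 1] and clamped outside it; the sketch below shows it together with an illustrative recurrent dynamics (weights and inputs chosen for this example, not taken from the letter) that settles at an equilibrium in the saturation region.

```python
# Linear saturation activation and a small illustrative recurrent dynamics.
# The weights and input below are assumptions for demonstration only.
import numpy as np

def sat(x):
    """Linear saturation: sat(x) = -1 for x < -1, x for |x| <= 1, 1 for x > 1."""
    return np.clip(x, -1.0, 1.0)

# Euler-discretized dynamics dx/dt = -x + W sat(x) + u (no delays here).
W = np.array([[2.0, 0.3], [0.3, 2.0]])
u = np.array([0.1, -0.2])
x = np.zeros(2)
dt = 0.01
for _ in range(1000):
    x = x + dt * (-x + W @ sat(x) + u)
print(x)  # the state settles at an equilibrium lying in the saturation region
```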


Author(s):  
T.K. Biryukova

Classic neural networks assume that the trainable parameters include only the weights of the neurons. This paper proposes parabolic integrodifferential splines (ID-splines), developed by the author, as a new kind of activation function (AF) for neural networks, where the ID-spline coefficients are also trainable parameters. The parameters of the ID-spline AF, together with the weights of the neurons, are varied during training in order to minimize the loss function, thus reducing the training time and increasing the operation speed of the neural network. The newly developed algorithm enables software implementation of the ID-spline AF as a tool for neural network construction, training and operation. It is proposed to use the same ID-spline AF for all neurons in the same layer, but different AFs for different layers. In this case, the parameters of the ID-spline AF for a particular layer change during the training process independently of the activation functions (AFs) of the other network layers. In order to comply with the continuity condition for the derivative of the parabolic ID-spline on the interval (x_0, x_n), its parameters f_i (i = 0, ..., n) should be calculated from a tridiagonal system of linear algebraic equations. To solve the system it is necessary to use two more equations arising from the boundary conditions of the specific problem; for example, the values of the grid function (if they are known) at the points x_0 and x_n may be used: f_0 = f(x_0), f_n = f(x_n). The parameters I_{i,i+1} (i = 0, ..., n-1) are used as trainable parameters of the neural network. The grid boundaries and the spacing of the nodes of the ID-spline AF are best chosen experimentally. The optimal selection of grid nodes improves the quality of the results produced by the neural network. The formula for a parabolic ID-spline is such that the complexity of the calculations does not depend on whether the grid of nodes is uniform or non-uniform. An experimental comparison of image classification results on the popular FashionMNIST dataset was carried out for convolutional neural networks with ID-spline AFs and with the well-known ReLU AF, ReLU(x) = 0 for x < 0 and ReLU(x) = x for x >= 0. The results reveal that using the ID-spline AFs provides better accuracy of neural network operation than the ReLU AF. The training time for a network with two convolutional layers and two ID-spline AFs is only about 2 times longer than with two instances of the ReLU AF. Doubling of the training time due to the complexity of the ID-spline formula is an acceptable price for the significantly better accuracy of the network, while the difference in operation speed between networks with ID-spline and ReLU AFs is negligible. The use of trainable ID-spline AFs makes it possible to simplify the architecture of neural networks without losing their efficiency. The modification of well-known neural networks (ResNet etc.) by replacing traditional AFs with ID-spline AFs is a promising approach to increasing the accuracy of neural network operation. In the majority of cases, such a substitution does not require training the network from scratch, because it allows the use of neuron weights pre-trained on large datasets and supplied by standard software libraries for neural network construction, thus substantially shortening the training time.
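
The sketch below illustrates the general idea of an activation function whose coefficients are trained together with the neuron weights, using a simple piecewise-linear analogue on a fixed grid; it is not the author's parabolic ID-spline, and the grid range and node count are assumptions.

```python
# Simplified sketch of an activation function with trainable coefficients: a
# piecewise-linear function with learnable node values on a fixed grid. This is an
# analogue for illustration only, NOT the author's parabolic ID-spline.
import torch
import torch.nn as nn

class TrainablePiecewiseActivation(nn.Module):
    def __init__(self, x_min=-3.0, x_max=3.0, n_nodes=11):
        super().__init__()
        self.x_min, self.x_max = x_min, x_max
        self.register_buffer("grid", torch.linspace(x_min, x_max, n_nodes))
        # node values are trainable, initialized to ReLU sampled on the grid
        self.values = nn.Parameter(torch.relu(torch.linspace(x_min, x_max, n_nodes)))

    def forward(self, x):
        x_clamped = x.clamp(self.x_min, self.x_max)
        step = self.grid[1] - self.grid[0]
        idx = ((x_clamped - self.grid[0]) / step).floor().long().clamp(max=len(self.grid) - 2)
        # linear interpolation between the two nearest trainable node values
        t = (x_clamped - self.grid[idx]) / step
        return (1 - t) * self.values[idx] + t * self.values[idx + 1]

# Used like any other activation; its node values are updated by the optimizer
# together with the layer weights, one instance per layer as suggested above.
layer = nn.Sequential(nn.Linear(16, 32), TrainablePiecewiseActivation(), nn.Linear(32, 10))
out = layer(torch.randn(4, 16))
```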


Author(s):  
Daniela Danciu

Neural networks, both natural and artificial, are characterized by two kinds of dynamics. The first is what we would call "learning dynamics". The second is the intrinsic dynamics of the neural network viewed as a dynamical system after the weights have been established via learning. The chapter deals with the second kind of dynamics. More precisely, since the emergent computational capabilities of a recurrent neural network can be achieved provided it has suitable dynamical properties when viewed as a system with several equilibria, the chapter deals with the qualitative properties connected to the achievement of such dynamical properties as global asymptotics and gradient-like behavior. In the case of neural networks with delays, these aspects are reformulated in accordance with the state of the art of the theory of time-delay dynamical systems.


2022
pp. 1559-1575
Author(s):
Mário Pereira Véstias

Machine learning is the study of algorithms and models for computing systems to do tasks based on pattern identification and inference. When it is difficult or infeasible to develop an algorithm to do a particular task, machine learning algorithms can provide an output based on previous training data. A well-known machine learning model is deep learning. The most recent deep learning models are based on artificial neural networks (ANN). There exist several types of artificial neural networks, including the feedforward neural network, the Kohonen self-organizing neural network, the recurrent neural network, the convolutional neural network, and the modular neural network, among others. This article focuses on convolutional neural networks, with a description of the model, the training and inference processes, and its applicability. It will also give an overview of the most used CNN models and what to expect from the next generation of CNN models.
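
A minimal sketch of the CNN structure the article describes, convolution and pooling layers for feature extraction followed by fully connected layers for classification, is given below; the layer sizes are illustrative only.

```python
# Minimal sketch of a convolutional neural network: convolution + pooling layers for
# feature extraction, then fully connected layers for classification. Sizes are
# illustrative only, not a model from the article.
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 14x14 -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 7 * 7, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Inference on a batch of four 28x28 grayscale images:
model = SimpleCNN()
logits = model(torch.randn(4, 1, 28, 28))
print(logits.shape)  # torch.Size([4, 10])
```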

