Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks

With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions—one with ever-increasing connection density for better accuracy and the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accelerator. The contribution of this work is threefold. First, we illustrate that the point-to-point (P2P)-based interconnect is incapable of handling a high volume of on-chip data movement for DNNs. Second, we evaluate P2P and network-on-chip (NoC) interconnect (with a regular topology such as a mesh) for SRAM- and ReRAM-based in-memory computing (IMC) architectures for a range of DNNs. This analysis shows the necessity for the optimal interconnect choice for an IMC DNN accelerator. Finally, we perform an experimental evaluation for different DNNs to empirically obtain the performance of the IMC architecture with both NoC-tree and NoC-mesh. We conclude that, at the tile level, NoC-tree is appropriate for compact DNNs employed at the edge, and NoC-mesh is necessary to accelerate DNNs with high connection density. Furthermore, we propose a technique to determine the optimal choice of interconnect for any given DNN. In this technique, we use analytical models of NoC to evaluate end-to-end communication latency of any given DNN. We demonstrate that the interconnect optimization in the IMC architecture results in up to 6 × improvement in energy-delay-area product for VGG-19 inference compared to the state-of-the-art ReRAM-based IMC architectures.

Download Full-text

A Scalable System-on-Chip Acceleration for Deep Neural Networks

IEEE Access ◽

10.1109/access.2021.3094675 ◽

2021 ◽

pp. 1-1

Author(s):

Faisal Shehzad ◽

Muhammad Rashid ◽

Mohammed H Sinky ◽

Saud S Alotaibi ◽

Muhammad Yousuf Irfan Zia

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

System On Chip ◽

Scalable System ◽

On Chip

Download Full-text

Tuning Hyperparameters of Machine Learning Algorithms and Deep Neural Networks Using Metaheuristics: A Bioinformatics Study on Biomedical and Biological Cases

Computational Biology and Chemistry ◽

10.1016/j.compbiolchem.2021.107619 ◽

2021 ◽

pp. 107619

Author(s):

Sajjad Nematzadeh ◽

Farzad Kiani ◽

Mahsa Torkamanian-Afshar ◽

Nizamettin Aydin

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Bioinformatics Study

Download Full-text

EMOTIONS RECOGNITION IN HUMAN SPEECH USING DEEP NEURAL NETWORKS

Vestnik komp iuternykh i informatsionnykh tekhnologii ◽

10.14489/vkit.2021.01.pp.044-051 ◽

2021 ◽

pp. 44-51

Author(s):

E. Yu. Shchetinin

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Convolutional Neural Network ◽

Recurrent Neural Network ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Audio Recordings ◽

Computer Studies

The recognition of human emotions is one of the most relevant and dynamically developing areas of modern speech technologies, and the recognition of emotions in speech (RER) is the most demanded part of them. In this paper, we propose a computer model of emotion recognition based on an ensemble of bidirectional recurrent neural network with LSTM memory cell and deep convolutional neural network ResNet18. In this paper, computer studies of the RAVDESS database containing emotional speech of a person are carried out. RAVDESS-a data set containing 7356 files. Entries contain the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions divided into male and female) for a total of 1440 samples (speech only). To train machine learning algorithms and deep neural networks to recognize emotions, existing audio recordings must be pre-processed in such a way as to extract the main characteristic features of certain emotions. This was done using Mel-frequency cepstral coefficients, chroma coefficients, as well as the characteristics of the frequency spectrum of audio recordings. In this paper, computer studies of various models of neural networks for emotion recognition are carried out on the example of the data described above. In addition, machine learning algorithms were used for comparative analysis. Thus, the following models were trained during the experiments: logistic regression (LR), classifier based on the support vector machine (SVM), decision tree (DT), random forest (RF), gradient boosting over trees – XGBoost, convolutional neural network CNN, recurrent neural network RNN (ResNet18), as well as an ensemble of convolutional and recurrent networks Stacked CNN-RNN. The results show that neural networks showed much higher accuracy in recognizing and classifying emotions than the machine learning algorithms used. Of the three neural network models presented, the CNN + BLSTM ensemble showed higher accuracy.

Download Full-text

Power, Performance, and Area Benefit of Monolithic 3D ICs for On-Chip Deep Neural Networks Targeting Speech Recognition

ACM Journal on Emerging Technologies in Computing Systems ◽

10.1145/3273956 ◽

2018 ◽

Vol 14 (4) ◽

pp. 1-19

Author(s):

Kyungwook Chang ◽

Deepak Kadetotad ◽

Yu Cao ◽

Jae-Sun Seo ◽

Sung Kyu Lim

Keyword(s):

Neural Networks ◽

Speech Recognition ◽

Deep Neural Networks ◽

Power Performance ◽

3D Ics ◽

On Chip

Download Full-text

Detecting Emotions in English and Arabic Tweets

Information ◽

10.3390/info10030098 ◽

2019 ◽

Vol 10 (3) ◽

pp. 98 ◽

Cited By ~ 4

Author(s):

Tariq Ahmad ◽

Allan Ramsay ◽

Hanady Ahmed

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

State Of The Art ◽

Learning Algorithms ◽

General Purpose ◽

Machine Learning Algorithms ◽

Current State ◽

Optimal Thresholds ◽

Alternative Approach

Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, but the current state-of-the-art solutions use deep neural networks (DNNs). As such, it seems likely that standard machine learning algorithms, such as these, will provide an effective approach. We describe an alternative approach, involving the use of probabilities to construct a weighted lexicon of sentiment terms, then modifying the lexicon and calculating optimal thresholds for each class. We show that this approach outperforms the use of DNNs and other standard algorithms. We believe that DNNs are not a universal panacea and that paying attention to the nature of the data that you are trying to learn from can be more important than trying out ever more powerful general purpose machine learning algorithms.

Download Full-text

Fully Convolutional Deep Neural Networks with Optimized Hyperparameters for Detection of Shockable and Non-Shockable Rhythms

Sensors ◽

10.3390/s20102875 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2875 ◽

Cited By ~ 1

Author(s):

Vessela Krasteva ◽

Sarah Ménétré ◽

Jean-Philippe Didon ◽

Irena Jekova

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Random Search ◽

Computational Cost ◽

Dropout Rate ◽

Machine Learning Algorithms ◽

Dense Layer ◽

Max Pooling ◽

Resuscitation Guidelines ◽

Advisory Systems

Deep neural networks (DNN) are state-of-the-art machine learning algorithms that can be learned to self-extract significant features of the electrocardiogram (ECG) and can generally provide high-output diagnostic accuracy if subjected to robust training and optimization on large datasets at high computational cost. So far, limited research and optimization of DNNs in shock advisory systems is found on large ECG arrhythmia databases from out-of-hospital cardiac arrests (OHCA). The objective of this study is to optimize the hyperparameters (HPs) of deep convolutional neural networks (CNN) for detection of shockable (Sh) and nonshockable (NSh) rhythms, and to validate the best HP settings for short and long analysis durations (2–10 s). Large numbers of (Sh + NSh) ECG samples were used for training (720 + 3170) and validation (739 + 5921) from Holters and defibrillators in OHCA. An end-to-end deep CNN architecture was implemented with one-lead raw ECG input layer (5 s, 125 Hz, 2.5 uV/LSB), configurable number of 5 to 23 hidden layers and output layer with diagnostic probability p ∈ [0: Sh,1: NSh]. The hidden layers contain N convolutional blocks × 3 layers (Conv1D (filters = Fi, kernel size = Ki), max-pooling (pool size = 2), dropout (rate = 0.3)), one global max-pooling and one dense layer. Random search optimization of HPs = {N, Fi, Ki}, i = 1, … N in a large grid of N = [1, 2, … 7], Fi = [5;50], Ki = [5;100] was performed. During training, the model with maximal balanced accuracy BAC = (Sensitivity + Specificity)/2 over 400 epochs was stored. The optimization principle is based on finding the common HPs space of a few top-ranked models and prediction of a robust HP setting by their median value. The optimal models for 1–7 CNN layers were trained with different learning rates LR = [10−5; 10−2] and the best model was finally validated on 2–10 s analysis durations. A number of 4216 random search models were trained. The optimal models with more than three convolutional layers did not exhibit substantial differences in performance BAC = (99.31–99.5%). Among them, the best model was found with {N = 5, Fi = {20, 15, 15, 10, 5}, Ki = {10, 10, 10, 10, 10}, 7521 trainable parameters} with maximal validation performance for 5-s analysis (BAC = 99.5%, Se = 99.6%, Sp = 99.4%) and tolerable drop in performance (<2% points) for very short 2-s analysis (BAC = 98.2%, Se = 97.6%, Sp = 98.7%). DNN application in future-generation shock advisory systems can improve the detection performance of Sh and NSh rhythms and can considerably shorten the analysis duration complying with resuscitation guidelines for minimal hands-off pauses.

Download Full-text

An Analysis of Machine Learning Algorithms and Deep Neural Networks for Email Spam Classification using Natural Language Processing

10.1109/soli54607.2021.9672398 ◽

2021 ◽

Author(s):

Md. Mohidul Hasan ◽

Syed Mahbubuz Zaman ◽

Md. Asif Talukdar ◽

Ayesha Siddika ◽

Md. Golam Rabiul Alam

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Email Spam

Download Full-text

Taala Classification in Carnatic Music Using Machine Learning Algorithms and Deep Neural Networks

Advanced Computing and Intelligent Technologies - Lecture Notes in Networks and Systems ◽

10.1007/978-981-16-2164-2_29 ◽

2021 ◽

pp. 355-371

Author(s):

Amrutha Jayakumar ◽

Maya L. Pai

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Carnatic Music

Download Full-text

Analyzing networks-on-chip based deep neural networks

Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip - NOCS '19 ◽

10.1145/3313231.3352375 ◽

2019 ◽

Cited By ~ 2

Author(s):

Giuseppe Ascia ◽

Vincenzo Catania ◽

Salvatore Monteleone ◽

Maurizio Palesi ◽

Davide Patti ◽

...

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Networks On Chip ◽

On Chip

Download Full-text

SIAM: Chiplet-based Scalable In-Memory Acceleration with Mesh for Deep Neural Networks

ACM Transactions on Embedded Computing Systems ◽

10.1145/3476999 ◽

2021 ◽

Vol 20 (5s) ◽

pp. 1-24

Author(s):

Gokul Krishnan ◽

Sumit K. Mandal ◽

Manvitha Pannala ◽

Chaitali Chakrabarti ◽

Jae-Sun Seo ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Design Space Exploration ◽

Deep Neural Networks ◽

Feasible Solution ◽

Computing System ◽

Efficient Design ◽

Simulation Speed ◽

Wide Range ◽

On Chip

In-memory computing (IMC) on a monolithic chip for deep learning faces dramatic challenges on area, yield, and on-chip interconnection cost due to the ever-increasing model sizes. 2.5D integration or chiplet-based architectures interconnect multiple small chips (i.e., chiplets) to form a large computing system, presenting a feasible solution beyond a monolithic IMC architecture to accelerate large deep learning models. This paper presents a new benchmarking simulator, SIAM, to evaluate the performance of chiplet-based IMC architectures and explore the potential of such a paradigm shift in IMC architecture design. SIAM integrates device, circuit, architecture, network-on-chip (NoC), network-on-package (NoP), and DRAM access models to realize an end-to-end system. SIAM is scalable in its support of a wide range of deep neural networks (DNNs), customizable to various network structures and configurations, and capable of efficient design space exploration. We demonstrate the flexibility, scalability, and simulation speed of SIAM by benchmarking different state-of-the-art DNNs with CIFAR-10, CIFAR-100, and ImageNet datasets. We further calibrate the simulation results with a published silicon result, SIMBA. The chiplet-based IMC architecture obtained through SIAM shows 130 and 72 improvement in energy-efficiency for ResNet-50 on the ImageNet dataset compared to Nvidia V100 and T4 GPUs.

Download Full-text