Maximum entropy methods for extracting the learned features of deep neural networks

Abstract:

New architectures of multilayer artificial neural networks and new methods for training them are rapidly revolutionizing the application of machine learning in diverse fields, including business, social science, physical sciences, and biology. Interpreting deep neural networks, however, currently remains elusive, and a critical challenge lies in understanding which meaningful features a network is actually learning. We present a general method for interpreting deep neural networks and extracting network-learned features from input data. We describe our algorithm in the context of biological sequence analysis. Our approach, based on ideas from statistical physics, samples from the maximum entropy distribution over possible sequences, anchored at an input sequence and subject to constraints implied by the empirical function learned by a network. Using our framework, we demonstrate that local transcription factor binding motifs can be identified from a network trained on ChIP-seq data and that nucleosome positioning signals are indeed learned by a network trained on chemical cleavage nucleosome maps. Imposing a further constraint on the maximum entropy distribution also allows us to probe whether a network is learning global sequence features, such as the high GC content in nucleosome-rich regions. This work thus provides valuable mathematical tools for interpreting and extracting learned features from feed-forward neural networks.
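The sampling idea in this abstract can be illustrated with a small Metropolis sampler. This is only a sketch under stated assumptions, not the authors' implementation: it assumes the maximum entropy distribution takes a Boltzmann-like form, with probability proportional to exp(-beta * d), where d is the distance between the network's output for a candidate sequence and its output for the anchoring input sequence. The names `toy_network`, `distance`, `maxent_sample`, and the parameter `beta` are hypothetical, and `toy_network` is a hand-written stand-in for a trained network.

```python
import math
import random

ALPHABET = "ACGT"

def toy_network(seq):
    """Hypothetical stand-in for a trained network's learned function.

    Returns two hand-written features: GC fraction and "TATA" motif count.
    """
    gc_fraction = sum(base in "GC" for base in seq) / len(seq)
    motif_count = float(seq.count("TATA"))
    return (gc_fraction, motif_count)

def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def maxent_sample(x0, network, beta=20.0, n_steps=2000, seed=0):
    """Metropolis sampling from p(x) proportional to exp(-beta * d(network(x), network(x0))).

    Assumption: this Boltzmann-like form stands in for the constrained
    maximum entropy distribution anchored at the input sequence x0.
    """
    rng = random.Random(seed)
    target = network(x0)
    current = list(x0)
    energy = distance(network("".join(current)), target)
    samples = []
    for _ in range(n_steps):
        # Symmetric proposal: redraw one position uniformly from the alphabet.
        pos = rng.randrange(len(current))
        old_base = current[pos]
        current[pos] = rng.choice(ALPHABET)
        new_energy = distance(network("".join(current)), target)
        accept = new_energy <= energy or rng.random() < math.exp(-beta * (new_energy - energy))
        if accept:
            energy = new_energy
        else:
            current[pos] = old_base  # reject: restore the previous base
        samples.append("".join(current))
    return samples

samples = maxent_sample("GCTATAGCGC", toy_network, n_steps=500)
print(samples[-1])
```

Positions that stay conserved across the samples are the ones the stand-in function depends on; positions that vary freely are ones it ignores. In this toy case the conserved signal would be the "TATA" motif and the overall GC fraction.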

PLoS Computational Biology ◽

10.1371/journal.pcbi.1005836 ◽

2017 ◽

Vol 13 (10) ◽

pp. e1005836 ◽

Cited By ~ 17

Author(s):

Alex Finnegan ◽

Jun S. Song

Keyword(s):

Neural Networks ◽

Maximum Entropy ◽

Deep Neural Networks ◽

Entropy Methods ◽

Learned Features

A Scalable System-on-Chip Acceleration for Deep Neural Networks

IEEE Access ◽

10.1109/access.2021.3094675 ◽

2021 ◽

pp. 1-1

Author(s):

Faisal Shehzad ◽

Muhammad Rashid ◽

Mohammed H Sinky ◽

Saud S Alotaibi ◽

Muhammad Yousuf Irfan Zia

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

System On Chip ◽

Scalable System ◽

On Chip

Power, Performance, and Area Benefit of Monolithic 3D ICs for On-Chip Deep Neural Networks Targeting Speech Recognition

ACM Journal on Emerging Technologies in Computing Systems ◽

10.1145/3273956 ◽

2018 ◽

Vol 14 (4) ◽

pp. 1-19

Author(s):

Kyungwook Chang ◽

Deepak Kadetotad ◽

Yu Cao ◽

Jae-Sun Seo ◽

Sung Kyu Lim

Keyword(s):

Neural Networks ◽

Speech Recognition ◽

Deep Neural Networks ◽

Power Performance ◽

3D ICs ◽

On Chip

Analyzing networks-on-chip based deep neural networks

Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip - NOCS '19 ◽

10.1145/3313231.3352375 ◽

2019 ◽

Cited By ~ 2

Author(s):

Giuseppe Ascia ◽

Vincenzo Catania ◽

Salvatore Monteleone ◽

Maurizio Palesi ◽

Davide Patti ◽

...

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Networks On Chip ◽

On Chip

SIAM: Chiplet-based Scalable In-Memory Acceleration with Mesh for Deep Neural Networks

ACM Transactions on Embedded Computing Systems ◽

10.1145/3476999 ◽

2021 ◽

Vol 20 (5s) ◽

pp. 1-24

Author(s):

Gokul Krishnan ◽

Sumit K. Mandal ◽

Manvitha Pannala ◽

Chaitali Chakrabarti ◽

Jae-Sun Seo ◽

...

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Design Space Exploration ◽

Deep Neural Networks ◽

Feasible Solution ◽

Computing System ◽

Efficient Design ◽

Simulation Speed ◽

Wide Range ◽

On Chip

In-memory computing (IMC) on a monolithic chip for deep learning faces dramatic challenges on area, yield, and on-chip interconnection cost due to the ever-increasing model sizes. 2.5D integration or chiplet-based architectures interconnect multiple small chips (i.e., chiplets) to form a large computing system, presenting a feasible solution beyond a monolithic IMC architecture to accelerate large deep learning models. This paper presents a new benchmarking simulator, SIAM, to evaluate the performance of chiplet-based IMC architectures and explore the potential of such a paradigm shift in IMC architecture design. SIAM integrates device, circuit, architecture, network-on-chip (NoC), network-on-package (NoP), and DRAM access models to realize an end-to-end system. SIAM is scalable in its support of a wide range of deep neural networks (DNNs), customizable to various network structures and configurations, and capable of efficient design space exploration. We demonstrate the flexibility, scalability, and simulation speed of SIAM by benchmarking different state-of-the-art DNNs with CIFAR-10, CIFAR-100, and ImageNet datasets. We further calibrate the simulation results with a published silicon result, SIMBA. The chiplet-based IMC architecture obtained through SIAM shows 130× and 72× improvement in energy-efficiency for ResNet-50 on the ImageNet dataset compared to Nvidia V100 and T4 GPUs, respectively.

Broad Autoencoder Features Learning for Classification Problem

International Journal of Cognitive Informatics and Natural Intelligence ◽

10.4018/ijcini.20211001oa10 ◽

2021 ◽

Vol 15 (4) ◽

pp. 0-0

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Classification Problem ◽

Activation Function ◽

Activation Functions ◽

Classification Problems ◽

Stacked Autoencoders ◽

Learned Features ◽

Sigmoid Functions ◽

Nonlinear Mappings

Activation functions such as Tanh and Sigmoid are widely used in Deep Neural Networks (DNNs) and pattern classification problems. To take advantage of different activation functions, the Broad Autoencoder Features (BAF) approach is proposed in this work. The BAF consists of four parallel-connected Stacked Autoencoders (SAEs), each of which uses a different activation function: Sigmoid, Tanh, ReLU, or Softplus. With this broad setting, the final learned features merge the features produced by the various nonlinear mappings of the original input features. This helps to extract more information from the original input features. Experimental results show that the BAF yields better learned features and better classification performance.
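The parallel structure described in this abstract can be sketched in a few lines. This is an illustration under assumptions rather than the paper's implementation: each branch is reduced to a single encoder layer with random, untrained weights (a real stacked autoencoder has several trained layers), and the "merge" step is assumed to be plain concatenation. The names `make_encoder` and `broad_features` are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def relu(z):
    return max(0.0, z)

def softplus(z):
    # Stable form of log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def make_encoder(n_in, n_hidden, activation, rng):
    """Build one branch: a single dense layer with random (untrained) weights."""
    weights = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [0.0] * n_hidden

    def encode(x):
        return [
            activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)
        ]

    return encode

def broad_features(x, encoders):
    """Run every branch on the same input and concatenate the outputs.

    Assumption: concatenation stands in for the abstract's merge step.
    """
    features = []
    for encode in encoders:
        features.extend(encode(x))
    return features

rng = random.Random(0)
encoders = [make_encoder(4, 3, act, rng) for act in (sigmoid, math.tanh, relu, softplus)]
features = broad_features([0.2, -0.1, 0.5, 1.0], encoders)
print(len(features))  # 4 branches x 3 hidden units each
```

A downstream classifier would then be trained on the concatenated vector, so it sees the same input through four different nonlinearities at once.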

On-chip training of memristor based deep neural networks

2017 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2017.7966300 ◽

2017 ◽

Cited By ~ 15

Author(s):

Raqibul Hasan ◽

Tarek M. Taha ◽

Chris Yakopcic

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

On Chip

DANoC: An Efficient Algorithm and Hardware Codesign of Deep Neural Networks on Chip

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2017.2717442 ◽

2017 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Xichuan Zhou ◽

Shengli Li ◽

Fang Tang ◽

Shengdong Hu ◽

Zhi Lin ◽

...

Keyword(s):

Neural Networks ◽

Efficient Algorithm ◽

Deep Neural Networks ◽

Networks On Chip ◽

On Chip

Probabilistic Models with Deep Neural Networks

Entropy ◽

10.3390/e23010117 ◽

2021 ◽

Vol 23 (1) ◽

pp. 117

Author(s):

Andrés R. Masegosa ◽

Rafael Cabañas ◽

Helge Langseth ◽

Thomas D. Nielsen ◽

Antonio Salmerón

Keyword(s):

Neural Networks ◽

Learning Community ◽

Statistical Physics ◽

Deep Neural Networks ◽

Probabilistic Models ◽

Broad Class ◽

Probabilistic Inference ◽

Probabilistic Modeling ◽

Stochastic Gradient Descent ◽

Modeling Framework

Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to very restricted model classes, where exact or approximate probabilistic inference is feasible. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, have enabled probabilistic modeling to overcome these limitations: (i) Approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computing engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility to include deep neural networks within probabilistic models, thereby capturing complex non-linear stochastic relationships between the random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of applications of probabilistic models, and allowed the models to take advantage of the recent strides made by the deep learning community. In this paper, we provide an overview of the main concepts, methods, and tools needed to use deep neural networks within a probabilistic modeling framework.
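For readers unfamiliar with variational inference, the standard objective behind it can be written in one line. The notation here is an assumption and does not come from the abstract: $x$ denotes observed data, $z$ latent variables, $p_\theta(x \mid z)$ a likelihood that may be parameterized by a deep neural network, and $q_\phi(z)$ the variational approximation to the posterior.

```latex
\log p_\theta(x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z)}\!\left[ \log p_\theta(x \mid z) \right]
  - \mathrm{KL}\!\left( q_\phi(z) \,\|\, p(z) \right)
  \;=\; \mathcal{L}(\theta, \phi)
```

The gap between the two sides is $\mathrm{KL}\!\left( q_\phi(z) \,\|\, p_\theta(z \mid x) \right)$, so maximizing the lower bound $\mathcal{L}$ over $\phi$ moves $q_\phi$ toward the true posterior. Because the bound is an expectation, it can be estimated from samples and mini-batches, which is what makes stochastic gradient descent applicable.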

FPGA based implementation of deep neural networks using on-chip memory only

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2016.7471828 ◽

2016 ◽

Cited By ~ 24

Author(s):

Jinhwan Park ◽

Wonyong Sung

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

On Chip
