Correlator convolutional neural networks as an interpretable architecture for image-like quantum matter data

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Cole Miles ◽  
Annabelle Bohrdt ◽  
Ruihan Wu ◽  
Christie Chiu ◽  
Muqing Xu ◽  
...  

Abstract Image-like data from quantum systems promises to offer greater insight into the physics of correlated quantum matter. However, the traditional framework of condensed matter physics lacks principled approaches for analyzing such data. Machine learning models are a powerful theoretical tool for analyzing image-like data, including many-body snapshots from quantum simulators. Recently, they have successfully distinguished between simulated snapshots that are indistinguishable at the level of one- and two-point correlation functions. Thus far, however, the complexity of these models has inhibited new physical insights from such approaches. Here, we develop a set of nonlinearities for use in a neural network architecture that discovers features in the data which are directly interpretable in terms of physical observables. Applied to simulated snapshots produced by two candidate theories approximating the doped Fermi-Hubbard model, we uncover that the key distinguishing features are fourth-order spin-charge correlators. Our approach lends itself well to the construction of simple, versatile, end-to-end interpretable architectures, thus paving the way for new physical insights from machine learning studies of experimental and numerical data.
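To make the idea of a correlator nonlinearity concrete, the following is a minimal NumPy sketch, not the authors' implementation: for a single filter f and a snapshot s, the second-order correlator map sums products of pairs of distinct sites weighted by the filter, and can be assembled from two ordinary convolutions.

```python
# Hedged sketch of a second-order correlator nonlinearity (toy data, random filter):
#   c2(x) = sum_{a != b} f_a f_b s(x+a) s(x+b) = (f * s)(x)^2 - (f^2 * s^2)(x),
# so the feature contains only products of pairs of distinct sites, i.e. two-point terms.
import numpy as np
from scipy.signal import correlate2d

def correlator_features(snapshot, filt):
    """Return the order-1 and order-2 correlator maps for one filter."""
    c1 = correlate2d(snapshot, filt, mode="valid")                # order-1: weighted sum
    c2 = c1**2 - correlate2d(snapshot**2, filt**2, mode="valid")  # subtract the a == b terms
    return c1, c2

rng = np.random.default_rng(0)
snapshot = rng.choice([-1.0, 1.0], size=(16, 16))  # toy spin snapshot
filt = rng.normal(size=(3, 3))                     # stand-in for a learnable filter
c1, c2 = correlator_features(snapshot, filt)
print(c1.shape, c2.shape)  # (14, 14) (14, 14)
```

Higher-order maps built in the same spirit are what allow learned features to be read off directly as n-point correlators.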

Author(s):  
Ian Convy ◽  
William Huggins ◽  
Haoran Liao ◽  
K Birgitta Whaley

Abstract Tensor networks have emerged as promising tools for machine learning, inspired by their widespread use as variational ansätze in quantum many-body physics. It is well known that the success of a given tensor network ansatz depends in part on how well it can reproduce the underlying entanglement structure of the target state, with different network designs favoring different scaling patterns. We demonstrate here how a related correlation analysis can be applied to tensor network machine learning, and explore whether classical data possess correlation scaling patterns similar to those found in quantum states, which might indicate the best network to use for a given dataset. We utilize mutual information as a measure of correlations in classical data, and show that it can serve as a lower bound on the entanglement needed for a probabilistic tensor network classifier. We then develop a logistic regression algorithm to estimate the mutual information between bipartitions of data features, and verify its accuracy on a set of Gaussian distributions designed to mimic different correlation patterns. Using this algorithm, we characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter. This quantum-inspired classical analysis offers insight into the design of tensor networks which are best suited for specific learning tasks.
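As a rough illustration of classifier-based mutual information estimation between two blocks of features, here is a hedged sketch, not the paper's exact algorithm: a logistic regression is trained to separate joint samples (a_i, b_i) from shuffled samples (a_i, b_j), and its log-odds approximate the log density ratio whose average over the joint distribution is the mutual information.

```python
# Hedged sketch: estimate MI between feature blocks A and B with a probabilistic classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_mi(A, B, seed=0):
    rng = np.random.default_rng(seed)
    joint = np.hstack([A, B])                             # samples from p(a, b)
    product = np.hstack([A, B[rng.permutation(len(B))]])  # samples from p(a) p(b)
    X = np.vstack([joint, product])
    y = np.concatenate([np.ones(len(joint)), np.zeros(len(product))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    log_odds = clf.decision_function(joint)               # ~ log p(a,b) / (p(a) p(b))
    return log_odds.mean()                                # MI estimate in nats

# Toy example: two correlated Gaussian feature blocks.
rng = np.random.default_rng(1)
z = rng.normal(size=(5000, 2))
A = z + 0.1 * rng.normal(size=z.shape)
B = z + 0.1 * rng.normal(size=z.shape)
print(f"estimated MI ~ {estimate_mi(A, B):.2f} nats")
```

A linear classifier only gives a crude (lower-biased) estimate; richer models tighten it at the cost of variance.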


2021 ◽  
Author(s):  
Martin Mundt

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods that learn a plethora of parameters are now used in favor of the previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but can rather focus on gathering large amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets and a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and finally introduce methods to address the resulting immediate weaknesses. In the author's view, these prevalent shortcomings can be tied to the fact that the steps of the machine learning workflow are frequently decoupled. Success is predominantly measured with accuracy metrics designed for evaluation on static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in the context of a particular application.

Correspondingly, the dissertation identifies three key challenges: 1. choice and flexibility of a neural network architecture; 2. identification and rejection of unseen, unknown data to avoid false predictions; 3. continual learning without forgetting already learned information. These challenges were already crucial topics in older literature, yet they seem to require a renaissance in modern deep learning research. At first they may appear to pose independent research questions; however, the thesis posits that these aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context and which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. The central emphasis of this dissertation is thus to build on existing deep learning strengths while acknowledging the mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms.

For this purpose, the main portion of the thesis is in cumulative form, and the respective publications can be grouped according to the three challenges outlined above. Chapter 1 focuses on the choice and extendability of neural network architectures, analyzed in the context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 addresses the complementary questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. Finally, chapter 3 culminates in an overarching view in which the developed parts are connected. Here, an extensive survey further serves to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the thesis' overall contribution to advancing neural network based machine learning towards a unified solution that ties together the choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.


The applications of a content-based image retrieval system in fields such as multimedia, security, medicine, and entertainment have been implemented on a huge real-time database by using a convolutional neural network architecture. Thus far, content-based image retrieval systems have generally been implemented with conventional machine learning algorithms. Such algorithms are applicable only to limited databases because they use few feature-extraction hidden layers between the input and output layers. The proposed convolutional neural network architecture was successfully implemented using 128 convolutional layers, pooling layers, rectified linear units (ReLU), and fully connected layers. A convolutional neural network architecture yields better results because of its ability to extract features from an image. The Euclidean distance metric is used for calculating the similarity between the query image and the database images. The system is implemented on the COREL database and successfully evaluated using precision, recall, and F-score.
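The retrieval step described above amounts to nearest-neighbor search in feature space. A hedged sketch follows, not the authors' code: the CNN feature extractor is assumed to exist elsewhere, so random vectors stand in for the extracted features.

```python
# Hedged sketch: rank database images by Euclidean distance to a query feature vector
# and score the top-k results with precision and recall.
import numpy as np

def retrieve(query_feat, db_feats, k=10):
    d = np.linalg.norm(db_feats - query_feat, axis=1)  # Euclidean distance to every image
    return np.argsort(d)[:k]                           # indices of the k nearest images

def precision_recall(retrieved, relevant):
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved), hits / len(relevant)

rng = np.random.default_rng(0)
db_feats = rng.normal(size=(1000, 128))               # stand-in CNN features for 1000 images
query_feat = db_feats[3] + 0.01 * rng.normal(size=128)
top = retrieve(query_feat, db_feats, k=10)
p, r = precision_recall(top, relevant=range(100))     # assume images 0-99 share the query's class
print(f"precision={p:.2f} recall={r:.2f}")
```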


1994 ◽  
Vol 02 (03) ◽  
pp. 335-356 ◽  
Author(s):  
G. LORIES ◽  
A. AUBRUN ◽  
X. SERON

McCloskey and Lindemann [32] provide a simulation of brain damage on a neural network architecture and offer evidence that different lesions to the same network can lead to different error distributions. We briefly review the various kinds of networks that have been proposed to simulate arithmetical fact retrieval phenomena, and we present a simple network designed to make some computational constraints apparent. Additionally, we replicate McCloskey and Lindemann's [32] simulation by training five different artificial "subjects" and inflicting various types of damage upon each. Examination of the behaviour of our version of the network after different amounts of damage to its various connection blocks confirms that the error pattern may vary. These variations in the error patterns can be analyzed, and the data may help to clarify the functioning of the network and give insight into the reasons why it produces several effects observed in human behaviour.
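The lesioning procedure itself is simple to illustrate. Below is a hedged sketch, not the original simulation: a small network is trained, a random fraction of the weights in one connection block is zeroed, and per-class error counts are compared before and after. A toy classification problem stands in for the arithmetic-facts task.

```python
# Hedged sketch of lesioning one connection block of a trained network.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_classes=4,
                           n_informative=10, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

def lesion(net, block, fraction, seed=0):
    """Zero a random fraction of the weights in connection block `block`."""
    rng = np.random.default_rng(seed)
    W = net.coefs_[block]
    W[rng.random(W.shape) < fraction] = 0.0

def error_pattern(net, X, y):
    pred = net.predict(X)
    return np.bincount(y[pred != y], minlength=4)  # errors broken down by true class

print("before lesion:", error_pattern(net, X, y))
lesion(net, block=0, fraction=0.3)                 # damage input-to-hidden connections
print("after  lesion:", error_pattern(net, X, y))
```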


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep Neural Networks are state-of-the-art in a large number of challenges in machine learning. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsampling, and shortcut connections ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with fewer than 40 k parameters in total, 74.3% on CIFAR-100 with fewer than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
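A hedged PyTorch sketch of the recursive-single-convolution idea follows; it is not the authors' ThriftyNet code, and the channel count, iteration count, and downsampling schedule are illustrative assumptions. One convolution is reused at every iteration, combined with normalization, a non-linearity, a residual shortcut, and occasional pooling, so almost all parameters sit in that single shared layer.

```python
# Hedged sketch: a tiny network that reuses one convolutional layer recursively.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRecurrentConvNet(nn.Module):
    def __init__(self, channels=64, n_iter=12, n_classes=10, downsample_every=4):
        super().__init__()
        self.embed = nn.Conv2d(3, channels, 3, padding=1)          # lift RGB to `channels`
        self.shared = nn.Conv2d(channels, channels, 3, padding=1)  # the one reused convolution
        self.norms = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(n_iter))
        self.head = nn.Linear(channels, n_classes)
        self.n_iter, self.downsample_every = n_iter, downsample_every

    def forward(self, x):
        h = self.embed(x)
        for t in range(self.n_iter):
            h = h + F.relu(self.norms[t](self.shared(h)))          # recursive use + shortcut
            if (t + 1) % self.downsample_every == 0:
                h = F.max_pool2d(h, 2)
        return self.head(h.mean(dim=(2, 3)))                       # global average pooling

model = TinyRecurrentConvNet()
print(sum(p.numel() for p in model.parameters()))                  # total parameter count
print(model(torch.randn(2, 3, 32, 32)).shape)                      # torch.Size([2, 10])
```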


Author(s):  
James M Dawson ◽  
Timothy A Davis ◽  
Edward L Gomez ◽  
Justus Schock

Abstract In the upcoming decades, large facilities such as the SKA will provide resolved observations of the kinematics of millions of galaxies. In order to assist in the timely exploitation of these vast datasets, we explore the use of a self-supervised, physics-aware neural network capable of Bayesian kinematic modelling of galaxies. We demonstrate the network's ability to model the kinematics of cold gas in galaxies with an emphasis on recovering physical parameters and accompanying modelling errors. The model is able to recover rotation curves, inclinations, and disc scale lengths for both CO and H I data, which match well with those found in the literature. The model is also able to provide modelling errors over learned parameters thanks to the application of quasi-Bayesian Monte-Carlo dropout. This work shows the promising use of machine learning, and in particular self-supervised neural networks, in the context of kinematically modelling galaxies. It represents the first steps in applying such models to kinematic fitting, and we propose that variants of our model would be especially suitable for enabling emission-line science from upcoming surveys with e.g. the SKA, allowing fast exploitation of these large datasets.
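The modelling errors mentioned above come from Monte-Carlo dropout. A hedged PyTorch sketch of that mechanism is shown below; it is not the authors' network, and the feature size and parameter names are illustrative assumptions. Dropout is kept active at inference time and the network is sampled repeatedly, so the spread of the outputs acts as an uncertainty estimate on the predicted parameters.

```python
# Hedged sketch of Monte-Carlo dropout for parameter uncertainty.
import torch
import torch.nn as nn

class KinematicRegressor(nn.Module):
    def __init__(self, n_in=256, n_params=3):  # e.g. inclination, scale length, v_max
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, 128), nn.ReLU(), nn.Dropout(p=0.2),
            nn.Linear(128, 128), nn.ReLU(), nn.Dropout(p=0.2),
            nn.Linear(128, n_params),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples=100):
    model.train()  # keep dropout active during inference
    with torch.no_grad():
        draws = torch.stack([model(x) for _ in range(n_samples)])
    return draws.mean(0), draws.std(0)  # parameter estimate and its modelling error

model = KinematicRegressor()
features = torch.randn(1, 256)          # stand-in for an encoded data cube
mean, err = mc_dropout_predict(model, features)
print(mean, err)
```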


2021 ◽  
Author(s):  
Patrick Obin Sturm ◽  
Anthony S. Wexler

Abstract. Models of atmospheric phenomena provide insight into climate, air quality, and meteorology, and offer a mechanism for understanding the effect of future emissions scenarios. To accurately represent atmospheric phenomena, these models consume vast quantities of computational resources. Machine learning (ML) techniques such as neural networks have the potential to emulate compute-intensive components of these models and so reduce their computational burden. However, such ML surrogate models may lead to nonphysical predictions that are difficult to uncover. Here we present a neural network architecture that enforces conservation laws. Instead of simply predicting properties of interest, a physically interpretable hidden layer within the network predicts fluxes between properties, which are subsequently related to the properties of interest. As an example, we design a physics-constrained neural network surrogate model of photochemistry using this approach and find that it conserves atoms as they flow between molecules to machine precision, while outperforming a naïve neural network in terms of accuracy and non-negativity of concentrations.
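The flux-based constraint can be illustrated with a small example. The sketch below is hedged and not the authors' surrogate: the network predicts reaction fluxes, and concentrations are updated through a fixed, atom-balanced stoichiometry matrix, so atom totals are conserved by construction. The toy chemistry (O, O2, O3) and all shapes are illustrative assumptions.

```python
# Hedged sketch: conservation enforced by predicting fluxes instead of concentrations.
import torch
import torch.nn as nn

# Toy chemistry: species O, O2, O3 (oxygen atoms per molecule: 1, 2, 3).
atoms_per_species = torch.tensor([1.0, 2.0, 3.0])
# Two reactions as columns: O + O2 -> O3 and O3 -> O + O2.
S = torch.tensor([[-1.0,  1.0],
                  [-1.0,  1.0],
                  [ 1.0, -1.0]])
assert torch.allclose(atoms_per_species @ S, torch.zeros(2))  # each column conserves O atoms

class FluxSurrogate(nn.Module):
    def __init__(self, n_species=3, n_reactions=2):
        super().__init__()
        self.flux_net = nn.Sequential(nn.Linear(n_species, 32), nn.ReLU(),
                                      nn.Linear(32, n_reactions))

    def forward(self, conc):
        flux = self.flux_net(conc)   # interpretable hidden layer: reaction fluxes
        return conc + flux @ S.T     # concentration update implied by the fluxes

model = FluxSurrogate()
c0 = torch.rand(4, 3)                # a batch of initial concentrations
c1 = model(c0)
print(torch.allclose(c0 @ atoms_per_species, c1 @ atoms_per_species, atol=1e-5))  # True
```

Because the atom vector lies in the left null space of the stoichiometry matrix, conservation holds regardless of what the network predicts.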


2019 ◽  
pp. 64-75
Author(s):  
A. A. Yarygin ◽  
B. H. Aytbaev ◽  
A. Yu. Kanyshev ◽  
E. A. Alekseeva

To make full use of scientific and engineering achievements in the field of bionic prostheses, a comfortable and natural human-prosthesis interface must be provided to the end user. In this article we examine ways and methods of analyzing the signal produced by the electromyographic activity of muscles and collected at the skin surface (sEMG). Such a signal is nonstationary and unstable by its nature and depends on various factors. sEMG-based interfaces currently face several unsolved problems, such as insufficient recognition accuracy and noticeable delay caused by signal recognition and processing. The article is dedicated to the application of deep machine learning to achieve reliable recognition of electromyographic signals. In the course of the research, hardware was developed to register muscle activity, and a data collection system and gesture recognition algorithms were designed as well. Good results were achieved using a convolutional neural network with two-dimensional input, since the data stream has an obvious translational structure. In the future, modifications of the neural network architecture and learning algorithms, as well as experiments with the structure of the data, are planned.
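A hedged PyTorch sketch of the kind of 2D-input convolutional classifier described above follows; it is not the authors' model, and the electrode count, window length, and gesture count are illustrative assumptions. An sEMG window is arranged as an electrodes x time "image" and a small CNN predicts the gesture class.

```python
# Hedged sketch: small 2D CNN for gesture classification from sEMG windows.
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    def __init__(self, n_gestures=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32, n_gestures)

    def forward(self, x):                      # x: (batch, 1, electrodes, time samples)
        h = self.features(x).mean(dim=(2, 3))  # global average pooling
        return self.classifier(h)

model = GestureCNN()
window = torch.randn(8, 1, 8, 200)   # 8 windows, 8 electrodes, 200 time samples
print(model(window).shape)           # torch.Size([8, 6])
```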


Author(s):  
Vincent Grari ◽  
Sylvain Lamprier ◽  
Marcin Detyniecki

The past few years have seen a dramatic rise of academic and societal interest in fair machine learning. While plenty of fair algorithms have been proposed recently to tackle this challenge for discrete variables, only a few ideas exist for continuous ones. The objective of this paper is to ensure some level of independence between the outputs of regression models and any given continuous sensitive variable. For this purpose, we use the Hirschfeld-Gebelein-Rényi (HGR) maximal correlation coefficient as a fairness metric. We propose to minimize the HGR coefficient directly with an adversarial neural network architecture. The idea is to predict the output Y while minimizing the ability of an adversarial neural network to find the transformations required to estimate the HGR coefficient. We empirically assess and compare our approach, demonstrating significant improvements over previously presented work in the field.
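A hedged PyTorch sketch of the adversarial setup follows; it is not the authors' code, and the network sizes, penalty weight, and toy data are illustrative assumptions. Two small adversary networks f and g search for transformations of the prediction and of the continuous sensitive attribute whose correlation approximates the HGR coefficient, and the regressor is penalized by that estimate.

```python
# Hedged sketch: adversarial estimation and minimization of an HGR-style penalty.
import torch
import torch.nn as nn

def mlp(d_in):  # tiny helper network
    return nn.Sequential(nn.Linear(d_in, 16), nn.ReLU(), nn.Linear(16, 1))

regressor, f_adv, g_adv = mlp(5), mlp(1), mlp(1)
opt_r = torch.optim.Adam(regressor.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(list(f_adv.parameters()) + list(g_adv.parameters()), lr=1e-3)

def hgr_estimate(y_hat, s):
    """Correlation of the standardized adversary outputs, a lower bound on HGR."""
    u, v = f_adv(y_hat), g_adv(s)
    u = (u - u.mean()) / (u.std() + 1e-8)
    v = (v - v.mean()) / (v.std() + 1e-8)
    return (u * v).mean()

x, s = torch.randn(256, 5), torch.randn(256, 1)
y = x.sum(dim=1, keepdim=True) + 0.5 * s            # toy target correlated with s
for step in range(200):
    # Adversary step: maximize the estimated correlation.
    opt_a.zero_grad()
    (-hgr_estimate(regressor(x).detach(), s)).backward()
    opt_a.step()
    # Regressor step: fit the target while keeping the estimated HGR small.
    opt_r.zero_grad()
    y_hat = regressor(x)
    loss = ((y_hat - y) ** 2).mean() + 1.0 * hgr_estimate(y_hat, s)
    loss.backward()
    opt_r.step()
```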

