Correlator convolutional neural networks as an interpretable architecture for image-like quantum matter data

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Cole Miles ◽  
Annabelle Bohrdt ◽  
Ruihan Wu ◽  
Christie Chiu ◽  
Muqing Xu ◽  
...  

Abstract Image-like data from quantum systems promises to offer greater insight into the physics of correlated quantum matter. However, the traditional framework of condensed matter physics lacks principled approaches for analyzing such data. Machine learning models are a powerful theoretical tool for analyzing image-like data, including many-body snapshots from quantum simulators. Recently, they have successfully distinguished between simulated snapshots that are indistinguishable at the level of one- and two-point correlation functions. Thus far, however, the complexity of these models has inhibited new physical insights from such approaches. Here, we develop a set of nonlinearities for use in a neural network architecture that discovers features in the data which are directly interpretable in terms of physical observables. Applied to simulated snapshots produced by two candidate theories approximating the doped Fermi-Hubbard model, we uncover that the key distinguishing features are fourth-order spin-charge correlators. Our approach lends itself well to the construction of simple, versatile, end-to-end interpretable architectures, thus paving the way for new physical insights from machine learning studies of experimental and numerical data.
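To make the idea of a correlator nonlinearity concrete, the following is a minimal NumPy sketch, not the authors' implementation: for a single filter f and a snapshot s, the second-order correlator map sums products of pairs of distinct sites weighted by the filter, and can be assembled from two ordinary convolutions.

```python
# Hedged sketch of a second-order correlator nonlinearity (toy data, random filter):
#   c2(x) = sum_{a != b} f_a f_b s(x+a) s(x+b) = (f * s)(x)^2 - (f^2 * s^2)(x),
# so the feature contains only products of pairs of distinct sites, i.e. two-point terms.
import numpy as np
from scipy.signal import correlate2d

def correlator_features(snapshot, filt):
    """Return the order-1 and order-2 correlator maps for one filter."""
    c1 = correlate2d(snapshot, filt, mode="valid")                # order-1: weighted sum
    c2 = c1**2 - correlate2d(snapshot**2, filt**2, mode="valid")  # subtract the a == b terms
    return c1, c2

rng = np.random.default_rng(0)
snapshot = rng.choice([-1.0, 1.0], size=(16, 16))  # toy spin snapshot
filt = rng.normal(size=(3, 3))                     # stand-in for a learnable filter
c1, c2 = correlator_features(snapshot, filt)
print(c1.shape, c2.shape)  # (14, 14) (14, 14)
```

Higher-order maps built in the same spirit are what allow learned features to be read off directly as n-point correlators.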

Author(s):  
Ian Convy ◽  
William Huggins ◽  
Haoran Liao ◽  
K Birgitta Whaley

Abstract Tensor networks have emerged as promising tools for machine learning, inspired by their widespread use as variational ansätze in quantum many-body physics. It is well known that the success of a given tensor network ansatz depends in part on how well it can reproduce the underlying entanglement structure of the target state, with different network designs favoring different scaling patterns. We demonstrate here how a related correlation analysis can be applied to tensor network machine learning, and explore whether classical data possess correlation scaling patterns similar to those found in quantum states, which might indicate the best network to use for a given dataset. We utilize mutual information as a measure of correlations in classical data, and show that it can serve as a lower bound on the entanglement needed for a probabilistic tensor network classifier. We then develop a logistic regression algorithm to estimate the mutual information between bipartitions of data features, and verify its accuracy on a set of Gaussian distributions designed to mimic different correlation patterns. Using this algorithm, we characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter. This quantum-inspired classical analysis offers insight into the design of tensor networks which are best suited for specific learning tasks.
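As a rough illustration of classifier-based mutual information estimation between two blocks of features, here is a hedged sketch, not the paper's exact algorithm: a logistic regression is trained to separate joint samples (a_i, b_i) from shuffled samples (a_i, b_j), and its log-odds approximate the log density ratio whose average over the joint distribution is the mutual information.

```python
# Hedged sketch: estimate MI between feature blocks A and B with a probabilistic classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_mi(A, B, seed=0):
    rng = np.random.default_rng(seed)
    joint = np.hstack([A, B])                             # samples from p(a, b)
    product = np.hstack([A, B[rng.permutation(len(B))]])  # samples from p(a) p(b)
    X = np.vstack([joint, product])
    y = np.concatenate([np.ones(len(joint)), np.zeros(len(product))])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    log_odds = clf.decision_function(joint)               # ~ log p(a,b) / (p(a) p(b))
    return log_odds.mean()                                # MI estimate in nats

# Toy example: two correlated Gaussian feature blocks.
rng = np.random.default_rng(1)
z = rng.normal(size=(5000, 2))
A = z + 0.1 * rng.normal(size=z.shape)
B = z + 0.1 * rng.normal(size=z.shape)
print(f"estimated MI ~ {estimate_mi(A, B):.2f} nats")
```

A linear classifier only gives a crude (lower-biased) estimate; richer models tighten it at the cost of variance.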


2021 ◽  
Author(s):  
Martin Mundt

Deep learning with neural networks seems to have largely replaced traditional design of computer vision systems. Automated methods that learn a plethora of parameters are now used in favor of the previously practiced selection of explicit mathematical operators for a specific task. The entailed promise is that practitioners no longer need to take care of every individual step, but can rather focus on gathering large amounts of data for neural network training. As a consequence, both a shift in mindset towards a focus on big datasets and a wave of conceivable applications based exclusively on deep learning can be observed. This PhD dissertation aims to uncover some of the only implicitly mentioned or overlooked deep learning aspects, highlight unmentioned assumptions, and finally introduce methods to address the resulting immediate weaknesses. In the author's view, these prevalent shortcomings can be tied to the fact that the steps of the machine learning workflow are frequently decoupled. Success is predominantly measured with accuracy metrics designed for evaluation on static benchmark test sets. Individual machine learning workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in the context of a particular application.

Correspondingly, the dissertation identifies three key challenges: 1. choice and flexibility of a neural network architecture; 2. identification and rejection of unseen, unknown data to avoid false predictions; 3. continual learning without forgetting already learned information. These challenges were already crucial topics in older literature, yet they seem to require a renaissance in modern deep learning research. At first they may appear to pose independent research questions; however, the thesis posits that these aspects are intertwined and require a joint perspective in machine learning based systems. In summary, the essential question is how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context and which ones originate from potential other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. The central emphasis of this dissertation is thus to build on existing deep learning strengths while acknowledging the mentioned weaknesses, in an effort to establish a deeper understanding of interdependencies and synergies towards the development of unified solution mechanisms.

For this purpose, the main portion of the thesis is in cumulative form, and the respective publications can be grouped according to the three challenges outlined above. Chapter 1 focuses on the choice and extendability of neural network architectures, analyzed in the context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and first contrasted with static architectures found in the literature. The importance of neural architecture design is then further showcased on a real-world application of defect detection in concrete bridges. Chapter 2 addresses the complementary questions of how to identify unknown concepts and subsequently incorporate them into continual learning. A joint central mechanism to distinguish unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes, is proposed. Once more, the role of the chosen neural network architecture is quantitatively reassessed. Finally, chapter 3 culminates in an overarching view in which the developed parts are connected. Here, an extensive survey further serves to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach thus reflects the thesis' overall contribution to advancing neural network based machine learning towards a unified solution that ties together the choice of neural architecture with the ability to learn continually and the capability to automatically separate known from unknown data.


The applications of a content-based image retrieval system in fields such as multimedia, security, medicine, and entertainment have been implemented on a huge real-time database by using a convolutional neural network architecture. Thus far, content-based image retrieval systems have generally been implemented with conventional machine learning algorithms. Such algorithms are applicable only to limited databases because they use few feature-extraction hidden layers between the input and output layers. The proposed convolutional neural network architecture was successfully implemented using 128 convolutional layers, pooling layers, rectified linear units (ReLU), and fully connected layers. A convolutional neural network architecture yields better results because of its ability to extract features from an image. The Euclidean distance metric is used for calculating the similarity between the query image and the database images. The system is implemented on the COREL database and successfully evaluated using precision, recall, and F-score.
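The retrieval step described above amounts to nearest-neighbor search in feature space. A hedged sketch follows, not the authors' code: the CNN feature extractor is assumed to exist elsewhere, so random vectors stand in for the extracted features.

```python
# Hedged sketch: rank database images by Euclidean distance to a query feature vector
# and score the top-k results with precision and recall.
import numpy as np

def retrieve(query_feat, db_feats, k=10):
    d = np.linalg.norm(db_feats - query_feat, axis=1)  # Euclidean distance to every image
    return np.argsort(d)[:k]                           # indices of the k nearest images

def precision_recall(retrieved, relevant):
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved), hits / len(relevant)

rng = np.random.default_rng(0)
db_feats = rng.normal(size=(1000, 128))               # stand-in CNN features for 1000 images
query_feat = db_feats[3] + 0.01 * rng.normal(size=128)
top = retrieve(query_feat, db_feats, k=10)
p, r = precision_recall(top, relevant=range(100))     # assume images 0-99 share the query's class
print(f"precision={p:.2f} recall={r:.2f}")
```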


1994 ◽  
Vol 02 (03) ◽  
pp. 335-356 ◽  
Author(s):  
G. LORIES ◽  
A. AUBRUN ◽  
X. SERON

McCloskey and Lindemann [32] provide a simulation of brain damage on a neural network architecture and offer evidence that different lesions to the same network can lead to different error distributions. We briefly review the various kinds of networks that have been proposed to simulate arithmetical fact retrieval phenomena, and we present a simple network designed to make some computational constraints apparent. Additionally, we replicate McCloskey and Lindemann's [32] simulation by training five different artificial "subjects" and inflicting various types of damage upon each. Examination of the behaviour of our version of the network after different amounts of damage to its various connection blocks confirms that the error pattern may vary. These variations in the error patterns can be analyzed, and the data may help to clarify the functioning of the network and give insight into the reasons why it produces several effects observed in human behaviour.
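The lesioning procedure itself is simple to illustrate. Below is a hedged sketch, not the original simulation: a small network is trained, a random fraction of the weights in one connection block is zeroed, and per-class error counts are compared before and after. A toy classification problem stands in for the arithmetic-facts task.

```python
# Hedged sketch of lesioning one connection block of a trained network.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_classes=4,
                           n_informative=10, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

def lesion(net, block, fraction, seed=0):
    """Zero a random fraction of the weights in connection block `block`."""
    rng = np.random.default_rng(seed)
    W = net.coefs_[block]
    W[rng.random(W.shape) < fraction] = 0.0

def error_pattern(net, X, y):
    pred = net.predict(X)
    return np.bincount(y[pred != y], minlength=4)  # errors broken down by true class

print("before lesion:", error_pattern(net, X, y))
lesion(net, block=0, fraction=0.3)                 # damage input-to-hidden connections
print("after  lesion:", error_pattern(net, X, y))
```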


IoT ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 222-235
Author(s):  
Guillaume Coiffier ◽  
Ghouthi Boukli Hacene ◽  
Vincent Gripon

Deep Neural Networks are state-of-the-art in a large number of challenges in machine learning. However, to reach the best performance they require a huge pool of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, whereas the spatial resolution of inputs is decreased through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its maximum, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In complement, normalization, non-linearities, downsampling, and shortcut connections ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with fewer than 40 k parameters in total, 74.3% on CIFAR-100 with fewer than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
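A hedged PyTorch sketch of the recursive-single-convolution idea follows; it is not the authors' ThriftyNet code, and the channel count, iteration count, and downsampling schedule are illustrative assumptions. One convolution is reused at every iteration, combined with normalization, a non-linearity, a residual shortcut, and occasional pooling, so almost all parameters sit in that single shared layer.

```python
# Hedged sketch: a tiny network that reuses one convolutional layer recursively.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRecurrentConvNet(nn.Module):
    def __init__(self, channels=64, n_iter=12, n_classes=10, downsample_every=4):
        super().__init__()
        self.embed = nn.Conv2d(3, channels, 3, padding=1)          # lift RGB to `channels`
        self.shared = nn.Conv2d(channels, channels, 3, padding=1)  # the one reused convolution
        self.norms = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(n_iter))
        self.head = nn.Linear(channels, n_classes)
        self.n_iter, self.downsample_every = n_iter, downsample_every

    def forward(self, x):
        h = self.embed(x)
        for t in range(self.n_iter):
            h = h + F.relu(self.norms[t](self.shared(h)))          # recursive use + shortcut
            if (t + 1) % self.downsample_every == 0:
                h = F.max_pool2d(h, 2)
        return self.head(h.mean(dim=(2, 3)))                       # global average pooling

model = TinyRecurrentConvNet()
print(sum(p.numel() for p in model.parameters()))                  # total parameter count
print(model(torch.randn(2, 3, 32, 32)).shape)                      # torch.Size([2, 10])
```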


Author(s):  
James M Dawson ◽  
Timothy A Davis ◽  
Edward L Gomez ◽  
Justus Schock

Abstract In the upcoming decades, large facilities such as the SKA will provide resolved observations of the kinematics of millions of galaxies. In order to assist in the timely exploitation of these vast datasets, we explore the use of a self-supervised, physics-aware neural network capable of Bayesian kinematic modelling of galaxies. We demonstrate the network's ability to model the kinematics of cold gas in galaxies with an emphasis on recovering physical parameters and accompanying modelling errors. The model is able to recover rotation curves, inclinations, and disc scale lengths for both CO and H I data, which match well with those found in the literature. The model is also able to provide modelling errors over learned parameters thanks to the application of quasi-Bayesian Monte-Carlo dropout. This work shows the promising use of machine learning, and in particular self-supervised neural networks, in the context of kinematically modelling galaxies. It represents the first steps in applying such models to kinematic fitting, and we propose that variants of our model would be especially suitable for enabling emission-line science from upcoming surveys with e.g. the SKA, allowing fast exploitation of these large datasets.
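The modelling errors mentioned above come from Monte-Carlo dropout. A hedged PyTorch sketch of that mechanism is shown below; it is not the authors' network, and the feature size and parameter names are illustrative assumptions. Dropout is kept active at inference time and the network is sampled repeatedly, so the spread of the outputs acts as an uncertainty estimate on the predicted parameters.

```python
# Hedged sketch of Monte-Carlo dropout for parameter uncertainty.
import torch
import torch.nn as nn

class KinematicRegressor(nn.Module):
    def __init__(self, n_in=256, n_params=3):  # e.g. inclination, scale length, v_max
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_in, 128), nn.ReLU(), nn.Dropout(p=0.2),
            nn.Linear(128, 128), nn.ReLU(), nn.Dropout(p=0.2),
            nn.Linear(128, n_params),
        )

    def forward(self, x):
        return self.net(x)

def mc_dropout_predict(model, x, n_samples=100):
    model.train()  # keep dropout active during inference
    with torch.no_grad():
        draws = torch.stack([model(x) for _ in range(n_samples)])
    return draws.mean(0), draws.std(0)  # parameter estimate and its modelling error

model = KinematicRegressor()
features = torch.randn(1, 256)          # stand-in for an encoded data cube
mean, err = mc_dropout_predict(model, features)
print(mean, err)
```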


2021 ◽  
Author(s):  
Patrick Obin Sturm ◽  
Anthony S. Wexler

Abstract. Models of atmospheric phenomena provide insight into climate, air quality, and meteorology, and offer a mechanism for understanding the effect of future emissions scenarios. To accurately represent atmospheric phenomena, these models consume vast quantities of computational resources. Machine learning (ML) techniques such as neural networks have the potential to emulate compute-intensive components of these models and so reduce their computational burden. However, such ML surrogate models may lead to nonphysical predictions that are difficult to uncover. Here we present a neural network architecture that enforces conservation laws. Instead of simply predicting properties of interest, a physically interpretable hidden layer within the network predicts fluxes between properties, which are subsequently related to the properties of interest. As an example, we design a physics-constrained neural network surrogate model of photochemistry using this approach and find that it conserves atoms as they flow between molecules to machine precision, while outperforming a naïve neural network in terms of accuracy and non-negativity of concentrations.
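The flux-based constraint can be illustrated with a small example. The sketch below is hedged and not the authors' surrogate: the network predicts reaction fluxes, and concentrations are updated through a fixed, atom-balanced stoichiometry matrix, so atom totals are conserved by construction. The toy chemistry (O, O2, O3) and all shapes are illustrative assumptions.

```python
# Hedged sketch: conservation enforced by predicting fluxes instead of concentrations.
import torch
import torch.nn as nn

# Toy chemistry: species O, O2, O3 (oxygen atoms per molecule: 1, 2, 3).
atoms_per_species = torch.tensor([1.0, 2.0, 3.0])
# Two reactions as columns: O + O2 -> O3 and O3 -> O + O2.
S = torch.tensor([[-1.0,  1.0],
                  [-1.0,  1.0],
                  [ 1.0, -1.0]])
assert torch.allclose(atoms_per_species @ S, torch.zeros(2))  # each column conserves O atoms

class FluxSurrogate(nn.Module):
    def __init__(self, n_species=3, n_reactions=2):
        super().__init__()
        self.flux_net = nn.Sequential(nn.Linear(n_species, 32), nn.ReLU(),
                                      nn.Linear(32, n_reactions))

    def forward(self, conc):
        flux = self.flux_net(conc)   # interpretable hidden layer: reaction fluxes
        return conc + flux @ S.T     # concentration update implied by the fluxes

model = FluxSurrogate()
c0 = torch.rand(4, 3)                # a batch of initial concentrations
c1 = model(c0)
print(torch.allclose(c0 @ atoms_per_species, c1 @ atoms_per_species, atol=1e-5))  # True
```

Because the atom vector lies in the left null space of the stoichiometry matrix, conservation holds regardless of what the network predicts.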


2019 ◽  
pp. 64-75
Author(s):  
A. A. Yarygin ◽  
B. H. Aytbaev ◽  
A. Yu. Kanyshev ◽  
E. A. Alekseeva

To make full use of scientific and engineering achievements in the field of bionic prostheses, a comfortable and natural human-prosthesis interface must be provided to the end user. In this article we examine ways and methods of analyzing the signal produced by the electromyographic activity of muscles and collected at the skin surface (sEMG). Such a signal is nonstationary and unstable by its nature and depends on various factors. sEMG-based interfaces currently face several unsolved problems, such as insufficient recognition accuracy and noticeable delay caused by signal recognition and processing. The article is dedicated to the application of deep machine learning to achieve reliable recognition of electromyographic signals. In the course of the research, hardware was developed to register muscle activity, and a data collection system and gesture recognition algorithms were designed as well. Good results were achieved using a convolutional neural network with two-dimensional input, since the data stream has an obvious translational structure. In the future, modifications of the neural network architecture and learning algorithms, as well as experiments with the structure of the data, are planned.
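A hedged PyTorch sketch of the kind of 2D-input convolutional classifier described above follows; it is not the authors' model, and the electrode count, window length, and gesture count are illustrative assumptions. An sEMG window is arranged as an electrodes x time "image" and a small CNN predicts the gesture class.

```python
# Hedged sketch: small 2D CNN for gesture classification from sEMG windows.
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    def __init__(self, n_gestures=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32, n_gestures)

    def forward(self, x):                      # x: (batch, 1, electrodes, time samples)
        h = self.features(x).mean(dim=(2, 3))  # global average pooling
        return self.classifier(h)

model = GestureCNN()
window = torch.randn(8, 1, 8, 200)   # 8 windows, 8 electrodes, 200 time samples
print(model(window).shape)           # torch.Size([8, 6])
```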


Author(s):  
Vincent Grari ◽  
Sylvain Lamprier ◽  
Marcin Detyniecki

The past few years have seen a dramatic rise of academic and societal interest in fair machine learning. While plenty of fair algorithms have been proposed recently to tackle this challenge for discrete variables, only a few ideas exist for continuous ones. The objective of this paper is to ensure some level of independence between the outputs of regression models and any given continuous sensitive variable. For this purpose, we use the Hirschfeld-Gebelein-Rényi (HGR) maximal correlation coefficient as a fairness metric. We propose to minimize the HGR coefficient directly with an adversarial neural network architecture. The idea is to predict the output Y while minimizing the ability of an adversarial neural network to find the transformations required to estimate the HGR coefficient. We empirically assess and compare our approach, demonstrating significant improvements over previously presented work in the field.
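A hedged PyTorch sketch of the adversarial setup follows; it is not the authors' code, and the network sizes, penalty weight, and toy data are illustrative assumptions. Two small adversary networks f and g search for transformations of the prediction and of the continuous sensitive attribute whose correlation approximates the HGR coefficient, and the regressor is penalized by that estimate.

```python
# Hedged sketch: adversarial estimation and minimization of an HGR-style penalty.
import torch
import torch.nn as nn

def mlp(d_in):  # tiny helper network
    return nn.Sequential(nn.Linear(d_in, 16), nn.ReLU(), nn.Linear(16, 1))

regressor, f_adv, g_adv = mlp(5), mlp(1), mlp(1)
opt_r = torch.optim.Adam(regressor.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(list(f_adv.parameters()) + list(g_adv.parameters()), lr=1e-3)

def hgr_estimate(y_hat, s):
    """Correlation of the standardized adversary outputs, a lower bound on HGR."""
    u, v = f_adv(y_hat), g_adv(s)
    u = (u - u.mean()) / (u.std() + 1e-8)
    v = (v - v.mean()) / (v.std() + 1e-8)
    return (u * v).mean()

x, s = torch.randn(256, 5), torch.randn(256, 1)
y = x.sum(dim=1, keepdim=True) + 0.5 * s            # toy target correlated with s
for step in range(200):
    # Adversary step: maximize the estimated correlation.
    opt_a.zero_grad()
    (-hgr_estimate(regressor(x).detach(), s)).backward()
    opt_a.step()
    # Regressor step: fit the target while keeping the estimated HGR small.
    opt_r.zero_grad()
    y_hat = regressor(x)
    loss = ((y_hat - y) ** 2).mean() + 1.0 * hgr_estimate(y_hat, s)
    loss.backward()
    opt_r.step()
```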

