Machine Learning: Science and Technology
Latest Publications


Total documents: 252 (five years: 252)
H-index: 5 (five years: 5)
Published by IOP Publishing
ISSN: 2632-2153

Author(s):  
Maxim Ziatdinov ◽  
Ayana Ghosh ◽  
Sergei V Kalinin

Abstract Both experimental and computational methods for exploring the structure, functionality, and properties of materials often necessitate searches across broad parameter spaces to discover optimal experimental conditions and regions of interest in the image space or parameter space of computational models. Direct grid search of the parameter space tends to be extremely time-consuming, which has led to the development of strategies that balance exploration of unknown parameter spaces with exploitation towards required performance metrics. However, classical Bayesian optimization strategies based on the Gaussian process (GP) do not readily allow for the incorporation of known physical behaviors or past knowledge. Here we explore a hybrid optimization/exploration algorithm created by augmenting the standard GP with a structured probabilistic model of the system's expected behavior. This approach balances the flexibility of the non-parametric GP approach with the rigid structure of physical knowledge encoded into the parametric model. The fully Bayesian treatment of the latter allows additional control over the optimization via the selection of priors for the model parameters. The method is demonstrated on a noisy version of a classical objective function used to evaluate optimization algorithms, and further extended to physical lattice models. This methodology is expected to be universally suitable for injecting prior knowledge, in the form of physical models and past data, into the Bayesian optimization framework.
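The core idea of augmenting a GP with a structured parametric model can be sketched in a few lines: regress the GP on the residuals of a physical mean model, then add the model back at prediction time. Everything below (the quadratic "physics" model, the toy data, the kernel settings) is a hypothetical illustration; the fully Bayesian treatment of the model parameters described in the abstract is omitted.

```python
import numpy as np

def rbf_kernel(x1, x2, length=0.3):
    """Squared-exponential kernel between two 1D point sets."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def structured_gp_predict(X, y, Xs, mean_fn, noise=1e-4):
    """GP regression on the residuals y - m(x), with m(x*) added back."""
    r = y - mean_fn(X)                                 # deviation from physics
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    return mean_fn(Xs) + rbf_kernel(Xs, X) @ np.linalg.solve(K, r)

# hypothetical physical prior: a simple quadratic trend
mean_fn = lambda x: 0.5 * x ** 2
X = np.linspace(-2.0, 2.0, 41)
y = mean_fn(X) + 0.1 * np.sin(5.0 * X)                 # model + small deviation
Xs = np.array([0.0, 1.0])
pred = structured_gp_predict(X, y, Xs, mean_fn)
```

The GP only has to capture the (small, smooth) deviation from the parametric model, which is what makes the hybrid more data-efficient than a zero-mean GP on the raw observations.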


Author(s):  
Samuel Genheden ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

Abstract We expand on recent work on the clustering of synthetic routes and train a deep learning model to predict the distance between arbitrary routes. The model is based on a long short-term memory (LSTM) representation of a synthetic route and is trained as a twin network to reproduce the tree edit distance (TED) between two routes. The machine learning approach is approximately two orders of magnitude faster than the TED computation and enables clustering of many more routes from a retrosynthesis route prediction. The clusters have a high degree of similarity to those given by the TED-based approach and are accordingly intuitive and explainable. We provide the developed model as open source.
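The twin-network training objective can be sketched as follows: a shared encoder maps each route to a vector, and the parameters are fitted so that the Euclidean distance between two encodings matches a precomputed TED. The bag-of-tokens encoder below is a deliberately simple stand-in for the paper's LSTM, and the routes, vocabulary, and TED targets are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 8, 4          # hypothetical token vocabulary / embedding size

def encode(route, W):
    """Shared encoder: bag-of-tokens embedding (stand-in for the LSTM)."""
    counts = np.bincount(route, minlength=VOCAB).astype(float)
    return counts @ W

def twin_distance(a, b, W):
    return np.linalg.norm(encode(a, W) - encode(b, W))

routes = [[0, 1, 2], [0, 1, 3], [4, 5, 6]]
targets = {(0, 1): 1.0, (0, 2): 5.0, (1, 2): 5.0}  # hypothetical TED values

def loss(W):
    return sum((twin_distance(routes[i], routes[j], W) - t) ** 2
               for (i, j), t in targets.items())

W = rng.normal(size=(VOCAB, DIM)) * 0.1
initial = loss(W)
eps, lr = 1e-4, 0.01
for _ in range(400):                     # finite-difference gradient descent
    base = loss(W)
    G = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        Wp = W.copy()
        Wp[idx] += eps
        G[idx] = (loss(Wp) - base) / eps
    W -= lr * G
final = loss(W)
```

Once trained, `twin_distance` replaces the expensive TED call, which is where the two-orders-of-magnitude speed-up reported above comes from.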


Author(s):  
Stephen Burns Menary ◽  
Darren David Price

Abstract We show that density models describing multiple observables with (i) hard boundaries and (ii) dependence on external parameters may be created using an auto-regressive Gaussian mixture model. The model is designed to capture how observable spectra are deformed by hypothesis variations, and is made more expressive by projecting data onto a configurable latent space. It may be used as a statistical model for scientific discovery in interpreting experimental observations, for example when constraining the parameters of a physical model or tuning simulation parameters according to calibration data. The model may also be sampled for use within a Monte Carlo simulation chain, or used to estimate likelihood ratios for event classification. The method is demonstrated on simulated high-energy particle physics data considering the anomalous electroweak production of a $Z$ boson in association with a dijet system at the Large Hadron Collider, and the accuracy of inference is tested using a realistic toy example. The developed methods are domain agnostic; they may be used within any field to perform simulation or inference where a dataset consisting of many real-valued observables has conditional dependence on external parameters.
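The building block of such a model, a Gaussian mixture whose parameters depend on an external (hypothesis) parameter and on previously generated observables, can be illustrated with hand-written parameter functions. The dependence of the means on `c` and on `x1` below is a hypothetical stand-in for the network-predicted mixture parameters in the paper.

```python
import numpy as np

SQRT2PI = np.sqrt(2.0 * np.pi)

def gmm_pdf(x, means, weights, sigma=0.5):
    """Density of a 1D Gaussian mixture with shared component width."""
    comps = np.exp(-0.5 * ((x - means) / sigma) ** 2) / (sigma * SQRT2PI)
    return float(np.sum(weights * comps))

def joint_pdf(x1, x2, c):
    """Auto-regressive factorization p(x1, x2 | c) = p(x1 | c) p(x2 | x1, c)."""
    # p(x1 | c): component means shift with the external parameter c
    p1 = gmm_pdf(x1, np.array([-1.0 + c, 1.0 + c]), np.array([0.3, 0.7]))
    # p(x2 | x1, c): conditioning on the previously generated observable x1
    p2 = gmm_pdf(x2, np.array([0.5 * x1, 0.5 * x1 + c]), np.array([0.5, 0.5]))
    return p1 * p2

# sanity check: the first conditional integrates to one (numerical sum)
grid = np.linspace(-8.0, 8.0, 2001)
vals = np.array([gmm_pdf(x, np.array([-1.0, 1.0]), np.array([0.3, 0.7]))
                 for x in grid])
mass = float(np.sum(vals) * (grid[1] - grid[0]))
```

Because each factor is a proper density, the product can be both evaluated (for likelihood ratios) and sampled dimension by dimension (for use in a simulation chain), as described above.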


Author(s):  
Maximilian Paul Niroomand ◽  
Conor T Cafolla ◽  
John William Roger Morgan ◽  
David J Wales

Abstract One of the most common metrics for evaluating neural network classifiers is the area under the receiver operating characteristic curve (AUC). However, optimisation of the AUC as the loss function during network training is not a standard procedure. Here we compare minimising the cross-entropy (CE) loss with optimising the AUC directly. In particular, we analyse the loss function landscape (LFL) of approximate AUC (appAUC) loss functions to discover the organisation of this solution space. We discuss various surrogates for AUC approximation and show their differences. We find that the characteristics of the appAUC landscape are significantly different from those of the CE landscape. The approximate AUC loss function improves testing AUC, and the appAUC landscape has substantially more minima, but these minima are less robust, with larger average Hessian eigenvalues. We provide a theoretical foundation to explain these results. Finally, to generalise our results, we provide an overview of how the LFL can help guide loss function analysis and selection.
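Why the AUC needs a surrogate at all can be shown in a few lines: the exact AUC counts correctly ranked (positive, negative) score pairs, which is a step function and hence has zero gradient almost everywhere. A common smooth relaxation (one of several surrogates of the kind discussed above; the temperature `tau` and the toy scores are illustrative choices, not the paper's) replaces the step with a sigmoid:

```python
import numpy as np

def auc_exact(pos, neg):
    """Exact AUC: fraction of (positive, negative) pairs ranked correctly."""
    diffs = pos[:, None] - neg[None, :]
    return float(np.mean((diffs > 0) + 0.5 * (diffs == 0)))

def auc_sigmoid(pos, neg, tau=0.1):
    """Differentiable surrogate: the pairwise step relaxed to a sigmoid."""
    diffs = pos[:, None] - neg[None, :]
    return float(np.mean(1.0 / (1.0 + np.exp(-diffs / tau))))

pos = np.array([0.9, 0.8, 0.4])   # scores of positive examples
neg = np.array([0.3, 0.1])        # scores of negative examples
```

As `tau` shrinks the surrogate approaches the exact AUC, but its landscape also becomes steeper and less smooth, which is the kind of trade-off the landscape analysis above investigates.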


Author(s):  
Sergei Manzhos ◽  
Eita Sasaki ◽  
Manabu Ihara

Abstract We show that Gaussian process regression (GPR) allows multivariate functions to be represented with low-dimensional terms via kernel design. When using a kernel built with high-dimensional model representation (HDMR), one obtains a similar type of representation to the previously proposed HDMR-GPR scheme while being faster and simpler to use. We tested the approach on cases where highly accurate machine learning from sparse data is required, by fitting potential energy surfaces and kinetic energy densities.
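The kernel-design idea can be sketched concretely: a first-order HDMR representation (a sum of one-dimensional terms) corresponds to an additive kernel k(x, x') = Σ_d k_d(x_d, x'_d), so plain GPR with that kernel fits the low-dimensional decomposition automatically. The length scale, jitter, and additive target function below are illustrative choices, not values from the paper.

```python
import numpy as np

def additive_rbf(X1, X2, length=0.7):
    """First-order HDMR kernel: a sum of one-dimensional RBF kernels."""
    K = np.zeros((len(X1), len(X2)))
    for d in range(X1.shape[1]):
        diff = X1[:, d][:, None] - X2[:, d][None, :]
        K += np.exp(-0.5 * (diff / length) ** 2)
    return K

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(80, 2))
y = np.sin(2.0 * X[:, 0]) + X[:, 1] ** 2      # additive (HDMR-like) target
Xs = np.array([[0.5, -0.5]])

K = additive_rbf(X, X) + 1e-6 * np.eye(len(X))     # jitter for stability
pred = additive_rbf(Xs, X) @ np.linalg.solve(K, y)
```

Higher-order HDMR terms would simply add products of the one-dimensional kernels; the GPR machinery itself is unchanged, which is the "faster and simpler to use" point made above.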


Author(s):  
Siddharth Mishra-Sharma

Abstract Astrometry, the precise measurement of the positions and motions of celestial objects, has emerged as a promising avenue for characterizing the dark matter population in our Galaxy. By leveraging recent advances in simulation-based inference and neural network architectures, we introduce a novel method to search for global dark matter-induced gravitational lensing signatures in astrometric datasets. Our method, based on neural likelihood-ratio estimation, shows significantly enhanced sensitivity to a cold dark matter population and more favorable scaling with measurement noise than existing approaches based on two-point correlation statistics, establishing machine learning as a powerful tool for characterizing dark matter using astrometric data.
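The likelihood-ratio trick underlying this kind of simulation-based inference is simple to demonstrate: train a classifier s(x) to distinguish samples simulated under two hypotheses, and the ratio s(x)/(1 − s(x)) converges to the likelihood ratio p₁(x)/p₀(x). A logistic regression on 1D toy Gaussians stands in for the paper's neural network here; the data and training settings are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
x0 = rng.normal(0.0, 1.0, 4000)   # samples simulated under hypothesis 0
x1 = rng.normal(1.0, 1.0, 4000)   # samples simulated under hypothesis 1
X = np.concatenate([x0, x1])
t = np.concatenate([np.zeros(4000), np.ones(4000)])

# plain gradient descent on the logistic loss for a 1D linear classifier
w, b = 0.0, 0.0
for _ in range(3000):
    s = 1.0 / (1.0 + np.exp(-(w * X + b)))
    g = s - t
    w -= 0.05 * np.mean(g * X)
    b -= 0.05 * np.mean(g)

def likelihood_ratio(x):
    """Classifier-based estimate of p1(x) / p0(x)."""
    s = 1.0 / (1.0 + np.exp(-(w * x + b)))
    return s / (1.0 - s)
```

For these two unit-width Gaussians the exact log-ratio is x − 0.5, so the fitted weight and bias should land near 1 and −0.5; the same construction scales to high-dimensional astrometric maps once the linear model is replaced by a neural network.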


Author(s):  
Chenhua Geng ◽  
Hong-Ye Hu ◽  
Yijian Zou

Abstract Differentiable programming is a programming paradigm that enables large-scale optimization through the automatic calculation of gradients, also known as automatic differentiation. The concept emerged from deep learning and has since been generalized to tensor network optimization. Here, we extend differentiable programming to tensor networks with isometric constraints, with applications to the multiscale entanglement renormalization ansatz (MERA) and tensor network renormalization (TNR). By introducing several gradient-based optimization methods for isometric tensor networks and comparing them with the Evenbly-Vidal method, we show that automatic differentiation offers better stability and accuracy. We numerically test our methods on the 1D critical quantum Ising spin chain and the 2D classical Ising model. We calculate the ground state energy for the quantum model, the internal energy for the classical model, and the scaling dimensions of scaling operators, and find that they all agree well with theory.
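One standard way to handle the isometric constraint in gradient-based optimization, a retraction onto the isometry manifold via the SVD polar factor after each step, can be sketched on a toy problem. The symmetric matrix below is a hypothetical stand-in for a MERA/TNR environment, not the paper's setup.

```python
import numpy as np

def isometrize(W):
    """Nearest isometry (W^T W = I) via the SVD polar factor."""
    U, _, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(3)
A = rng.normal(size=(6, 6))
A = 0.5 * (A + A.T)                       # toy symmetric "Hamiltonian"
W = isometrize(rng.normal(size=(6, 3)))   # isometry truncating 6 -> 3

def energy(W):
    """Toy cost: tr(W^T A W), minimized by the lowest eigenvectors of A."""
    return float(np.trace(W.T @ A @ W))

e0 = energy(W)
for _ in range(500):                      # projected gradient descent
    grad = 2.0 * A @ W                    # gradient of tr(W^T A W)
    W = isometrize(W - 0.02 * grad)
e1 = energy(W)
```

The retraction keeps every iterate exactly isometric, which is the role the Evenbly-Vidal update plays in conventional MERA optimization; here the descent direction comes from the (auto-differentiable) cost function instead.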


Author(s):  
Samuel Yen-Chi Chen ◽  
Chih-Min Huang ◽  
Chia-Wei Hsing ◽  
Hsi-Sheng Goan ◽  
Ying-Jer Kao

Abstract Recent advances in classical reinforcement learning (RL) and quantum computation (QC) point to a promising direction of performing RL on a quantum computer. However, potential applications in quantum RL are limited by the number of qubits available in modern quantum devices. Here we present two frameworks for deep quantum RL tasks using gradient-free evolutionary optimization: first, we apply the amplitude encoding scheme to the Cart-Pole problem, where we demonstrate the quantum advantage of parameter saving using amplitude encoding; second, we propose a hybrid framework where the quantum RL agents are equipped with a hybrid tensor network-variational quantum circuit (TN-VQC) architecture to handle inputs with dimension exceeding the number of qubits. This allows us to perform quantum RL on the MiniGrid environment with 147-dimensional inputs. The hybrid TN-VQC architecture provides a natural way to perform efficient compression of the input dimension, enabling further quantum RL applications on noisy intermediate-scale quantum devices.
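The parameter saving of amplitude encoding follows from storing a d-dimensional input in the 2^n amplitudes of an n-qubit state, so only ⌈log₂ d⌉ qubits are needed (with zero-padding for non-power-of-two inputs). A minimal classical-simulation sketch, with an illustrative 4-dimensional observation such as Cart-Pole's:

```python
import numpy as np

def amplitude_encode(x):
    """Encode a length-2^n feature vector in the amplitudes of n qubits.

    Returns the normalized state vector and the qubit count n.
    """
    x = np.asarray(x, dtype=float)
    n = int(np.log2(len(x)))
    assert 2 ** n == len(x), "pad the feature vector to a power of two first"
    return x / np.linalg.norm(x), n

# hypothetical 4-dimensional observation -> 2 qubits
state, n_qubits = amplitude_encode([3.0, 1.0, 2.0, 1.0])
```

The exponential compression is what makes the scheme attractive on qubit-limited devices; preparing such a state on hardware, however, generally requires a deep circuit, which is part of why the hybrid TN-VQC route is used for the 147-dimensional MiniGrid inputs.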


Author(s):  
Ian Convy ◽  
William Huggins ◽  
Haoran Liao ◽  
K Birgitta Whaley

Abstract Tensor networks have emerged as promising tools for machine learning, inspired by their widespread use as variational ansätze in quantum many-body physics. It is well known that the success of a given tensor network ansatz depends in part on how well it can reproduce the underlying entanglement structure of the target state, with different network designs favoring different scaling patterns. We demonstrate here how a related correlation analysis can be applied to tensor network machine learning, and explore whether classical data possess correlation scaling patterns similar to those found in quantum states which might indicate the best network to use for a given dataset. We utilize mutual information as a measure of correlations in classical data, and show that it can serve as a lower bound on the entanglement needed for a probabilistic tensor network classifier. We then develop a logistic regression algorithm to estimate the mutual information between bipartitions of data features, and verify its accuracy on a set of Gaussian distributions designed to mimic different correlation patterns. Using this algorithm, we characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter. This quantum-inspired classical analysis offers insight into the design of tensor networks which are best suited for specific learning tasks.
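The quantity being bounded, the mutual information between two groups of features, is easy to pin down in the discrete case. The paper estimates it for continuous features via logistic regression; the plug-in estimator below is a simpler illustration of the same quantity on a known joint distribution (the 2×2 tables are hypothetical examples, not data from the paper).

```python
import numpy as np

def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution p(a, b)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    pa = joint.sum(axis=1, keepdims=True)   # marginal p(a)
    pb = joint.sum(axis=0, keepdims=True)   # marginal p(b)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa * pb)[mask])))

mi_indep = mutual_information([[0.25, 0.25], [0.25, 0.25]])  # independent
mi_corr = mutual_information([[0.5, 0.0], [0.0, 0.5]])       # fully correlated
```

For the independent table the MI is 0, and for the perfectly correlated one it is log 2; measuring how this quantity grows with the size of a feature bipartition (area-law vs. volume-law) is what distinguishes the candidate network geometries discussed above.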


Author(s):  
Diogo R. Ferreira ◽  
Tiago A. Martins ◽  
Paulo Rodrigues

Abstract In the nuclear fusion community, there are many specialized techniques to analyze the data coming from a variety of diagnostics. One such technique is the use of spectrograms to analyze the magnetohydrodynamic (MHD) behavior of fusion plasmas. Physicists look at the spectrogram to identify the oscillation modes of the plasma and to study instabilities that may lead to plasma disruptions. A major cause of disruptions occurs when an oscillation mode interacts with the wall, stops rotating, and becomes a locked mode. In this work, we use deep learning to predict the occurrence of locked modes from MHD spectrograms. In particular, we use a convolutional neural network (CNN) with class activation mapping (CAM) to pinpoint the exact behavior that the model considers responsible for the locked mode. Surprisingly, we find that, in general, the model's explanation agrees quite well with the physical interpretation of the behavior observed in the spectrogram.
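The CAM mechanism itself is a one-liner: when the CNN ends in global average pooling followed by a dense layer, the class score is a weighted sum of the final feature maps, so re-applying the same class weights spatially yields a localization map. The feature maps and weights below are random stand-ins for a trained network's, purely to show the arithmetic.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """CAM: weight each final feature map by its dense-layer class weight.

    feature_maps: (K, H, W) activations; class_weights: (K,) -> map (H, W).
    """
    return np.tensordot(class_weights, feature_maps, axes=1)

rng = np.random.default_rng(4)
fmaps = rng.random((3, 4, 5))          # hypothetical final-layer activations
wts = np.array([0.5, -0.2, 1.0])       # hypothetical class weights
cam = class_activation_map(fmaps, wts)
```

Because pooling and the weighted sum commute, the spatial mean of the CAM equals the class score before the softmax, which is what ties the highlighted time-frequency regions of the spectrogram directly to the locked-mode prediction.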

