Differentiable strong lensing: uniting gravity and neural nets through differentiable probabilistic programming

ABSTRACT Since upcoming telescopes will observe thousands of strong lensing systems, creating fully automated analysis pipelines for these images becomes increasingly important. In this work, we take a step in that direction by developing the first end-to-end differentiable strong lensing pipeline. Our approach leverages and combines three important computer science developments: (i) convolutional neural networks (CNNs), (ii) efficient gradient-based sampling techniques, and (iii) deep probabilistic programming languages. The latter automate parameter inference and enable the combination of generative deep neural networks and physics components in a single model. In the current work, we demonstrate that it is possible to combine a CNN trained on galaxy images as a source model with a fully differentiable and exact implementation of gravitational lensing physics in a single probabilistic model. This does away with hyperparameter tuning for the source model, enables the simultaneous optimization of nearly 100 source and lens parameters with gradient-based methods, and allows the use of efficient gradient-based posterior sampling techniques. These features make this automated inference pipeline potentially suitable for processing a large amount of data. By analysing mock lensing systems with different signal-to-noise ratios, we show that lensing parameters are reconstructed with per cent-level accuracy. More generally, we consider this work as one of the first steps in establishing differentiable probabilistic programming techniques in the particle astrophysics community, which have the potential to significantly accelerate and improve many complex data analysis tasks.

Learning quantized neural nets by coarse gradient method for nonlinear classification

Research in the Mathematical Sciences ◽

10.1007/s40687-021-00281-4 ◽

2021 ◽

Vol 8 (3) ◽

Author(s):

Ziang Long ◽

Penghang Yin ◽

Jack Xin

Keyword(s):

Neural Networks ◽

Loss Function ◽

Ad Hoc ◽

Gradient Methods ◽

Synthetic Data ◽

Neural Nets ◽

Theoretical Understanding ◽

Performance Guarantees ◽

Almost Everywhere ◽

Gradient Based

Abstract: Quantized or low-bit neural networks are attractive due to their inference efficiency. However, training deep neural networks with quantized activations involves minimizing a discontinuous and piecewise constant loss function. Such a loss function has zero gradient almost everywhere (a.e.), which makes the conventional gradient-based algorithms inapplicable. To this end, we study a novel class of biased first-order oracle, termed coarse gradient, for overcoming the vanished gradient issue. A coarse gradient is generated by replacing the a.e. zero derivative of quantized (i.e., staircase) ReLU activation composited in the chain rule with some heuristic proxy derivative called straight-through estimator (STE). Although having been widely used in training quantized networks empirically, fundamental questions like when and why the ad hoc STE trick works, still lack theoretical understanding. In this paper, we propose a class of STEs with certain monotonicity and consider their applications to the training of a two-linear-layer network with quantized activation functions for nonlinear multi-category classification. We establish performance guarantees for the proposed STEs by showing that the corresponding coarse gradient methods converge to the global minimum, which leads to a perfect classification. Lastly, we present experimental results on synthetic data as well as MNIST dataset to verify our theoretical findings and demonstrate the effectiveness of our proposed STEs.
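The coarse-gradient idea described in this abstract can be illustrated with a toy sketch. It is not the paper's two-linear-layer setting: it assumes a single weight, a hypothetical four-level staircase ReLU, and the derivative of the clipped ReLU as the proxy (one common STE choice, picked here only for illustration). All function names and constants are hypothetical.

```python
import math

LEVELS = 3  # assumed top quantization level (activation values 0, 1, 2, 3)


def quantized_relu(z):
    """Staircase ReLU: floor to an integer level, clipped to [0, LEVELS]."""
    return float(min(max(math.floor(z), 0), LEVELS))


def ste_derivative(z):
    """Proxy derivative (assumption): that of the clipped ReLU min(max(z, 0), LEVELS)."""
    return 1.0 if 0.0 < z < LEVELS else 0.0


def loss(w, x, y):
    """Squared error on a single sample; piecewise constant in w."""
    return 0.5 * (quantized_relu(w * x) - y) ** 2


def coarse_gradient(w, x, y):
    """Chain rule with the a.e. zero derivative replaced by the STE proxy."""
    z = w * x
    return (quantized_relu(z) - y) * ste_derivative(z) * x


def train(w, x, y, lr=0.1, steps=50):
    """Plain gradient descent driven by the coarse gradient."""
    for _ in range(steps):
        w -= lr * coarse_gradient(w, x, y)
    return w


x, y, w0 = 1.0, 3.0, 0.5
eps = 1e-4
# Central finite difference of the true loss: zero, because the loss is flat around w0.
fd_grad = (loss(w0 + eps, x, y) - loss(w0 - eps, x, y)) / (2 * eps)
w_final = train(w0, x, y)
print(fd_grad, loss(w0, x, y), loss(w_final, x, y))
```

In this sketch the finite-difference gradient carries no information, while the coarse gradient still moves the weight across the steps of the staircase until the quantized output matches the target.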

Improving Adversarial Attacks on Deep Neural Networks via Constricted Gradient-based Perturbations

Information Sciences ◽

10.1016/j.ins.2021.04.033 ◽

2021 ◽

Author(s):

Yatie Xiao ◽

Chi-Man Pun

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Gradient Based

Road Extraction from Unmanned Aerial Vehicle Remote Sensing Images Based on Improved Neural Networks

Sensors ◽

10.3390/s19194115 ◽

2019 ◽

Vol 19 (19) ◽

pp. 4115 ◽

Cited By ~ 1

Author(s):

Yuxia Li ◽

Bo Peng ◽

Lei He ◽

Kunlong Fan ◽

Zhenxu Li ◽

...

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Neural Networks ◽

Unmanned Aerial Vehicle ◽

Computational Efficiency ◽

Neural Nets ◽

Road Extraction ◽

Remote Sensing Images ◽

Feature Maps ◽

Aerial Vehicle

Roads are vital components of infrastructure, and their extraction has become a topic of significant interest in the field of remote sensing. Because deep learning has become a popular method in image processing and information extraction, researchers have paid more attention to extracting roads using neural networks. This article proposes improvements to neural networks for extracting roads from Unmanned Aerial Vehicle (UAV) remote sensing images. D-LinkNet was first considered for its high performance; however, the huge scale of the net reduced computational efficiency. With a focus on the low computational efficiency of the popular D-LinkNet, this article makes the following improvements: (1) Replace the initial block with a stem block. (2) Rebuild the entire network based on ResNet units with a new structure, allowing for the construction of an improved neural network, D-Linknetplus. (3) Add a 1 × 1 convolution layer before DBlock to reduce the input feature maps, reducing parameters and improving computational efficiency, and add another 1 × 1 convolution layer after DBlock to recover the required number of output channels. Accordingly, another improved neural network, B-D-LinknetPlus, was built. The neural nets were compared, and verification was performed with the Massachusetts Roads Dataset. The results show that the improved neural networks help reduce the network size and reach the precision needed for road extraction.
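The effect of wrapping a block in 1 × 1 convolutions can be checked with simple parameter arithmetic. The sketch below is illustrative only: the channel counts (512 reduced to 256), the four 3 × 3 layers inside the block, and the function names are assumptions, not figures taken from the article.

```python
def conv_params(kernel, c_in, c_out):
    """Weight count of a 2D convolution layer, ignoring biases."""
    return kernel * kernel * c_in * c_out


def block_params(channels, n_layers, kernel=3):
    """A stack of n_layers convolutions that keep the channel count fixed."""
    return n_layers * conv_params(kernel, channels, channels)


def bottleneck_block_params(channels, reduced, n_layers, kernel=3):
    """The same stack run on fewer channels, wrapped in two 1 x 1 convolutions."""
    reduce = conv_params(1, channels, reduced)     # 1 x 1 layer before the block
    body = block_params(reduced, n_layers, kernel)
    restore = conv_params(1, reduced, channels)    # 1 x 1 layer after the block
    return reduce + body + restore


# Hypothetical sizes, chosen only to make the arithmetic concrete.
plain = block_params(512, 4)
slim = bottleneck_block_params(512, 256, 4)
print(plain, slim, round(slim / plain, 3))
```

Under these assumed sizes the wrapped block needs roughly 28 % of the weights of the plain one, because the cost of a 3 × 3 layer scales with the product of input and output channels while the two added 1 × 1 layers are comparatively cheap.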

Comparing gradient based learning methods for optimizing predictive neural networks

2014 Recent Advances in Engineering and Computational Sciences (RAECS) ◽

10.1109/raecs.2014.6799573 ◽

2014 ◽

Cited By ~ 1

Author(s):

Dharminder Kumar ◽

Sangeeta Gupta ◽

Parveen Sehgal

Keyword(s):

Neural Networks ◽

Learning Methods ◽

Gradient Based

Building hydrological single-model ensembles using artificial neural networks and a combinatorial optimization approach

10.5194/egusphere-egu21-8256 ◽

2021 ◽

Author(s):

Juan F. Farfán-Durán ◽

Luis Cea

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Goodness Of Fit ◽

Hydrological Model ◽

Hill Climbing ◽

Single Model ◽

Pearson Coefficient ◽

Gradient Based ◽

Artificial Neural ◽

Model Ensembles

In recent years, the application of model ensembles has received increasing attention in the hydrological modelling community due to the interesting results reported in several studies carried out in different parts of the world. The main idea of these approaches is to combine the results of the same hydrological model or a number of different hydrological models in order to obtain more robust, better-fitting models, reducing at the same time the uncertainty in the predictions. The techniques for combining models range from simple approaches such as averaging different simulations, to more complex techniques such as least squares, genetic algorithms and, more recently, artificial intelligence techniques such as Artificial Neural Networks (ANN). Despite the good results that model ensembles are able to provide, the models selected to build the ensemble have a direct influence on the results. Contrary to intuition, it has been reported that the best-fitting single models do not necessarily produce the best ensemble. Instead, better results can be obtained with ensembles that incorporate models with moderate goodness of fit. This implies that the selection of the single models might include a random component in order to maximize the results that ensemble approaches can provide. The present study is carried out using hydrological data on an hourly scale between 2008 and 2015 corresponding to the Mandeo basin, located in the Northwest of Spain. In order to obtain 1000 single models, a hydrological model was run using 1000 sets of parameters sampled randomly in their feasible space. We then classified the models into 3 groups with the following characteristics: 1) the 25 single models with the highest Nash-Sutcliffe coefficient, 2) the 25 single models with the highest Pearson coefficient, and 3) the complete group of 1000 single models. The ensemble models are built with 5 models as the input of an ANN and the observed series as the output. Then, we applied the Random-Restart Hill-Climbing (RRHC) algorithm, choosing 5 random models in each iteration to re-train the ANN in order to identify a better ensemble. The algorithm is applied to build 50 ensembles in each group of models. Finally, the results are compared to those obtained by optimizing the model using a gradient-based method by means of the following goodness-of-fit measures: Nash-Sutcliffe (NSE) coefficient, adapted for high flows Nash-Sutcliffe (HF−NSE), adapted for low flows Nash-Sutcliffe (LF−W NSE) and coefficient of determination (R²). The results show that the RRHC algorithm can identify adequate ensembles. The ensembles built using the group of models selected based on the NSE outperformed the model optimized by the gradient method in 64 % of the cases in at least 3 of 4 coefficients, both in the calibration and validation stages. They were followed by the ensembles built with the group of models selected based on the Pearson coefficient, with 56 %. In the case of the third group, no ensembles were identified that outperformed the gradient-based method. However, most of the ensembles outperformed the 1000 individual models. Keywords: Multi-model ensemble; Single-model ensemble; Artificial Neural Networks; Hydrological Model; Random-restart Hill-climbing
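A random-restart hill-climbing search over 5-member ensembles can be sketched as follows. This is a simplified stand-in, not the study's procedure: the ANN combiner is replaced by a plain member-wise mean, neighbours are generated by swapping a single member, and the data are synthetic. All names, sizes, and seeds are hypothetical.

```python
import random


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 - squared error / variance of the observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for s, o in zip(sim, obs))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def ensemble_score(subset, pool, obs):
    """NSE of the member-wise mean of the subset (stand-in for the trained ANN)."""
    mean_sim = [sum(pool[i][t] for i in subset) / len(subset) for t in range(len(obs))]
    return nse(mean_sim, obs)


def hill_climb(start, pool, obs):
    """Swap one member at a time for as long as some swap improves the score."""
    current, best = list(start), ensemble_score(start, pool, obs)
    improved = True
    while improved:
        improved = False
        for pos in range(len(current)):
            for cand in range(len(pool)):
                if cand in current:
                    continue
                trial = current[:pos] + [cand] + current[pos + 1:]
                score = ensemble_score(trial, pool, obs)
                if score > best:
                    current, best, improved = trial, score, True
    return current, best


def rrhc(pool, obs, size=5, restarts=10, seed=0):
    """Run hill climbing from several random subsets and keep the best local optimum."""
    rng = random.Random(seed)
    results = [
        hill_climb(rng.sample(range(len(pool)), size), pool, obs)
        for _ in range(restarts)
    ]
    return max(results, key=lambda r: r[1])


# Synthetic stand-in data: 30 biased, noisy copies of a random "observed" series.
data_rng = random.Random(1)
obs = [data_rng.uniform(0.0, 10.0) for _ in range(40)]
pool = []
for _ in range(30):
    bias = data_rng.uniform(-2.0, 2.0)
    pool.append([o + data_rng.gauss(bias, 1.0) for o in obs])

best_subset, best_score = rrhc(pool, obs)
print(sorted(best_subset), round(best_score, 3))
```

The restarts matter because each climb stops at a local optimum; with a costlier combiner such as an ANN, each call to the scoring function would involve re-training, which is where most of the run time would go.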

Simultaneous optimization of thermal and electrical conductivity of high density polyethylene-carbon particle composites by artificial neural networks and multi-objective genetic algorithm

Computational Materials Science ◽

10.1016/j.commatsci.2021.110956 ◽

2022 ◽

Vol 201 ◽

pp. 110956

Author(s):

Miguel García-Carrillo ◽

Adriana B. Espinoza-Martínez ◽

Luis F. Ramos-de Valle ◽

Saúl Sánchez-Valdés

Keyword(s):

Genetic Algorithm ◽

Neural Networks ◽

Electrical Conductivity ◽

Artificial Neural Networks ◽

High Density Polyethylene ◽

Carbon Particle ◽

High Density ◽

Simultaneous Optimization ◽

Multi Objective ◽

Multi Objective Genetic Algorithm

Multi-Component Topology Optimization for Powder Bed Additive Manufacturing (MTO-A)

Volume 1A: 38th Computers and Information in Engineering Conference ◽

10.1115/detc2018-86284 ◽

2018 ◽

Cited By ~ 3

Author(s):

Yuqing Zhou ◽

Tsuyoshi Nomura ◽

Kazuhiro Saitou

Keyword(s):

Topology Optimization ◽

Additive Manufacturing ◽

Simultaneous Optimization ◽

Previous Attempt ◽

Numerical Instability ◽

Feature Size ◽

Powder Bed ◽

Nonlinear Projection ◽

Gradient Based ◽

Length Width

This paper presents a gradient-based multi-component topology optimization (MTO) method for structures assembled from components made by powder bed additive manufacturing. It is built upon our previous work on the continuously-relaxed MTO framework utilizing the concept of fractional component membership. The previous attempt to integrate the relaxed MTO framework with additive manufacturing constraints, however, suffered from numerical instability for larger problems, limiting its application to 2D low-resolution examples. To overcome this difficulty, this paper proposes an improved MTO formulation based on a design field regularization and a nonlinear projection of component membership variables, with a focus on powder bed additive manufacturing. For each component, constraints on the maximum allowable build volume (i.e., length, width, and height), the elimination of enclosed voids, and the minimum printable feature size are imposed during the simultaneous optimization of the overall base topology and component partitioning. The scalability of the new MTO formulation is demonstrated by a few 2D examples with much higher resolution than previously reported, and the first reported 3D example of MTO.

True Gradient-Based Training of Deep Binary Activated Neural Networks Via Continuous Binarization

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2018.8461456 ◽

2018 ◽

Cited By ~ 3

Author(s):

Charbel Sakr ◽

Jungwook Choi ◽

Zhuo Wang ◽

Kailash Gopalakrishnan ◽

Naresh Shanbhag

Keyword(s):

Neural Networks ◽

Gradient Based

The Opportunities and Limitations of Using Artificial Neural Networks in Social Science Research

Politologija ◽

10.15388/polit.2019.94.2 ◽

2019 ◽

Vol 94 (2) ◽

pp. 56-80

Author(s):

Lukas Pukelis ◽

Vilius Stančiauskas

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Social Science ◽

Large Scale ◽

Social Science Research ◽

Complex Data ◽

Social Science Community ◽

The Social ◽

Science Community ◽

Artificial Neural

Artificial Neural Networks (ANNs) are being increasingly used in various disciplines outside computer science, such as bibliometrics, linguistics, and medicine. However, their uptake in the social science community has been relatively slow, because these highly non-linear models are difficult to interpret and cannot be used for hypothesis testing. Despite the existing limitations, this paper argues that the social science community can benefit from using ANNs in a number of ways, especially by outsourcing laborious data coding and pre-processing tasks to machines in the early stages of analysis. Using ANNs would enable small teams of researchers to process larger quantities of data and undertake more ambitious projects. In fact, the complexity of the pre-processing tasks that ANNs are able to perform means that researchers could obtain rich and complex data typically associated with qualitative research at a large scale, allowing them to combine the best of both qualitative and quantitative approaches.

Differentiable probabilistic programming for strong gravitational lensing

10.22323/1.358.0515 ◽

2019 ◽

Author(s):

Marco Chianese

Keyword(s):

Gravitational Lensing ◽

Probabilistic Programming ◽

Strong Gravitational Lensing
