Photometric redshifts from SDSS images using a convolutional neural network

We developed a deep convolutional neural network (CNN), used as a classifier, to estimate photometric redshifts and associated probability distribution functions (PDF) for galaxies in the Main Galaxy Sample of the Sloan Digital Sky Survey at z < 0.4. Our method exploits all the information present in the images without any feature extraction. The input data consist of 64 × 64 pixel ugriz images centered on the spectroscopic targets, plus the galactic reddening value on the line-of-sight. For training sets of 100k objects or more (≥20% of the database), we reach a dispersion σMAD < 0.01, significantly lower than the current best one obtained from another machine learning technique on the same sample. The bias is lower than 10⁻⁴, independent of photometric redshift. The PDFs are shown to have very good predictive power. We also find that the CNN redshifts are unbiased with respect to galaxy inclination, and that σMAD decreases with the signal-to-noise ratio (S/N), achieving values below 0.007 for S/N > 100, as in the deep stacked region of Stripe 82. We argue that for most galaxies the precision is limited by the S/N of SDSS images rather than by the method. The success of this experiment at low redshift opens promising perspectives for upcoming surveys.
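The bias and σMAD figures quoted above can be reproduced on any catalogue of paired redshifts. The following is a minimal sketch, not the authors' code: it assumes the commonly used normalized residual Δz = (z_phot − z_spec)/(1 + z_spec), the usual 1.4826 scaling of the median absolute deviation, and an illustrative outlier cut of 0.05. The function name `photoz_metrics`, the cut, and the synthetic data are all hypothetical.

```python
import numpy as np

def photoz_metrics(z_phot, z_spec, outlier_cut=0.05):
    """Return (bias, sigma_mad, outlier_fraction) of normalized residuals.

    Assumed conventions (not taken from the abstract):
      dz        = (z_phot - z_spec) / (1 + z_spec)
      bias      = mean(dz)
      sigma_mad = 1.4826 * median(|dz - median(dz)|)
    """
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)
    bias = float(dz.mean())
    sigma_mad = float(1.4826 * np.median(np.abs(dz - np.median(dz))))
    outlier_fraction = float(np.mean(np.abs(dz) > outlier_cut))
    return bias, sigma_mad, outlier_fraction

# Synthetic catalogue: Gaussian scatter of 0.01 * (1 + z) around the true redshift.
rng = np.random.default_rng(0)
z_spec = rng.uniform(0.0, 0.4, size=10_000)
z_phot = z_spec + 0.01 * (1.0 + z_spec) * rng.standard_normal(z_spec.size)

bias, sigma_mad, eta = photoz_metrics(z_phot, z_spec)
print(f"bias={bias:.5f}  sigma_MAD={sigma_mad:.4f}  outlier fraction={eta:.4f}")
```

For Gaussian residuals the 1.4826 factor makes σMAD an estimate of the standard deviation, so the synthetic run should recover a value close to the injected 0.01.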


Compact galaxies and the size–mass galaxy distribution from a colour-selected sample at 0.04 < z < 0.15 supplemented by ugrizYJHK photometric redshifts

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa3327 ◽

2020 ◽

Vol 500 (2) ◽

pp. 1557-1574

Author(s):

Ivan K Baldry ◽

Tricia Sullivan ◽

Raffaele Rani ◽

Sebastian Turner

Keyword(s):

Galaxy Evolution ◽

High Redshift ◽

Sample Selection ◽

Distribution Functions ◽

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Galaxy Distribution ◽

Sky Survey ◽

Galaxy Sample ◽

Compact Galaxies

ABSTRACT The size–mass galaxy distribution is a key diagnostic for galaxy evolution. Massive compact galaxies are potential surviving relics of a high-redshift phase of star formation. Some of these could be nearly unresolved in Sloan Digital Sky Survey (SDSS) imaging and thus not included in galaxy samples. To overcome this, a sample was selected from the combination of SDSS and UKIRT Infrared Deep Sky Survey (UKIDSS) photometry to r < 17.8. This was done using colour–colour selection, and then by obtaining accurate photometric redshifts (photo-z) using scaled flux matching (SFM). Compared to spectroscopic redshifts (spec-z), SFM obtained a 1σ scatter of 0.0125 with only 0.3 per cent outliers (|Δln (1 + z)| > 0.06). A sample of 163 186 galaxies was obtained with 0.04 < z < 0.15 over $2300\, {\rm deg}^2$ using a combination of spec-z and photo-z. Following Barro et al., $\log \Sigma_{1.5} = \log M_* - 1.5 \log r_{50,{\rm maj}}$ was used to define compactness. The spectroscopic completeness was 76 per cent for compact galaxies ($\log \Sigma_{1.5} > 10.5$) compared to 92 per cent for normal-sized galaxies. This difference is primarily attributed to SDSS ‘fibre collisions’ and not the completeness of the main galaxy sample selection. Using environmental overdensities, this confirms that compact quiescent galaxies are significantly more likely to be found in high-density environments compared to normal-sized galaxies. By comparison with a high-redshift sample from 3D-HST, $\log \Sigma_{1.5}$ distribution functions show significant evolution, with this being a compelling way to compare with simulations such as EAGLE. The number density of compact quiescent galaxies drops by a factor of about 30 from z ∼ 2 to $\log (n/{\rm Mpc}^{-3}) = -5.3 \pm 0.4$ in the SDSS–UKIDSS sample. The uncertainty is dominated by the steep cut off in $\log \Sigma_{1.5}$, which is demonstrated conclusively using this complete sample.
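The compactness criterion quoted in this abstract is simple arithmetic and can be sketched as follows. This is an illustrative sketch only: it assumes stellar mass in solar masses and the major-axis half-light radius in kpc (the abstract does not state units), and the function names and example galaxies are hypothetical.

```python
import math

def log_sigma_1p5(log_mstar, r50_maj_kpc):
    """log Sigma_1.5 = log M* - 1.5 * log r50,maj.

    Assumed units (not stated in the abstract): M* in solar masses,
    r50,maj in kpc.
    """
    return log_mstar - 1.5 * math.log10(r50_maj_kpc)

def is_compact(log_mstar, r50_maj_kpc, threshold=10.5):
    """Apply the abstract's compactness cut, log Sigma_1.5 > 10.5."""
    return log_sigma_1p5(log_mstar, r50_maj_kpc) > threshold

# Two made-up galaxies with the same stellar mass but different sizes.
for r50 in (1.5, 5.0):
    value = log_sigma_1p5(11.0, r50)
    print(f"r50={r50} kpc  log Sigma_1.5={value:.3f}  compact={is_compact(11.0, r50)}")
```

At fixed mass, shrinking the radius raises log Σ1.5 by 1.5 dex per decade in size, which is why the cut isolates small, massive systems.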


PhotoWeb redshift: boosting photometric redshift accuracy with large spectroscopic surveys

Astronomy and Astrophysics ◽

10.1051/0004-6361/201937382 ◽

2020 ◽

Vol 636 ◽

pp. A90 ◽

Cited By ~ 1

Author(s):

M. Shuntov ◽

J. Pasquet ◽

S. Arnouts ◽

O. Ilbert ◽

M. Treyer ◽

...

Keyword(s):

Large Scale ◽

Distribution Functions ◽

Distance Measurements ◽

Photometric Redshifts ◽

Star Forming ◽

Photometric Redshift ◽

Probability Distribution Functions ◽

Galaxy Sample ◽

New Generation ◽

Better Than

Improving distance measurements in large imaging surveys is a major challenge to better reveal the distribution of galaxies on a large scale and to link galaxy properties with their environments. As recently shown, photometric redshifts can be efficiently combined with the cosmic web extracted from overlapping spectroscopic surveys to improve their accuracy. In this paper we apply a similar method using a new generation of photometric redshifts based on a convolutional neural network (CNN). The CNN is trained on the SDSS images with the main galaxy sample (SDSS-MGS, r ≤ 17.8) and the GAMA spectroscopic redshifts up to r ∼ 19.8. The mapping of the cosmic web is obtained with 680 000 spectroscopic redshifts from the MGS and BOSS surveys. The redshift probability distribution functions (PDF), which are well calibrated (unbiased and narrow, ≤120 Mpc), intercept a few cosmic web structures along the line of sight. Combining these PDFs with the density field distribution provides new photometric redshifts, zweb, whose accuracy is improved by a factor of two (i.e., σ ∼ 0.004(1 + z)) for galaxies with r ≤ 17.8. For half of them, the distance accuracy is better than 10 cMpc. The narrower the original PDF, the larger the boost in accuracy. No gain is observed for original PDFs wider than 0.03. The final zweb PDFs also appear well calibrated. The method performs slightly better for passive galaxies than star-forming ones, and for galaxies in massive groups since these populations better trace the underlying large-scale structure. Reducing the spectroscopic sampling by a factor of 8 still improves the photometric redshift accuracy by 25%. Finally, extending the method to galaxies fainter than the MGS limit still improves the redshift estimates for 70% of the galaxies, with a gain in accuracy of 20% at low z where the resolution of the cosmic web is the highest. As two competing factors contribute to the performance of the method, the photometric redshift accuracy and the resolution of the cosmic web, the benefit of combining cosmological imaging surveys with spectroscopic surveys at higher redshift remains to be evaluated.
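The abstract does not spell out how the PDFs and the density field are combined. One plausible reading, sketched here purely as an assumption, is to weight the photometric PDF by the line-of-sight density on a common redshift grid and renormalize. The function `combine_pdf_with_density`, the toy Gaussian PDF, and the toy density profile are all hypothetical and do not come from the paper.

```python
import numpy as np

def combine_pdf_with_density(z_grid, pdf, density):
    """Weight a redshift PDF by a line-of-sight density profile.

    Assumption (not from the abstract): posterior(z) is proportional to
    pdf(z) * density(z), with both sampled on the same uniform grid.
    Returns the normalized posterior and its mean redshift.
    """
    z_grid = np.asarray(z_grid, dtype=float)
    weighted = np.asarray(pdf, dtype=float) * np.asarray(density, dtype=float)
    total = weighted.sum()
    if total <= 0.0:
        raise ValueError("PDF and density field do not overlap")
    posterior = weighted / total
    z_web = float(np.sum(z_grid * posterior))
    return posterior, z_web

# Toy example: a broad photometric PDF and one narrow structure at z = 0.105.
z_grid = np.linspace(0.0, 0.4, 4001)
pdf = np.exp(-0.5 * ((z_grid - 0.100) / 0.010) ** 2)
pdf /= pdf.sum()
density = 1.0 + 10.0 * np.exp(-0.5 * ((z_grid - 0.105) / 0.002) ** 2)

posterior, z_web = combine_pdf_with_density(z_grid, pdf, density)
z_cnn = float(np.sum(z_grid * pdf))
print(f"mean z before: {z_cnn:.4f}  after density weighting: {z_web:.4f}")
```

In this toy case the estimate is pulled from the PDF centre toward the overdensity, which illustrates why a narrower input PDF, intersecting fewer structures, would benefit more from the combination.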


Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

Proceedings of the International Astronomical Union ◽

10.1017/s1743921317001569 ◽

2016 ◽

Vol 12 (S325) ◽

pp. 145-155

Author(s):

Fionn Murtagh

Keyword(s):

Information Structure ◽

Data Analytics ◽

A Priori ◽

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Merging Galaxies ◽

Photometric Redshift ◽

Sky Survey ◽

Wide Range ◽

Regression Problems

ABSTRACT This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data are to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or ‘photo-z’ problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.


Morpho-photometric redshifts

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stz2477 ◽

2019 ◽

Vol 489 (4) ◽

pp. 4802-4808 ◽

Cited By ~ 2

Author(s):

Kristen Menou

Keyword(s):

Poor Performance ◽

Sloan Digital Sky Survey ◽

Gradient Boosting ◽

Learning Tools ◽

Multi Layer Perceptron ◽

Comparable Data ◽

Photometric Redshifts ◽

Data Set ◽

Photometric Redshift ◽

Sky Survey

ABSTRACT Machine learning (ML) is one of two standard approaches (together with SED fitting) for estimating the redshifts of galaxies when only photometric information is available. ML photo-z solutions have traditionally ignored the morphological information available in galaxy images or partly included it in the form of hand-crafted features, with mixed results. We train a morphology-aware photometric redshift machine using modern deep learning tools. It uses a custom architecture that jointly trains on galaxy fluxes, colours, and images. Galaxy-integrated quantities are fed to a Multi-Layer Perceptron (MLP) branch, while images are fed to a convolutional (convnet) branch that can learn relevant morphological features. This split MLP-convnet architecture, which aims to disentangle strong photometric features from comparatively weak morphological ones, proves important for strong performance: a regular convnet-only architecture, while exposed to all available photometric information in images, delivers comparatively poor performance. We present a cross-validated MLP-convnet model trained on 130 000 SDSS-DR12 (Sloan Digital Sky Survey – Data Release 12) galaxies that outperforms a hyperoptimized Gradient Boosting solution (hyperopt+XGBoost), as well as the equivalent MLP-only architecture, on the redshift bias metric. The fourfold cross-validated MLP-convnet model achieves a bias δz/(1 + z) = −0.70 ± 1 × 10⁻³, approaching the performance of a reference ANNZ2 ensemble of 100 distinct models trained on a comparable data set. The relative performance of the morphology-aware and morphology-blind models indicates that galaxy morphology does improve ML-based photometric redshift estimation.


J-PLUS: On the identification of new cluster members in the double galaxy cluster A2589 and A2593 using PDFs

Astronomy and Astrophysics ◽

10.1051/0004-6361/201731348 ◽

2019 ◽

Vol 622 ◽

pp. A178 ◽

Cited By ~ 10

Author(s):

A. Molino ◽

M. V. Costa-Duarte ◽

C. Mendes de Oliveira ◽

A. J. Cenarro ◽

G. B. Lima Neto ◽

...

Keyword(s):

Signal To Noise Ratio ◽

Galaxy Cluster ◽

Distribution Functions ◽

Nearby Galaxy ◽

Photometric Redshifts ◽

New Members ◽

Filamentary Structure ◽

Cluster Galaxies ◽

Wavelength Resolution ◽

Galaxy Sample

Aims. We aim to use multiband imaging from the Phase-3 Verification Data of the J-PLUS survey to derive accurate photometric redshifts (photo-z) and look for potential new members in the surroundings of the nearby galaxy clusters A2589 (z = 0.0414) & A2593 (z = 0.0440), using redshift probability distribution functions (PDFs). The ultimate goal is to demonstrate the usefulness of a 12-band filter system in the study of large-scale structure in the local Universe. Methods. We present an optimized pipeline for the estimation of photometric redshifts in clusters of galaxies. This pipeline includes PSF-corrected photometry, specific photometric apertures capable of enhancing the integrated signal in the bluest filters, a careful recalibration of the photometric uncertainties and accurate upper-limit estimations for faint detections. To foresee the expected precision of our photo-z beyond the spectroscopic sample, we designed a set of simulations in which real cluster galaxies are modeled and reinjected inside the images at different signal-to-noise ratio (S/N) levels, recomputing their photometry and photo-z estimates. Results. We tested our photo-z pipeline with a sample of 296 spectroscopically confirmed cluster members with an averaged magnitude of ⟨r⟩ = 16.6 and redshift ⟨z⟩ = 0.041. The combination of seven narrow and five broadband filters with a typical photometric-depth of r ~ 21.5 provides δz/(1 + z) = 0.01 photo-z estimates. A precision of δz/(1 + z) = 0.005 is obtained for the 177 galaxies brighter than r = 17. Based on simulations, δz/(1 + z) = 0.02 and δz/(1 + z) = 0.03 are expected at magnitudes ⟨r⟩ = 18 and ⟨r⟩ = 22, respectively. Complementarily, we used SDSS/DR12 data to derive photo-z estimates for the same galaxy sample. This exercise demonstrates that the wavelength-resolution of the J-PLUS data can double the precision achieved by SDSS data for galaxies with a high S/N. Based on the Bayesian membership analysis carried out in this work, we find as many as 170 new candidates across the entire field (~5 deg²). The spatial distribution of these galaxies may suggest an overlap between the systems with no evidence of a clear filamentary structure connecting the clusters. This result is supported by X-ray ROSAT All-Sky Survey observations suggesting that a hypothetical filament may have low density contrast on diffuse warm gas. Conclusions. We prove that the addition of the seven narrow-band filters makes the J-PLUS data deeper in terms of photo-z-depth than other surveys of a similar photometric-depth but using only five broadbands. These preliminary results show the potential of J-PLUS data to revisit membership of groups and clusters from nearby galaxies, important for the determination of luminosity and mass functions and environmental studies at the intermediate and low-mass regime.


Clustering and Halo Occupation Distribution of Active Galactic Nuclei

Proceedings of the International Astronomical Union ◽

10.1017/s1743921314003937 ◽

2013 ◽

Vol 9 (S304) ◽

pp. 243-243

Author(s):

Takamitsu Miyaji ◽

M. Krumpe ◽

A. Coil ◽

H. Aceves ◽

B. Husemann

Keyword(s):

Cross Correlation ◽

Dynamic Range ◽

Signal To Noise Ratio ◽

Sloan Digital Sky Survey ◽

Dark Matter Halo ◽

Bias Parameter ◽

X Ray ◽

Sky Survey ◽

Galaxy Population ◽

Halo Occupation Distribution

ABSTRACT We present the results of our series of studies on the correlation function and halo occupation distribution of AGNs utilizing data from the ROSAT All-Sky Survey (RASS) and the Sloan Digital Sky Survey (SDSS) in the redshift range of 0.07 < z < 0.36. In order to improve the signal-to-noise ratio, we take a cross-correlation approach, where cross-correlation functions (CCF) between AGNs and much more numerous galaxies are analyzed. The calculated CCFs are analyzed using the Halo Occupation Distribution (HOD) model, where the CCFs are divided into the term contributed by the AGN-galaxy pairs that reside in one dark matter halo (DMH) (the 1-halo term) and those from two different DMHs (the 2-halo term). The 2-halo term is the indicator of the bias parameter, which is a function of the typical mass of the DMHs in which AGNs reside. The combination of the 1-halo and 2-halo terms gives not only the typical DMH mass, but also how the AGNs are distributed among the DMHs as a function of mass, separately for those at the center of the DMHs and satellites. The main results are as follows: (1) the range of typical mass of the DMHs in various sub-samples of AGNs is $\log (M_{\rm DMH}/h^{-1}M_\odot) \sim 12.4$–$13.4$; (2) we found a dependence of the AGN bias parameter on the X-ray luminosity of AGNs, while the optical luminosity dependence is not significant, probably due to the smaller dynamic range in luminosity for the optically-selected sample; and (3) the growth of the number of AGNs per DMH ($N(M_{\rm DMH})$) with $M_{\rm DMH}$ is shallow, or may even be flat, contrary to that of the galaxy population in general, which grows with $M_{\rm DMH}$ proportionally, suggesting a suppression of AGN triggering in denser environments. In order to investigate the origin of the X-ray luminosity dependence, we are also investigating the dependence of clustering on the black hole mass and the Eddington ratio; we also present the results of this investigation.


Producing a BOSS CMASS sample with DES imaging

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stz2288 ◽

2019 ◽

Vol 489 (2) ◽

pp. 2887-2906 ◽

Cited By ~ 5

Author(s):

S Lee ◽

E M Huff ◽

A J Ross ◽

A Choi ◽

C Hirata ◽

...

Keyword(s):

Sloan Digital Sky Survey ◽

Joint Analysis ◽

Dark Energy Survey ◽

Sky Survey ◽

Angular Correlation Function ◽

Galaxy Sample ◽

Celestial Equator ◽

The Difference ◽

Galaxy Bias ◽

Validation Tests

ABSTRACT We present a sample of galaxies with the Dark Energy Survey (DES) photometry that replicates the properties of the BOSS CMASS sample. The CMASS galaxy sample has been well characterized by the Sloan Digital Sky Survey (SDSS) collaboration and was used to obtain the most powerful redshift-space galaxy clustering measurements to date. A joint analysis of redshift-space distortions (such as those probed by CMASS from SDSS) and a galaxy–galaxy lensing measurement for an equivalent sample from DES can provide powerful cosmological constraints. Unfortunately, the DES and SDSS-BOSS footprints have only minimal overlap, primarily on the celestial equator near the SDSS Stripe 82 region. Using this overlap, we build a robust Bayesian model to select CMASS-like galaxies in the remainder of the DES footprint. The newly defined DES-CMASS (DMASS) sample consists of 117 293 effective galaxies covering $1244\,\deg ^2$. Through various validation tests, we show that the DMASS sample selected by this model matches well with the BOSS CMASS sample, specifically in the South Galactic cap (SGC) region that includes Stripe 82. Combining measurements of the angular correlation function and the clustering-z distribution of DMASS, we constrain the difference in mean galaxy bias and mean redshift between the BOSS CMASS and DMASS samples to be $\Delta b = 0.010^{+0.045}_{-0.052}$ and $\Delta z = \left(3.46^{+5.48}_{-5.55} \right) \times 10^{-3}$ for the SGC portion of CMASS, and $\Delta b = 0.044^{+0.044}_{-0.043}$ and $\Delta z= (3.51^{+4.93}_{-5.91}) \times 10^{-3}$ for the full CMASS sample. These values indicate that the mean bias of galaxies and mean redshift in the DMASS sample are consistent with both CMASS samples within 1σ.


Distribution of Maximal Luminosity of Galaxies in the Sloan Digital Sky Survey

Proceedings of the International Astronomical Union ◽

10.1017/s174392131401076x ◽

2014 ◽

Vol 10 (S306) ◽

pp. 351-354

Author(s):

E. Regős ◽

A. Szalay ◽

Z. Rácz ◽

M. Taghizadeh ◽

K. Ozogany

Keyword(s):

Limit Distribution ◽

Finite Size ◽

Gumbel Distribution ◽

Sloan Digital Sky Survey ◽

Extreme Value Statistics ◽

Independent Variables ◽

Sky Survey ◽

The Press ◽

Galaxy Sample ◽

Good Agreement

ABSTRACT Extreme value statistics (EVS) is applied to the pixelized distribution of galaxy luminosities in the Sloan Digital Sky Survey (SDSS). We analyze the DR8 Main Galaxy Sample (MGS) as well as the Luminous Red Galaxy Sample (LRGS). A non-parametric comparison of the EVS of the luminosities with the Fisher-Tippett-Gumbel distribution (limit distribution for independent variables distributed by the Press-Schechter law) indicates a good agreement provided uncertainties arising both from the finite size of the samples and from the sample size distribution are accounted for. This effectively rules out the possibility of having a finite maximum cutoff luminosity.


DRPnet - Automated Particle Picking in Cryo-Electron Micrographs using Deep Regression

10.1101/616169 ◽

2019 ◽

Cited By ~ 1

Author(s):

Nguyen P. Nguyen ◽

Jacob Gotberg ◽

Ilker Ersoy ◽

Filiz Bunyak ◽

Tommi White

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Electron Micrographs ◽

Signal To Noise Ratio ◽

Particle Analysis ◽

User Interactions ◽

Distance Map ◽

Protein Particles ◽

Selection Of ◽

Specific Particle

ABSTRACT Selection of individual protein particles in cryo-electron micrographs is an important step in single particle analysis. In this study, we developed a deep learning-based method to automatically detect particle centers from cryoEM micrographs. This is a challenging task because of the low signal-to-noise ratio of cryoEM micrographs and the size, shape, and grayscale-level variations in particles. We propose a double convolutional neural network (CNN) cascade for automated detection of particles in cryo-electron micrographs. Particles are detected by the first network, a fully convolutional regression network (FCRN), which maps the particle image to a continuous distance map that acts like a probability density function of particle centers. Particles identified by FCRN are further refined (or classified) to reduce false particle detections by the second CNN. This approach, entitled Deep Regression Picker Network or “DRPnet”, is simple but very effective in recognizing different grayscale patterns corresponding to 2D views of 3D particles. Our experiments showed that DRPnet’s first CNN pretrained with one dataset can be used to detect particles from different datasets without retraining. The performance of this network can be further improved by re-training the network using specific particle datasets. The second network, a classification convolutional neural network, is used to refine detection results by identifying false detections. The proposed fully automated “deep regression” system, DRPnet, pretrained with TRPV1 (EMPIAR-10005) [1], and tested on β-galactosidase (EMPIAR-10017) [2] and β-galactosidase (EMPIAR-10061) [3], was then compared to RELION’s interactive particle picking. Preliminary experiments resulted in comparable or better particle picking performance with drastically reduced user interactions and improved processing time.


The frequency of gaseous debris discs around white dwarfs

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa359 ◽

2020 ◽

Vol 493 (2) ◽

pp. 2127-2139 ◽

Cited By ~ 13

Author(s):

Christopher J Manser ◽

Boris T Gänsicke ◽

Nicola Pietro Gentile Fusillo ◽

Richard Ashley ◽

Elmé Breedt ◽

...

Keyword(s):

White Dwarf ◽

White Dwarfs ◽

Signal To Noise Ratio ◽

Infrared Emission ◽

Sloan Digital Sky Survey ◽

Emission Lines ◽

Occurrence Rate ◽

Limited Sample ◽

Sky Survey ◽

Gaseous Disc

ABSTRACT A total of 1–3 per cent of white dwarfs are orbited by planetary dusty debris detectable as infrared emission in excess above the white dwarf flux. In a rare subset of these systems, a gaseous disc component is also detected via emission lines of the Ca ii 8600 Å triplet, broadened by the Keplerian velocity of the disc. We present the first statistical study of the fraction of debris discs containing detectable amounts of gas in emission at white dwarfs within a magnitude and signal-to-noise ratio limited sample. We select 7705 single white dwarfs spectroscopically observed by the Sloan Digital Sky Survey (SDSS) and Gaia with magnitudes g ≤ 19. We identify five gaseous disc hosts, all of which have been previously discovered. We calculate the occurrence rate of a white dwarf hosting a debris disc detectable via Ca ii emission lines as $0.067^{+0.042}_{-0.025}$ per cent. This corresponds to an occurrence rate for a dusty debris disc to have an observable gaseous component in emission as $4^{+4}_{-2}$ per cent. Given that variability is a common feature of the emission profiles of gaseous debris discs, and the recent detection of a planetesimal orbiting within the disc of SDSS J122859.93+104032.9, we propose that gaseous components are tracers for the presence of planetesimals embedded in the discs and outline a qualitative model. We also present spectroscopy of the Ca ii triplet 8600 Å region for 20 white dwarfs hosting dusty debris discs in an attempt to identify gaseous emission. We do not detect any gaseous components in these 20 systems, consistent with the occurrence rate that we calculated.
