Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

Fionn Murtagh

doi:10.1017/s1743921317001569

Proceedings of the International Astronomical Union ◽

10.1017/s1743921317001569 ◽

2016 ◽

Vol 12 (S325) ◽

pp. 145-155

Author(s):

Fionn Murtagh

Keyword(s):

Information Structure ◽

Data Analytics ◽

A Priori ◽

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Merging Galaxies ◽

Photometric Redshift ◽

Sky Survey ◽

Wide Range ◽

Regression Problems

ABSTRACT This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data are to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or ‘photo-z’ problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
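
One way to read the binning idea in this abstract is as a tree of decimal-digit prefixes: values that share more leading digits sit deeper in the same branch, and a prediction is the mean target of the deepest non-empty bin. The Python sketch below illustrates that reading for a single non-negative feature. It is not the author's implementation; the names `digit_prefix`, `fit`, `predict` and the constant `PRECISION` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from statistics import mean

PRECISION = 3  # assumed number of trustworthy decimal digits in the feature


def digit_prefix(value, digits, precision=PRECISION):
    """Key of the bin that shares the first `digits` decimals of `value`.

    Assumes value >= 0. The value is first fixed at the measurement
    precision and then truncated, so 0.29 at 2 digits gives 29, not 28.
    """
    fixed = round(value * 10 ** precision)
    return fixed // 10 ** (precision - digits)


def fit(x_train, y_train, precision=PRECISION):
    """Build one table of bin means per tree level (0 digits up to `precision`)."""
    levels = []
    for digits in range(precision + 1):
        bins = defaultdict(list)
        for x, y in zip(x_train, y_train):
            bins[digit_prefix(x, digits, precision)].append(y)
        levels.append({key: mean(ys) for key, ys in bins.items()})
    return levels


def predict(x, levels):
    """Return (estimate, depth) from the deepest non-empty bin containing x."""
    precision = len(levels) - 1
    for digits in range(precision, -1, -1):
        key = digit_prefix(x, digits, precision)
        if key in levels[digits]:
            return levels[digits][key], digits
    return None, None  # no bin matches, not even on the integer part
```

With a single training point per finest bin this reduces to an exact-match lookup, and backing off to coarser bins averages over progressively larger clusters, which is the sense in which such a scheme generalizes nearest neighbour regression.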

Morpho-photometric redshifts

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stz2477 ◽

2019 ◽

Vol 489 (4) ◽

pp. 4802-4808 ◽

Cited By ~ 2

Author(s):

Kristen Menou

Keyword(s):

Poor Performance ◽

Sloan Digital Sky Survey ◽

Gradient Boosting ◽

Learning Tools ◽

Multi Layer Perceptron ◽

Comparable Data ◽

Photometric Redshifts ◽

Data Set ◽

Photometric Redshift ◽

Sky Survey

ABSTRACT Machine learning (ML) is one of two standard approaches (together with SED fitting) for estimating the redshifts of galaxies when only photometric information is available. ML photo-z solutions have traditionally ignored the morphological information available in galaxy images or partly included it in the form of hand-crafted features, with mixed results. We train a morphology-aware photometric redshift machine using modern deep learning tools. It uses a custom architecture that jointly trains on galaxy fluxes, colours, and images. Galaxy-integrated quantities are fed to a Multi-Layer Perceptron (MLP) branch, while images are fed to a convolutional (convnet) branch that can learn relevant morphological features. This split MLP-convnet architecture, which aims to disentangle strong photometric features from comparatively weak morphological ones, proves important for strong performance: a regular convnet-only architecture, while exposed to all available photometric information in images, delivers comparatively poor performance. We present a cross-validated MLP-convnet model trained on 130 000 SDSS-DR12 (Sloan Digital Sky Survey – Data Release 12) galaxies that outperforms a hyperoptimized Gradient Boosting solution (hyperopt+XGBoost), as well as the equivalent MLP-only architecture, on the redshift bias metric. The fourfold cross-validated MLP-convnet model achieves a bias δz/(1 + z) = −0.70 ± 1 × 10^−3, approaching the performance of a reference ANNZ2 ensemble of 100 distinct models trained on a comparable data set. The relative performance of the morphology-aware and morphology-blind models indicates that galaxy morphology does improve ML-based photometric redshift estimation.

Photometric redshifts from SDSS images using a convolutional neural network

Astronomy and Astrophysics ◽

10.1051/0004-6361/201833617 ◽

2018 ◽

Vol 621 ◽

pp. A26 ◽

Cited By ~ 29

Author(s):

Johanna Pasquet ◽

E. Bertin ◽

M. Treyer ◽

S. Arnouts ◽

D. Fouchez

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Signal To Noise Ratio ◽

Distribution Functions ◽

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Photometric Redshift ◽

Probability Distribution Functions ◽

Sky Survey ◽

Galaxy Sample

We developed a deep convolutional neural network (CNN), used as a classifier, to estimate photometric redshifts and associated probability distribution functions (PDF) for galaxies in the Main Galaxy Sample of the Sloan Digital Sky Survey at z < 0.4. Our method exploits all the information present in the images without any feature extraction. The input data consist of 64 × 64 pixel ugriz images centered on the spectroscopic targets, plus the galactic reddening value on the line-of-sight. For training sets of 100k objects or more (≥20% of the database), we reach a dispersion σ_MAD < 0.01, significantly lower than the current best one obtained from another machine learning technique on the same sample. The bias is lower than 10^−4, independent of photometric redshift. The PDFs are shown to have very good predictive power. We also find that the CNN redshifts are unbiased with respect to galaxy inclination, and that σ_MAD decreases with the signal-to-noise ratio (S/N), achieving values below 0.007 for S/N > 100, as in the deep stacked region of Stripe 82. We argue that for most galaxies the precision is limited by the S/N of SDSS images rather than by the method. The success of this experiment at low redshift opens promising perspectives for upcoming surveys.
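
The abstract quotes a bias and a MAD-based dispersion without defining them. The Python sketch below uses a convention that is common in the photo-z literature and is assumed here rather than taken from the abstract: residuals are normalized as (z_phot − z_spec)/(1 + z_spec), the bias is their mean, and the dispersion is 1.4826 times the median absolute deviation. The function names are hypothetical.

```python
from statistics import mean, median


def normalized_residuals(z_phot, z_spec):
    """Residuals (z_phot - z_spec) / (1 + z_spec), one per galaxy."""
    return [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]


def bias(residuals):
    """Mean normalized residual (assumed definition of the bias)."""
    return mean(residuals)


def sigma_mad(residuals):
    """1.4826 * MAD; the factor makes MAD match sigma for Gaussian scatter."""
    centre = median(residuals)
    deviations = [abs(r - centre) for r in residuals]
    return 1.4826 * median(deviations)
```

Because it is built on medians, this dispersion is insensitive to a small fraction of catastrophic outliers, which is why it is often reported alongside a separate outlier rate.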

THE SLOAN DIGITAL SKY SURVEY CO-ADD: A GALAXY PHOTOMETRIC REDSHIFT CATALOG

The Astrophysical Journal ◽

10.1088/0004-637x/747/1/59 ◽

2012 ◽

Vol 747 (1) ◽

pp. 59 ◽

Cited By ~ 32

Author(s):

Ribamar R. R. Reis ◽

Marcelle Soares-Santos ◽

James Annis ◽

Scott Dodelson ◽

Jiangang Hao ◽

...

Keyword(s):

Sloan Digital Sky Survey ◽

Photometric Redshift ◽

Sky Survey

Deep learning approach for classifying, detecting and predicting photometric redshifts of quasars in the Sloan Digital Sky Survey stripe 82

Astronomy and Astrophysics ◽

10.1051/0004-6361/201731106 ◽

2018 ◽

Vol 611 ◽

pp. A97 ◽

Cited By ~ 11

Author(s):

J. Pasquet-Itam ◽

J. Pasquet

Keyword(s):

Random Forest ◽

Random Forest Classifier ◽

Sloan Digital Sky Survey ◽

Light Curves ◽

Support Vector ◽

Learning Approach ◽

Photometric Redshifts ◽

K Nearest Neighbors ◽

Sky Survey ◽

Extraction Step

We have applied a convolutional neural network (CNN) to classify and detect quasars in the Sloan Digital Sky Survey Stripe 82 and also to predict the photometric redshifts of quasars. The network takes the variability of objects into account by converting light curves into images. The width of the images, noted w, corresponds to the five magnitudes ugriz and the height of the images, noted h, represents the date of the observation. The CNN provides good results since its precision is 0.988 for a recall of 0.90, compared to a precision of 0.985 for the same recall with a random forest classifier. Moreover, 175 new quasar candidates are found with the CNN considering a fixed recall of 0.97. Combining the probabilities given by the CNN and the random forest improves performance further, with a precision of 0.99 for a recall of 0.90. For the redshift predictions, the CNN gives results that are better than those obtained with a feature extraction step and different classifiers (a K-nearest-neighbors, a support vector machine, a random forest and a Gaussian process classifier). Indeed, the accuracy of the CNN reaches 78.09% within |Δz| < 0.1, 86.15% within |Δz| < 0.2, and 91.2% within |Δz| < 0.3, and the root mean square (rms) is 0.359. The KNN performs worse in all three |Δz| regions: its accuracy within |Δz| < 0.1, |Δz| < 0.2, and |Δz| < 0.3 is 73.72%, 82.46%, and 90.09% respectively, and its rms is 0.395. So the CNN successfully reduces the dispersion and the catastrophic redshifts of quasars. This new method is very promising for the future of big databases such as the Large Synoptic Survey Telescope.

NEW APPROACHES TO PHOTOMETRIC REDSHIFT PREDICTION VIA GAUSSIAN PROCESS REGRESSION IN THE SLOAN DIGITAL SKY SURVEY

The Astrophysical Journal ◽

10.1088/0004-637x/706/1/623 ◽

2009 ◽

Vol 706 (1) ◽

pp. 623-636 ◽

Cited By ~ 31

Author(s):

M. J. Way ◽

L. V. Foster ◽

P. R. Gazis ◽

A. N. Srivastava

Keyword(s):

Gaussian Process ◽

Gaussian Process Regression ◽

Sloan Digital Sky Survey ◽

Photometric Redshift ◽

New Approaches ◽

Sky Survey

Angular Clustering with Photometric Redshifts in the Sloan Digital Sky Survey: Bimodality in the Clustering Properties of Galaxies

The Astrophysical Journal ◽

10.1086/377168 ◽

2003 ◽

Vol 595 (1) ◽

pp. 59-70 ◽

Cited By ~ 99

Author(s):

Tamas Budavari ◽

Andrew J. Connolly ◽

Alexander S. Szalay ◽

Istvan Szapudi ◽

Istvan Csabai ◽

...

Keyword(s):

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Sky Survey

Compact galaxies and the size–mass galaxy distribution from a colour-selected sample at 0.04 < z < 0.15 supplemented by ugrizYJHK photometric redshifts

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa3327 ◽

2020 ◽

Vol 500 (2) ◽

pp. 1557-1574

Author(s):

Ivan K Baldry ◽

Tricia Sullivan ◽

Raffaele Rani ◽

Sebastian Turner

Keyword(s):

Galaxy Evolution ◽

High Redshift ◽

Sample Selection ◽

Distribution Functions ◽

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Galaxy Distribution ◽

Sky Survey ◽

Galaxy Sample ◽

Compact Galaxies

ABSTRACT The size–mass galaxy distribution is a key diagnostic for galaxy evolution. Massive compact galaxies are potential surviving relics of a high-redshift phase of star formation. Some of these could be nearly unresolved in Sloan Digital Sky Survey (SDSS) imaging and thus not included in galaxy samples. To overcome this, a sample was selected from the combination of SDSS and UKIRT Infrared Deep Sky Survey (UKIDSS) photometry to r < 17.8. This was done using colour–colour selection, and then by obtaining accurate photometric redshifts (photo-z) using scaled flux matching (SFM). Compared to spectroscopic redshifts (spec-z), SFM obtained a 1σ scatter of 0.0125 with only 0.3 per cent outliers (|Δln (1 + z)| > 0.06). A sample of 163 186 galaxies was obtained with 0.04 < z < 0.15 over 2300 deg^2 using a combination of spec-z and photo-z. Following Barro et al., log Σ_1.5 = log M_* − 1.5 log r_50,maj was used to define compactness. The spectroscopic completeness was 76 per cent for compact galaxies (log Σ_1.5 > 10.5) compared to 92 per cent for normal-sized galaxies. This difference is primarily attributed to SDSS ‘fibre collisions’ and not the completeness of the main galaxy sample selection. Using environmental overdensities, this confirms that compact quiescent galaxies are significantly more likely to be found in high-density environments compared to normal-sized galaxies. By comparison with a high-redshift sample from 3D-HST, log Σ_1.5 distribution functions show significant evolution, with this being a compelling way to compare with simulations such as EAGLE. The number density of compact quiescent galaxies drops by a factor of about 30 from z ∼ 2 to log (n/Mpc^−3) = −5.3 ± 0.4 in the SDSS–UKIDSS sample. The uncertainty is dominated by the steep cut off in log Σ_1.5, which is demonstrated conclusively using this complete sample.
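
The compactness measure and the photo-z outlier criterion quoted in this abstract are simple enough to write down directly. The Python sketch below does so under stated assumptions: stellar mass is given as log10 of the mass in solar masses and the semi-major half-light radius is in kpc (the abstract does not state units), and the function names are hypothetical.

```python
import math


def log_sigma_15(log_mstar, r50_maj_kpc):
    """log Sigma_1.5 = log M_* - 1.5 log r_50,maj (units are an assumption)."""
    return log_mstar - 1.5 * math.log10(r50_maj_kpc)


def is_compact(log_mstar, r50_maj_kpc, threshold=10.5):
    """Compact if log Sigma_1.5 exceeds the threshold quoted in the abstract."""
    return log_sigma_15(log_mstar, r50_maj_kpc) > threshold


def is_photoz_outlier(z_phot, z_spec, threshold=0.06):
    """Outlier if |ln(1 + z_phot) - ln(1 + z_spec)| exceeds the threshold."""
    return abs(math.log1p(z_phot) - math.log1p(z_spec)) > threshold
```

For example, under these assumptions a galaxy with log M_* = 11.0 counts as compact at a radius of 1 kpc but not at 4 kpc, since the 1.5 power penalizes size more steeply than a plain surface density would.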

Empirical photometric redshifts of luminous red galaxies and clusters in the Sloan Digital Sky Survey

Monthly Notices of the Royal Astronomical Society ◽

10.1111/j.1365-2966.2007.12203.x ◽

2007 ◽

Vol 380 (4) ◽

pp. 1608-1620 ◽

Cited By ~ 25

Author(s):

P. A. A. Lopes

Keyword(s):

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Luminous Red Galaxies ◽

Sky Survey ◽

Red Galaxies

Structural and stellar-population properties versus bulge types in Sloan Digital Sky Survey central galaxies

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa328 ◽

2020 ◽

Vol 493 (2) ◽

pp. 1686-1707 ◽

Cited By ~ 4

Author(s):

Yifei Luo ◽

S M Faber ◽

Aldo Rodríguez-Puebla ◽

Joanna Woo ◽

Yicheng Guo ◽

...

Keyword(s):

Star Formation ◽

Stellar Population ◽

Star Formation Rate ◽

Structural Parameters ◽

Formation Rate ◽

Sloan Digital Sky Survey ◽

Central Density ◽

Sky Survey ◽

Wide Range ◽

Non Linear

ABSTRACT This paper studies pseudo-bulges (P-bulges) and classical bulges (C-bulges) in Sloan Digital Sky Survey (SDSS) central galaxies using the new bulge indicator ΔΣ_1, which measures relative central stellar-mass surface density within 1 kpc. We compare ΔΣ_1 to the established bulge-type indicator Δ⟨μ_e⟩ from Gadotti (2009) and show that classifying by ΔΣ_1 agrees well with Δ⟨μ_e⟩. ΔΣ_1 requires no bulge–disc decomposition and can be measured on SDSS images out to z = 0.07. Bulge types using it are mapped on to 20 different structural and stellar-population properties for 12 000 SDSS central galaxies with masses 10.0 < log M_*/M_⊙ < 10.4. New trends emerge from this large sample. Structural parameters show fairly linear log–log relations versus ΔΣ_1 and Δ⟨μ_e⟩ with only moderate scatter, while stellar-population parameters show a highly non-linear ‘elbow’ in which specific star formation rate remains roughly flat with increasing central density and then falls rapidly at the elbow, where galaxies begin to quench. P-bulges occupy the low-density end of the horizontal arm of the elbow and are universally star forming, while C-bulges occupy the elbow and the vertical branch and exhibit a wide range of star formation rates at a fixed density. The non-linear relation between central density and star formation rate has been seen before, but this mapping on to bulge class is new. The wide range of star formation rates in C-bulges helps to explain why bulge classifications using different parameters have sometimes disagreed in the past. The elbow-shaped relation between density and stellar indices suggests that central structure and stellar populations evolve at different rates as galaxies begin to quench.

The Chemical Enrichment and Mass Assembly Histories of SDSS Galaxies

Proceedings of the International Astronomical Union ◽

10.1017/s1743921306006727 ◽

2006 ◽

Vol 2 (S235) ◽

pp. 307-307

Author(s):

R. Cid Fernandes ◽

N. V. Asari ◽

J. P. Torres-Papaqui ◽

W. Schoenell ◽

L. Sodré ◽

...

Keyword(s):

Galaxy Evolution ◽

Synthesis Method ◽

Sloan Digital Sky Survey ◽

Population Synthesis ◽

Star Forming ◽

Chemical Enrichment ◽

Sky Survey ◽

Wide Range ◽

Mass Assembly ◽

Evolution Systems

ABSTRACT We explore the mass-assembly and chemical enrichment histories of star forming galaxies by applying a population synthesis method to a sample of nearly 70k galaxies culled from over 500k galaxies from the Sloan Digital Sky Survey Data Release 5. Our method decomposes the entire observed spectrum in terms of a sum of simple stellar populations spanning a wide range of ages and metallicities, thus allowing the reconstruction of galaxy histories. A comparative study of galaxy evolution is presented, where galaxies are grouped into bins of nebular abundance or mass. We find that galaxies whose warm interstellar medium is poor in heavy elements are slow in forming stars. Their stellar metallicities also rise slowly with time, reaching their current values (Z⋆ ~ 1/4 Z⊙) in the last ~100 Myr of evolution. Systems with metal rich nebulae, on the other hand, assembled most of their mass and completed their chemical evolution long ago, reaching Z⋆ ~ Z⊙ already at lookback times of a few Gyr. These same trends, which are ultimately a consequence of galaxy downsizing, appear when galaxies are grouped according to their stellar mass. The reconstruction of galaxy histories to this level of detail out of integrated spectra offers promising prospects in the field of galaxy evolution theories.
