Angular Clustering with Photometric Redshifts in the Sloan Digital Sky Survey: Bimodality in the Clustering Properties of Galaxies

Abstract: This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data are to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or ‘photo-z’ problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
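The idea of binning values by measurement precision and regressing within the resulting clusters can be sketched as follows. This is a simplified illustration and not the authors' algorithm: the function names `digit_key`, `fit`, and `predict`, the decimal truncation used as the hierarchy, and the toy numbers are all assumptions. Points that share more leading digits sit deeper in the same branch of the tree, and a prediction is the mean target of the deepest branch that contains training data.

```python
from collections import defaultdict
import math


def digit_key(values, digits):
    """Truncate each value to `digits` decimal places to form a bin label."""
    scale = 10 ** digits
    # round() guards against artefacts such as 0.29 * 100 == 28.999999999999996
    return tuple(math.floor(round(v * scale, 9)) for v in values)


def fit(features, targets, max_digits):
    """Build one table (bin label -> list of targets) per precision level."""
    levels = []
    for digits in range(max_digits + 1):
        table = defaultdict(list)
        for x, y in zip(features, targets):
            table[digit_key(x, digits)].append(y)
        levels.append(table)
    return levels


def predict(levels, x):
    """Average the targets in the finest bin that holds training data.

    Returns None when no bin matches, even at the coarsest level.
    """
    for digits in range(len(levels) - 1, -1, -1):
        members = levels[digits].get(digit_key(x, digits))
        if members:
            return sum(members) / len(members)
    return None


# Toy data for illustration only: one colour-like feature and a redshift target.
train_x = [(1.23,), (1.27,), (1.61,)]
train_z = [0.10, 0.12, 0.30]
levels = fit(train_x, train_z, max_digits=2)

print(predict(levels, (1.25,)))  # shares the bin "1.2" with two training points
print(predict(levels, (2.50,)))  # no shared bin at any level
```

With a single training point per finest bin this reduces to a form of nearest-neighbour lookup; coarser bins pool more points, which is the cluster-wise generalization the abstract refers to.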

Deep learning approach for classifying, detecting and predicting photometric redshifts of quasars in the Sloan Digital Sky Survey stripe 82

Astronomy and Astrophysics ◽

10.1051/0004-6361/201731106 ◽

2018 ◽

Vol 611 ◽

pp. A97 ◽

Cited By ~ 11

Author(s):

J. Pasquet-Itam ◽

J. Pasquet

Keyword(s):

Random Forest ◽

Random Forest Classifier ◽

Sloan Digital Sky Survey ◽

Light Curves ◽

Support Vector ◽

Learning Approach ◽

Photometric Redshifts ◽

K Nearest Neighbors ◽

Sky Survey ◽

Extraction Step

We have applied a convolutional neural network (CNN) to classify and detect quasars in the Sloan Digital Sky Survey Stripe 82 and also to predict the photometric redshifts of quasars. The network takes the variability of objects into account by converting light curves into images. The width of the images, noted w, corresponds to the five magnitudes ugriz and the height of the images, noted h, represents the date of the observation. The CNN performs well: its precision is 0.988 for a recall of 0.90, compared to a precision of 0.985 for the same recall with a random forest classifier. Moreover, 175 new quasar candidates are found with the CNN at a fixed recall of 0.97. Combining the probabilities given by the CNN and the random forest improves performance further, with a precision of 0.99 for a recall of 0.90. For the redshift predictions, the CNN gives better results than those obtained with a feature extraction step and different classifiers (K-nearest neighbors, a support vector machine, a random forest, and a Gaussian process classifier). The accuracy of the CNN reaches 78.09% within |Δz| < 0.1, 86.15% within |Δz| < 0.2, and 91.2% within |Δz| < 0.3, with a root mean square (rms) of 0.359. The KNN performs worse in all three |Δz| regions, with accuracies of 73.72%, 82.46%, and 90.09% within |Δz| < 0.1, |Δz| < 0.2, and |Δz| < 0.3 respectively, and an rms of 0.395. The CNN thus reduces the dispersion and the catastrophic redshifts of quasars. This new method is promising for future large databases such as that of the Large Synoptic Survey Telescope.
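The accuracy figures quoted above can be reproduced in form (not in value) with a short sketch. It assumes that Δz is the plain difference between predicted and spectroscopic redshift and that the rms is taken over those differences; the abstract does not spell out either definition. The function names and the sample redshifts are hypothetical.

```python
import math


def fraction_within(z_pred, z_true, threshold):
    """Fraction of objects with |z_pred - z_true| < threshold."""
    hits = sum(1 for p, t in zip(z_pred, z_true) if abs(p - t) < threshold)
    return hits / len(z_true)


def rms_error(z_pred, z_true):
    """Root mean square of the residuals z_pred - z_true."""
    squared = [(p - t) ** 2 for p, t in zip(z_pred, z_true)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up redshifts for illustration only (not data from the paper).
z_spec = [0.50, 1.20, 2.00, 0.80]
z_phot = [0.55, 1.45, 2.05, 1.60]

for threshold in (0.1, 0.2, 0.3):
    print(threshold, fraction_within(z_phot, z_spec, threshold))
print("rms", rms_error(z_phot, z_spec))
```

The last object in the toy sample is a catastrophic failure: it barely changes the threshold fractions but dominates the rms, which is why the two kinds of metric are usually reported together.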

Morpho-photometric redshifts

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/stz2477 ◽

2019 ◽

Vol 489 (4) ◽

pp. 4802-4808 ◽

Cited By ~ 2

Author(s):

Kristen Menou

Keyword(s):

Poor Performance ◽

Sloan Digital Sky Survey ◽

Gradient Boosting ◽

Learning Tools ◽

Multi Layer Perceptron ◽

Comparable Data ◽

Photometric Redshifts ◽

Data Set ◽

Photometric Redshift ◽

Sky Survey

ABSTRACT Machine learning (ML) is one of two standard approaches (together with SED fitting) for estimating the redshifts of galaxies when only photometric information is available. ML photo-z solutions have traditionally ignored the morphological information available in galaxy images or partly included it in the form of hand-crafted features, with mixed results. We train a morphology-aware photometric redshift machine using modern deep learning tools. It uses a custom architecture that jointly trains on galaxy fluxes, colours, and images. Galaxy-integrated quantities are fed to a Multi-Layer Perceptron (MLP) branch, while images are fed to a convolutional (convnet) branch that can learn relevant morphological features. This split MLP-convnet architecture, which aims to disentangle strong photometric features from comparatively weak morphological ones, proves important for strong performance: a regular convnet-only architecture, while exposed to all available photometric information in images, delivers comparatively poor performance. We present a cross-validated MLP-convnet model trained on 130 000 SDSS-DR12 (Sloan Digital Sky Survey – Data Release 12) galaxies that outperforms a hyperoptimized Gradient Boosting solution (hyperopt+XGBoost), as well as the equivalent MLP-only architecture, on the redshift bias metric. The fourfold cross-validated MLP-convnet model achieves a bias δz/(1 + z) = −0.70 ± 1 × 10⁻³, approaching the performance of a reference ANNZ2 ensemble of 100 distinct models trained on a comparable data set. The relative performance of the morphology-aware and morphology-blind models indicates that galaxy morphology does improve ML-based photometric redshift estimation.
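The bias metric δz/(1 + z) can be illustrated with a minimal sketch. It assumes that the bias is the mean of (z_pred − z_true)/(1 + z_true) over a held-out fold and that the quoted uncertainty reflects the scatter across the four folds; the abstract states neither explicitly. The function names and the numbers are hypothetical.

```python
import statistics


def normalized_bias(z_pred, z_true):
    """Mean of (z_pred - z_true) / (1 + z_true) over one held-out fold."""
    residuals = [(p - t) / (1.0 + t) for p, t in zip(z_pred, z_true)]
    return statistics.fmean(residuals)


def summarize_folds(fold_biases):
    """Mean and sample standard deviation of the per-fold biases."""
    return statistics.fmean(fold_biases), statistics.stdev(fold_biases)


# Illustrative values only: one tiny fold, then four made-up per-fold biases.
print(normalized_bias([0.1, 1.2], [0.0, 1.0]))
print(summarize_folds([-0.001, 0.0, 0.001, 0.0]))
```

Dividing by 1 + z keeps the residuals of high- and low-redshift galaxies on a comparable scale, which is why the metric is quoted in that form.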

Compact galaxies and the size–mass galaxy distribution from a colour-selected sample at 0.04 < z < 0.15 supplemented by ugrizYJHK photometric redshifts

Monthly Notices of the Royal Astronomical Society ◽

10.1093/mnras/staa3327 ◽

2020 ◽

Vol 500 (2) ◽

pp. 1557-1574

Author(s):

Ivan K Baldry ◽

Tricia Sullivan ◽

Raffaele Rani ◽

Sebastian Turner

Keyword(s):

Galaxy Evolution ◽

High Redshift ◽

Sample Selection ◽

Distribution Functions ◽

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Galaxy Distribution ◽

Sky Survey ◽

Galaxy Sample ◽

Compact Galaxies

ABSTRACT The size–mass galaxy distribution is a key diagnostic for galaxy evolution. Massive compact galaxies are potential surviving relics of a high-redshift phase of star formation. Some of these could be nearly unresolved in Sloan Digital Sky Survey (SDSS) imaging and thus not included in galaxy samples. To overcome this, a sample was selected from the combination of SDSS and UKIRT Infrared Deep Sky Survey (UKIDSS) photometry to r < 17.8. This was done using colour–colour selection, and then by obtaining accurate photometric redshifts (photo-z) using scaled flux matching (SFM). Compared to spectroscopic redshifts (spec-z), SFM obtained a 1σ scatter of 0.0125 with only 0.3 per cent outliers (|Δ ln(1 + z)| > 0.06). A sample of 163 186 galaxies was obtained with 0.04 < z < 0.15 over 2300 deg² using a combination of spec-z and photo-z. Following Barro et al., log Σ1.5 = log M* − 1.5 log r50,maj was used to define compactness. The spectroscopic completeness was 76 per cent for compact galaxies (log Σ1.5 > 10.5) compared to 92 per cent for normal-sized galaxies. This difference is primarily attributed to SDSS ‘fibre collisions’ and not the completeness of the main galaxy sample selection. Using environmental overdensities, this confirms that compact quiescent galaxies are significantly more likely to be found in high-density environments compared to normal-sized galaxies. By comparison with a high-redshift sample from 3D-HST, log Σ1.5 distribution functions show significant evolution, with this being a compelling way to compare with simulations such as EAGLE. The number density of compact quiescent galaxies drops by a factor of about 30 from z ∼ 2 to log(n/Mpc⁻³) = −5.3 ± 0.4 in the SDSS–UKIDSS sample. The uncertainty is dominated by the steep cut off in log Σ1.5, which is demonstrated conclusively using this complete sample.
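The compactness definition and the outlier criterion quoted above translate directly into a few lines of code. The sketch below assumes base-10 logarithms for the compactness, stellar mass in solar masses, and the semi-major-axis half-light radius in kpc; the units are not stated in the abstract, and the function names and example values are hypothetical.

```python
import math

COMPACT_THRESHOLD = 10.5   # log Σ1.5 cut for compact galaxies, from the abstract
OUTLIER_THRESHOLD = 0.06   # |Δ ln(1 + z)| cut for photo-z outliers, from the abstract


def log_sigma_1p5(stellar_mass, r50_maj):
    """log Σ1.5 = log M* − 1.5 log r50,maj (assumed units: solar masses, kpc)."""
    return math.log10(stellar_mass) - 1.5 * math.log10(r50_maj)


def is_compact(stellar_mass, r50_maj):
    """True when the galaxy lies above the compactness threshold."""
    return log_sigma_1p5(stellar_mass, r50_maj) > COMPACT_THRESHOLD


def is_outlier(z_phot, z_spec):
    """True when |ln(1 + z_phot) − ln(1 + z_spec)| exceeds the outlier cut."""
    return abs(math.log1p(z_phot) - math.log1p(z_spec)) > OUTLIER_THRESHOLD


# Illustrative values only: the same stellar mass at two different sizes.
print(log_sigma_1p5(1e11, 1.0), is_compact(1e11, 1.0))
print(log_sigma_1p5(1e11, 10.0), is_compact(1e11, 10.0))
print(is_outlier(0.12, 0.10), is_outlier(0.20, 0.10))
```

Because the radius enters with a power of 1.5, a tenfold increase in size at fixed mass lowers log Σ1.5 by 1.5 dex, enough to move a galaxy well below the compact cut.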

Empirical photometric redshifts of luminous red galaxies and clusters in the Sloan Digital Sky Survey

Monthly Notices of the Royal Astronomical Society ◽

10.1111/j.1365-2966.2007.12203.x ◽

2007 ◽

Vol 380 (4) ◽

pp. 1608-1620 ◽

Cited By ~ 25

Author(s):

P. A. A. Lopes

Keyword(s):

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Luminous Red Galaxies ◽

Sky Survey ◽

Red Galaxies

Photometric redshifts from SDSS images using a convolutional neural network

Astronomy and Astrophysics ◽

10.1051/0004-6361/201833617 ◽

2018 ◽

Vol 621 ◽

pp. A26 ◽

Cited By ~ 29

Author(s):

Johanna Pasquet ◽

E. Bertin ◽

M. Treyer ◽

S. Arnouts ◽

D. Fouchez

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Signal To Noise Ratio ◽

Distribution Functions ◽

Sloan Digital Sky Survey ◽

Photometric Redshifts ◽

Photometric Redshift ◽

Probability Distribution Functions ◽

Sky Survey ◽

Galaxy Sample

We developed a deep convolutional neural network (CNN), used as a classifier, to estimate photometric redshifts and associated probability distribution functions (PDF) for galaxies in the Main Galaxy Sample of the Sloan Digital Sky Survey at z < 0.4. Our method exploits all the information present in the images without any feature extraction. The input data consist of 64 × 64 pixel ugriz images centered on the spectroscopic targets, plus the galactic reddening value on the line-of-sight. For training sets of 100k objects or more (≥20% of the database), we reach a dispersion σ_MAD < 0.01, significantly lower than the current best one obtained from another machine learning technique on the same sample. The bias is lower than 10⁻⁴, independent of photometric redshift. The PDFs are shown to have very good predictive power. We also find that the CNN redshifts are unbiased with respect to galaxy inclination, and that σ_MAD decreases with the signal-to-noise ratio (S/N), achieving values below 0.007 for S/N > 100, as in the deep stacked region of Stripe 82. We argue that for most galaxies the precision is limited by the S/N of SDSS images rather than by the method. The success of this experiment at low redshift opens promising perspectives for upcoming surveys.
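A short sketch can make the MAD-based dispersion and the classifier-style PDF concrete. It assumes the common convention in which the dispersion is 1.4826 times the median absolute deviation of the normalized residuals (z_pred − z_true)/(1 + z_true), and it assumes the point estimate is the probability-weighted mean over redshift bins; the abstract confirms neither detail. The function names and the numbers are hypothetical.

```python
import statistics


def normalized_residuals(z_pred, z_true):
    """Residuals (z_pred - z_true) / (1 + z_true); the normalization is an assumption."""
    return [(p - t) / (1.0 + t) for p, t in zip(z_pred, z_true)]


def sigma_mad(z_pred, z_true):
    """1.4826 x median absolute deviation of the normalized residuals."""
    dz = normalized_residuals(z_pred, z_true)
    centre = statistics.median(dz)
    return 1.4826 * statistics.median([abs(d - centre) for d in dz])


def pdf_mean(bin_centres, probabilities):
    """Probability-weighted mean redshift from per-bin classifier outputs."""
    total = sum(probabilities)
    return sum(c * p for c, p in zip(bin_centres, probabilities)) / total


# Illustrative values only (not data from the paper).
print(sigma_mad([-0.02, -0.01, 0.0, 0.01, 0.02], [0.0] * 5))
print(pdf_mean([0.1, 0.2, 0.3], [0.25, 0.5, 0.25]))
```

The factor 1.4826 scales the median absolute deviation so that it matches the standard deviation for Gaussian residuals, while remaining insensitive to a small number of catastrophic outliers.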
