Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

2016 ◽  
Vol 12 (S325) ◽  
pp. 145-155
Author(s):  
Fionn Murtagh

Abstract This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data are to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or ‘photo-z’ problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
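The binning idea in this abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes that bins are formed by truncating a single non-negative measurement to a fixed number of decimal digits, so that coarser precision gives a coarser level of the hierarchy, and that the prediction for a query is the mean target of the finest bin it shares with any training point. The names `bin_key` and `clusterwise_predict` are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_DOWN
from statistics import mean


def bin_key(x, digits):
    """Truncate (not round) x to `digits` decimal places; the result labels a bin."""
    step = Decimal(10) ** -digits          # e.g. digits=3 -> 0.001
    return Decimal(str(x)).quantize(step, rounding=ROUND_DOWN)


def clusterwise_predict(train_x, train_y, queries, max_digits=3):
    """For each query, return (prediction, digits) from the finest shared bin.

    Assumption: one lookup table per precision level, searched from the
    finest level (max_digits) down to the coarsest (0 digits).
    """
    tables = []
    for digits in range(max_digits, -1, -1):
        table = defaultdict(list)
        for x, y in zip(train_x, train_y):
            table[bin_key(x, digits)].append(y)
        tables.append((digits, table))

    results = []
    for q in queries:
        for digits, table in tables:
            targets = table.get(bin_key(q, digits))
            if targets:
                # Cluster-wise regression: average the targets in the matched bin.
                results.append((mean(targets), digits))
                break
        else:
            results.append((None, -1))     # no bin shared at any precision level
    return results


if __name__ == "__main__":
    xs = [0.1234, 0.1239, 0.1501, 0.4002]  # illustrative inputs, not real data
    ys = [0.10, 0.12, 0.20, 0.50]
    print(clusterwise_predict(xs, ys, [0.1236, 0.158, 1.5]))
```

Each lookup is a dictionary access per precision level, which is one way to read the computational benefit the abstract mentions; ordinary nearest neighbour regression would instead compare the query against stored points.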

2019 ◽  
Vol 489 (4) ◽  
pp. 4802-4808 ◽  
Author(s):  
Kristen Menou

ABSTRACT Machine learning (ML) is one of two standard approaches (together with SED fitting) for estimating the redshifts of galaxies when only photometric information is available. ML photo-z solutions have traditionally ignored the morphological information available in galaxy images or partly included it in the form of hand-crafted features, with mixed results. We train a morphology-aware photometric redshift machine using modern deep learning tools. It uses a custom architecture that jointly trains on galaxy fluxes, colours, and images. Galaxy-integrated quantities are fed to a Multi-Layer Perceptron (MLP) branch, while images are fed to a convolutional (convnet) branch that can learn relevant morphological features. This split MLP-convnet architecture, which aims to disentangle strong photometric features from comparatively weak morphological ones, proves important for strong performance: a regular convnet-only architecture, while exposed to all available photometric information in images, delivers comparatively poor performance. We present a cross-validated MLP-convnet model trained on 130 000 SDSS-DR12 (Sloan Digital Sky Survey – Data Release 12) galaxies that outperforms a hyperoptimized Gradient Boosting solution (hyperopt+XGBoost), as well as the equivalent MLP-only architecture, on the redshift bias metric. The fourfold cross-validated MLP-convnet model achieves a bias δz/(1 + z) = −0.70 ± 1 × 10⁻³, approaching the performance of a reference ANNZ2 ensemble of 100 distinct models trained on a comparable data set. The relative performance of the morphology-aware and morphology-blind models indicates that galaxy morphology does improve ML-based photometric redshift estimation.


2018 ◽  
Vol 621 ◽  
pp. A26 ◽  
Author(s):  
Johanna Pasquet ◽  
E. Bertin ◽  
M. Treyer ◽  
S. Arnouts ◽  
D. Fouchez

We developed a deep convolutional neural network (CNN), used as a classifier, to estimate photometric redshifts and associated probability distribution functions (PDF) for galaxies in the Main Galaxy Sample of the Sloan Digital Sky Survey at z < 0.4. Our method exploits all the information present in the images without any feature extraction. The input data consist of 64 × 64 pixel ugriz images centered on the spectroscopic targets, plus the galactic reddening value on the line-of-sight. For training sets of 100k objects or more (≥20% of the database), we reach a dispersion σMAD < 0.01, significantly lower than the current best one obtained from another machine learning technique on the same sample. The bias is lower than 10⁻⁴, independent of photometric redshift. The PDFs are shown to have very good predictive power. We also find that the CNN redshifts are unbiased with respect to galaxy inclination, and that σMAD decreases with the signal-to-noise ratio (S/N), achieving values below 0.007 for S/N > 100, as in the deep stacked region of Stripe 82. We argue that for most galaxies the precision is limited by the S/N of SDSS images rather than by the method. The success of this experiment at low redshift opens promising perspectives for upcoming surveys.
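The abstract quotes a bias and a dispersion σMAD without defining them. The sketch below assumes the definitions commonly used in the photo-z literature: normalized residuals Δz = (z_phot − z_spec)/(1 + z_spec), bias as their mean, and σMAD as 1.4826 times the median absolute deviation of the residuals. The function name `photoz_metrics` is hypothetical, and the paper's exact conventions may differ.

```python
from statistics import mean, median


def photoz_metrics(z_phot, z_spec):
    """Return (bias, sigma_mad) under the assumed definitions stated above."""
    # Normalized residuals, one per galaxy.
    dz = [(p - s) / (1.0 + s) for p, s in zip(z_phot, z_spec)]
    bias = mean(dz)
    centre = median(dz)
    # 1.4826 scales the MAD to match the standard deviation of a Gaussian.
    sigma_mad = 1.4826 * median([abs(d - centre) for d in dz])
    return bias, sigma_mad


if __name__ == "__main__":
    # Illustrative numbers only, not survey data.
    print(photoz_metrics([0.01, 0.02, 0.03], [0.0, 0.0, 0.0]))
```

Because σMAD is built from medians, a handful of catastrophic redshift errors changes it far less than it would change an rms.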


2012 ◽  
Vol 747 (1) ◽  
pp. 59 ◽  
Author(s):  
Ribamar R. R. Reis ◽  
Marcelle Soares-Santos ◽  
James Annis ◽  
Scott Dodelson ◽  
Jiangang Hao ◽  
...  

2018 ◽  
Vol 611 ◽  
pp. A97 ◽  
Author(s):  
J. Pasquet-Itam ◽  
J. Pasquet

We have applied a convolutional neural network (CNN) to classify and detect quasars in the Sloan Digital Sky Survey Stripe 82 and also to predict the photometric redshifts of quasars. The network takes the variability of objects into account by converting light curves into images. The width of the images, noted w, corresponds to the five magnitudes ugriz and the height of the images, noted h, represents the date of the observation. The CNN provides good results since its precision is 0.988 for a recall of 0.90, compared to a precision of 0.985 for the same recall with a random forest classifier. Moreover, 175 new quasar candidates are found with the CNN considering a fixed recall of 0.97. Combining the probabilities given by the CNN and the random forest improves performance further, with a precision of 0.99 for a recall of 0.90. For the redshift predictions, the CNN gives excellent results, better than those obtained with a feature extraction step and different classifiers (a K-nearest-neighbors, a support vector machine, a random forest and a Gaussian process classifier). Indeed, the accuracy of the CNN reaches 78.09% within |Δz| < 0.1, 86.15% within |Δz| < 0.2, and 91.2% within |Δz| < 0.3, and the root mean square (rms) value is 0.359. The KNN performs worse in all three |Δz| ranges: its accuracy within |Δz| < 0.1, |Δz| < 0.2, and |Δz| < 0.3 is 73.72%, 82.46%, and 90.09% respectively, and its rms is 0.395. So the CNN successfully reduces the dispersion and the catastrophic redshifts of quasars. This new method is very promising for the future of big databases such as the Large Synoptic Survey Telescope.


2003 ◽  
Vol 595 (1) ◽  
pp. 59-70 ◽  
Author(s):  
Tamas Budavari ◽  
Andrew J. Connolly ◽  
Alexander S. Szalay ◽  
Istvan Szapudi ◽  
Istvan Csabai ◽  
...  

2020 ◽  
Vol 500 (2) ◽  
pp. 1557-1574
Author(s):  
Ivan K Baldry ◽  
Tricia Sullivan ◽  
Raffaele Rani ◽  
Sebastian Turner

ABSTRACT The size–mass galaxy distribution is a key diagnostic for galaxy evolution. Massive compact galaxies are potential surviving relics of a high-redshift phase of star formation. Some of these could be nearly unresolved in Sloan Digital Sky Survey (SDSS) imaging and thus not included in galaxy samples. To overcome this, a sample was selected from the combination of SDSS and UKIRT Infrared Deep Sky Survey (UKIDSS) photometry to r < 17.8. This was done using colour–colour selection, and then by obtaining accurate photometric redshifts (photo-z) using scaled flux matching (SFM). Compared to spectroscopic redshifts (spec-z), SFM obtained a 1σ scatter of 0.0125 with only 0.3 per cent outliers (|Δln (1 + z)| > 0.06). A sample of 163 186 galaxies was obtained with 0.04 < z < 0.15 over 2300 deg² using a combination of spec-z and photo-z. Following Barro et al. log Σ1.5 = log M* − 1.5 log r50,maj was used to define compactness. The spectroscopic completeness was 76 per cent for compact galaxies (log Σ1.5 > 10.5) compared to 92 per cent for normal-sized galaxies. This difference is primarily attributed to SDSS ‘fibre collisions’ and not the completeness of the main galaxy sample selection. Using environmental overdensities, this confirms that compact quiescent galaxies are significantly more likely to be found in high-density environments compared to normal-sized galaxies. By comparison with a high-redshift sample from 3D-HST, log Σ1.5 distribution functions show significant evolution, with this being a compelling way to compare with simulations such as EAGLE. The number density of compact quiescent galaxies drops by a factor of about 30 from z ∼ 2 to log (n/Mpc⁻³) = −5.3 ± 0.4 in the SDSS–UKIDSS sample. The uncertainty is dominated by the steep cut off in log Σ1.5, which is demonstrated conclusively using this complete sample.
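The compactness definition quoted in the abstract, log Σ1.5 = log M* − 1.5 log r50,maj with a cut at log Σ1.5 > 10.5, can be written out as a short sketch. The units are an assumption here (stellar mass in solar masses, major-axis half-light radius in kpc), since the abstract does not state them, and the function names are hypothetical.

```python
import math


def log_sigma_1p5(stellar_mass, r50_maj):
    """log Sigma_1.5 = log10(M*) - 1.5 * log10(r50,maj).

    Assumed units: stellar_mass in solar masses, r50_maj in kpc.
    """
    return math.log10(stellar_mass) - 1.5 * math.log10(r50_maj)


def is_compact(stellar_mass, r50_maj, threshold=10.5):
    """Apply the abstract's compactness cut, log Sigma_1.5 > 10.5."""
    return log_sigma_1p5(stellar_mass, r50_maj) > threshold


if __name__ == "__main__":
    # Illustrative values only: the same mass at two different sizes.
    for radius in (2.0, 4.0):
        print(radius, round(log_sigma_1p5(1e11, radius), 3), is_compact(1e11, radius))
```

At fixed mass, doubling the radius lowers log Σ1.5 by 1.5 log 2 ≈ 0.45, so a galaxy can cross the 10.5 cut through a change in size alone.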


2020 ◽  
Vol 493 (2) ◽  
pp. 1686-1707 ◽  
Author(s):  
Yifei Luo ◽  
S M Faber ◽  
Aldo Rodríguez-Puebla ◽  
Joanna Woo ◽  
Yicheng Guo ◽  
...  

ABSTRACT This paper studies pseudo-bulges (P-bulges) and classical bulges (C-bulges) in Sloan Digital Sky Survey (SDSS) central galaxies using the new bulge indicator ΔΣ1, which measures relative central stellar-mass surface density within 1 kpc. We compare ΔΣ1 to the established bulge-type indicator Δ〈μe〉 from Gadotti (2009) and show that classifying by ΔΣ1 agrees well with Δ〈μe〉. ΔΣ1 requires no bulge–disc decomposition and can be measured on SDSS images out to z = 0.07. Bulge types using it are mapped on to 20 different structural and stellar-population properties for 12 000 SDSS central galaxies with masses 10.0 < log M*/M⊙ < 10.4. New trends emerge from this large sample. Structural parameters show fairly linear log–log relations versus ΔΣ1 and Δ〈μe〉 with only moderate scatter, while stellar-population parameters show a highly non-linear ‘elbow’ in which specific star formation rate remains roughly flat with increasing central density and then falls rapidly at the elbow, where galaxies begin to quench. P-bulges occupy the low-density end of the horizontal arm of the elbow and are universally star forming, while C-bulges occupy the elbow and the vertical branch and exhibit a wide range of star formation rates at a fixed density. The non-linear relation between central density and star formation rate has been seen before, but this mapping on to bulge class is new. The wide range of star formation rates in C-bulges helps to explain why bulge classifications using different parameters have sometimes disagreed in the past. The elbow-shaped relation between density and stellar indices suggests that central structure and stellar populations evolve at different rates as galaxies begin to quench.


2006 ◽  
Vol 2 (S235) ◽  
pp. 307-307
Author(s):  
R. Cid Fernandes ◽  
N. V. Asari ◽  
J. P. Torres-Papaqui ◽  
W. Schoenell ◽  
L. Sodré ◽  
...  

Abstract We explore the mass-assembly and chemical enrichment histories of star forming galaxies by applying a population synthesis method to a sample of nearly 70k galaxies culled from over 500k galaxies from the Sloan Digital Sky Survey Data Release 5. Our method decomposes the entire observed spectrum in terms of a sum of simple stellar populations spanning a wide range of ages and metallicities, thus allowing the reconstruction of galaxy histories. A comparative study of galaxy evolution is presented, where galaxies are grouped into bins of nebular abundance or mass. We find that galaxies whose warm interstellar medium is poor in heavy elements are slow in forming stars. Their stellar metallicities also rise slowly with time, reaching their current values (Z⋆ ~ 1/4 Z⊙) in the last ~100 Myr of evolution. Systems with metal rich nebulae, on the other hand, assembled most of their mass and completed their chemical evolution long ago, reaching Z⋆ ~ Z⊙ already at lookback times of a few Gyr. These same trends, which are ultimately a consequence of galaxy downsizing, appear when galaxies are grouped according to their stellar mass. The reconstruction of galaxy histories to this level of detail out of integrated spectra offers promising prospects in the field of galaxy evolution theories.

