SpArcFiRe: Enhancing Spiral Galaxy Recognition Using Arm Analysis and Random Forests

Galaxies ◽  
2018 ◽  
Vol 6 (3) ◽  
pp. 95 ◽  
Author(s):  
Pedro Silva ◽  
Leon Cao ◽  
Wayne Hayes

Automated quantification of galaxy morphology is necessary because the size of upcoming sky surveys will overwhelm human volunteers. Existing classification schemes are inadequate because (a) their uncertainty increases near the boundary of classes and astronomers need more control over these uncertainties; (b) galaxy morphology is continuous rather than discrete; and (c) sometimes we need to know not only the type of an object, but whether a particular image of the object exhibits visible structure. We propose that regression is better suited to these tasks than classification, and focus specifically on determining the extent to which an image of a spiral galaxy exhibits visible spiral structure. We use the human vote distributions from Galaxy Zoo 1 (GZ1) to train a random forest of decision trees to reproduce the fraction of GZ1 humans who vote for the “Spiral” class. We prefer the random forest model over other black box models like neural networks because it allows us to trace post hoc the precise reasoning behind the regression of each image. Finally, we demonstrate that using features from SpArcFiRe—a code designed to isolate and quantify arm structure in spiral galaxies—improves regression results over and above using traditional features alone, across a sample of 470,000 galaxies from the Sloan Digital Sky Survey.
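The regression setup described in this abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: it swaps a full random forest for bagged one-split regression trees (stumps) with random feature subsets, and the feature names (`arm_strength`, `concentration`), the synthetic data, and all function names are hypothetical. The target plays the role of the fraction of GZ1 "Spiral" votes, and `explain` shows how each tree's rule can be read back after the fact.

```python
import random

def _mean(values):
    return sum(values) / len(values)

def fit_stump(rows, targets, feature_ids):
    """Pick the one (feature, threshold) split with the lowest squared error."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if not left or not right:
                continue  # a split must send data to both sides
            ml, mr = _mean(left), _mean(right)
            sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, ml, mr)
    if best is None:  # all feature values identical: predict a constant
        m = _mean(targets)
        return (feature_ids[0], float("inf"), m, m)
    return best[1:]  # (feature, threshold, left mean, right mean)

def fit_forest(rows, targets, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_feat = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        feats = rng.sample(range(n_feat), max(1, n_feat // 2))
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Average the per-tree predictions, as a regression forest does."""
    return _mean([ml if row[f] <= t else mr for f, t, ml, mr in forest])

def explain(forest, row, names):
    """List the rule each tree applied to this row (post hoc trace)."""
    lines = []
    for f, t, ml, mr in forest:
        op, value = ("<=", ml) if row[f] <= t else (">", mr)
        lines.append(f"{names[f]} {op} {t:.3f} -> {value:.3f}")
    return lines

# Hypothetical toy data: the vote fraction simply equals the first feature.
NAMES = ["arm_strength", "concentration"]
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(100)]
y = [row[0] for row in X]

forest = fit_forest(X, y)
print(predict(forest, [0.9, 0.5]), predict(forest, [0.1, 0.5]))
print(explain(forest, [0.9, 0.5], NAMES)[:3])
```

Because the output is a continuous fraction rather than a class label, a threshold on it can be chosen afterwards, which is the kind of control over uncertainty that the abstract argues for.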

Author(s):  
Pedro Silva ◽  
Leon T. Cao ◽  
Wayne B. Hayes

Automated machine classifications of galaxies are necessary because the size of upcoming surveys will overwhelm human volunteers. We improve upon existing machine classification methods by adding the output of SpArcFiRe to the inputs of a machine learning model. We use the human classifications from Galaxy Zoo 1 (GZ1) to train a random forest of decision trees to reproduce the human vote distributions of the Spiral class. We prefer the random forest model over other black box models like neural networks because it allows us to trace post hoc the precise reasoning behind the classification of each galaxy. We find that, across a sample of 470,000 Sloan galaxies that are large enough that details could be seen if they were there, the combination of SpArcFiRe outputs with existing SDSS features provides a better machine classification than either one alone when compared with Galaxy Zoo 1. We suggest that adding SpArcFiRe outputs as features to any machine learning algorithm will likely improve its performance.


2018 ◽  
Vol 15 (3) ◽  
pp. 314-323
Author(s):  
Baghdad Science Journal

Two galaxies, the spiral galaxy NGC 5005 and the elliptical galaxy NGC 4278, were chosen for a study of their photometric properties using surface photometry in the griz filters. The observations were obtained from the Sloan Digital Sky Survey (SDSS). Data reduction of all images, such as bias and flat-field correction, was performed by the SDSS pipeline. The overall structure of the two galaxies (a bulge and a disk) was studied; isophotal contour maps, surface brightness profiles and a bulge/disk decomposition of the galaxy images were produced; and the disk position angle, ellipticity and inclination of the galaxies were estimated.


2018 ◽  
Vol 611 ◽  
pp. A97 ◽  
Author(s):  
J. Pasquet-Itam ◽  
J. Pasquet

We have applied a convolutional neural network (CNN) to classify and detect quasars in the Sloan Digital Sky Survey Stripe 82 and also to predict the photometric redshifts of quasars. The network takes the variability of objects into account by converting light curves into images. The width of the images, noted w, corresponds to the five magnitudes ugriz and the height of the images, noted h, represents the date of the observation. The CNN provides good results: its precision is 0.988 for a recall of 0.90, compared to a precision of 0.985 for the same recall with a random forest classifier. Moreover, 175 new quasar candidates are found with the CNN at a fixed recall of 0.97. Combining the probabilities given by the CNN and the random forest improves performance further, with a precision of 0.99 for a recall of 0.90. For the redshift predictions, the CNN gives results that are better than those obtained with a feature extraction step and different classifiers (a K-nearest-neighbors, a support vector machine, a random forest and a Gaussian process classifier). Indeed, the accuracy of the CNN reaches 78.09% within |Δz| < 0.1, 86.15% within |Δz| < 0.2, and 91.2% within |Δz| < 0.3, and the root mean square (rms) is 0.359. The KNN performs worse in all three |Δz| regions: its accuracy within |Δz| < 0.1, |Δz| < 0.2, and |Δz| < 0.3 is 73.72%, 82.46%, and 90.09% respectively, and its rms is 0.395. The CNN therefore reduces the dispersion and the catastrophic redshifts of quasars. This new method is very promising for future large databases such as that of the Large Synoptic Survey Telescope.
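The accuracy figures quoted in this abstract are fractions of objects whose redshift error falls under a threshold, together with an rms. A minimal sketch of how such numbers can be computed follows, assuming Δz is the plain difference between predicted and reference redshift (the abstract does not spell out the definition). The function name and the toy values are hypothetical and do not reproduce the paper's data.

```python
import math

def redshift_metrics(z_true, z_pred, thresholds=(0.1, 0.2, 0.3)):
    """Return ({threshold: fraction with |dz| < threshold}, rms of dz).

    Assumption: dz = z_pred - z_true, with no (1 + z) normalisation.
    """
    dz = [p - t for t, p in zip(z_true, z_pred)]
    n = len(dz)
    within = {th: sum(1 for d in dz if abs(d) < th) / n for th in thresholds}
    rms = math.sqrt(sum(d * d for d in dz) / n)
    return within, rms

# Toy values for illustration only.
z_true = [1.0, 2.0, 3.0, 4.0]
z_pred = [1.05, 2.15, 3.25, 4.5]
within, rms = redshift_metrics(z_true, z_pred)
print(within, rms)
```

The last toy object, off by 0.5, stands in for a catastrophic redshift: it lowers every threshold fraction and dominates the rms.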


2003 ◽  
Vol 588 (1) ◽  
pp. 218-229 ◽  
Author(s):  
Roberto G. Abraham ◽  
Sidney van den Bergh ◽  
Preethi Nair

2020 ◽  
Vol 498 (2) ◽  
pp. 1951-1962
Author(s):  
Michele Fumagalli ◽  
Sotiria Fotopoulou ◽  
Laura Thomson

ABSTRACT We present a pipeline based on a random forest classifier for the identification of high column density clouds of neutral hydrogen (i.e. the Lyman limit systems, LLSs) in absorption within large spectroscopic surveys of z ≳ 3 quasars. We test the performance of this method on mock quasar spectra that reproduce the expected data quality of the Dark Energy Spectroscopic Instrument and the WHT (William Herschel Telescope) Enhanced Area Velocity Explorer surveys, finding ≳90 per cent completeness and purity for $N_{\rm H\,I} \gtrsim 10^{17.2}~{\rm cm^{-2}}$ LLSs against quasars of g < 23 mag at z ≈ 3.5–3.7. After training and applying our method on 10 000 quasar spectra at z ≈ 3.5–4.0 from the Sloan Digital Sky Survey (Data Release 16), we identify ≈6600 LLSs with $N_{\rm H\,I} \gtrsim 10^{17.5}~{\rm cm^{-2}}$ between z ≈ 3.1 and 4.0 with a completeness and purity of ≳90 per cent for the classification of LLSs. Using this sample, we measure a number of LLSs per unit redshift of ℓ(z) = 2.32 ± 0.08 at z = [3.3, 3.6]. We also present results on the performance of random forest for the measurement of the LLS redshifts and H I column densities, and for the identification of broad absorption line quasars.
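Completeness and purity, as quoted in this abstract for mock spectra where the truth is known, correspond to the usual recall and precision of a classifier. A small sketch of the bookkeeping follows; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
def completeness_purity(is_real, is_flagged):
    """Completeness = TP / (TP + FN); purity = TP / (TP + FP).

    is_real:    True where a mock spectrum really contains an LLS.
    is_flagged: True where the classifier reported an LLS.
    """
    pairs = list(zip(is_real, is_flagged))
    tp = sum(1 for real, flag in pairs if real and flag)
    fn = sum(1 for real, flag in pairs if real and not flag)
    fp = sum(1 for real, flag in pairs if flag and not real)
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    return completeness, purity

# Toy labels: 4 real systems, 3 of them recovered, plus 2 false detections.
is_real = [True, True, True, True, False, False, False, False]
is_flagged = [True, True, True, False, True, True, False, False]
print(completeness_purity(is_real, is_flagged))
```

Reporting both numbers matters because a classifier can trivially reach full completeness by flagging every spectrum, at the cost of purity.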


Author(s):  
Pooja Thakkar

Abstract: This study focuses on drug classification using machine learning models, and on interpretability using LIME and SHAP to gain a thorough understanding of those models. The researchers used random forest, decision tree, and logistic regression models to classify drugs. Then, using LIME and SHAP, they examined whether these models were interpretable, which allowed them to better understand the results. The paper concludes that LIME and SHAP can be used to gain insight into a machine learning model and to determine which attribute is responsible for differences in the outcomes. According to the LIME and SHAP results, the random forest and decision tree models are the best models for drug classification, with Na to K and BP being the most significant features. Keywords: Machine Learning, Black-box models, LIME, SHAP, Decision Tree


2021 ◽  
Vol 296 ◽  
pp. 103471
Author(s):  
Roberto Confalonieri ◽  
Tillman Weyde ◽  
Tarek R. Besold ◽  
Fermín Moscoso del Prado Martín
