Capturing the Physics of MaNGA Galaxies with Self-supervised Machine Learning

2021 ◽  
Vol 921 (2) ◽  
pp. 177
Author(s):  
Regina Sarmiento ◽  
Marc Huertas-Company ◽  
Johan H. Knapen ◽  
Sebastián F. Sánchez ◽  
Helena Domínguez Sánchez ◽  
...  

Abstract As available data sets grow in size and complexity, advanced visualization tools enabling their exploration and analysis become more important. In modern astronomy, integral field spectroscopic galaxy surveys are a clear example of increasingly high-dimensional and complex data sets, which challenge the traditional methods used to extract the physical information they contain. We present the use of a novel self-supervised machine-learning method to visualize the multidimensional information on stellar populations and kinematics in the MaNGA survey in a 2D plane. Our framework is insensitive to nonphysical properties such as the size of the integral field unit and is therefore able to order galaxies according to their resolved physical properties. Using the extracted representations, we study how galaxies are distributed according to their resolved and global physical properties. We show that even when exclusively using information about the internal structure, galaxies naturally cluster into two well-known categories, rotating main-sequence disks and massive slow rotators, from a purely data-driven perspective, hence confirming distinct assembly channels. Low-mass rotation-dominated quenched galaxies appear as a third cluster only if information about the integrated physical properties is preserved, suggesting a mixture of assembly processes for these galaxies without any particular signature in their internal kinematics that distinguishes them from the two main groups. The framework for data exploration is publicly released with this publication, ready to be used with the MaNGA or other integral field data sets.
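The survey-specific pipeline is not reproduced here, but the core step the abstract describes — compressing each galaxy's high-dimensional resolved-property maps into a point on a 2D plane — can be illustrated with a minimal sketch. Plain PCA stands in for the paper's self-supervised network, and all "galaxies" below are synthetic:

```python
import numpy as np

# Toy stand-in for the paper's pipeline: each "galaxy" is a flattened map of
# resolved stellar-population/kinematic properties, and we project these
# high-dimensional vectors onto a 2D plane. The paper learns the
# representation with a self-supervised network; plain PCA is used here
# purely as a minimal, dependency-free illustration of the embedding step.

rng = np.random.default_rng(0)

# 200 synthetic galaxies, 64 resolved-property features each, drawn from two
# clusters (mimicking rotating disks vs. massive slow rotators).
disks = rng.normal(loc=0.0, scale=1.0, size=(100, 64))
slow_rotators = rng.normal(loc=3.0, scale=1.0, size=(100, 64))
X = np.vstack([disks, slow_rotators])

# PCA via SVD: centre the data, decompose, keep the two leading components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
embedding = Xc @ Vt[:2].T          # shape (200, 2): one point per galaxy

print(embedding.shape)             # (200, 2)
```

In this contrived setup the two populations separate cleanly along the first component, mirroring the data-driven clustering into two categories that the abstract reports.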

Author(s):  
Paul Rippon ◽  
Kerrie Mengersen

Learning algorithms are central to pattern recognition, artificial intelligence, machine learning, data mining, and statistical learning. The term often implies analysis of large and complex data sets with minimal human intervention. Bayesian learning has been variously described as a method of updating opinion based on new experience, updating parameters of a process model based on data, modelling and analysis of complex phenomena using multiple sources of information, posterior probabilistic expectation, and so on. In all of these guises, it has exploded in popularity over recent years.
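The "updating parameters of a process model based on data" reading of Bayesian learning has a textbook minimal case: a conjugate Beta-Binomial update, in which the prior opinion and the new experience combine in closed form. The numbers below are purely illustrative:

```python
from fractions import Fraction

# Bayesian learning as "updating opinion based on new experience", in its
# simplest conjugate form: a Beta(a, b) prior on a success probability is
# updated by binomial data to a Beta(a + successes, b + failures) posterior.

a, b = 1, 1                     # uniform prior: no initial opinion
successes, failures = 7, 3      # new experience: 10 observations

a_post, b_post = a + successes, b + failures
posterior_mean = Fraction(a_post, a_post + b_post)

print(posterior_mean)           # 2/3, the posterior probabilistic expectation
```

Each further batch of data repeats the same update on the current posterior, which is the "updating opinion" cycle the passage describes.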


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Patricio Lagos ◽  
Polychronis Papaderos

We review the results from our studies, and previously published work, on the spatially resolved physical properties of a sample of Hii/BCD galaxies, as obtained mainly from integral-field unit spectroscopy with Gemini/GMOS and VLT/VIMOS. We confirm that, within observational uncertainties, our sample galaxies show nearly spatially constant chemical abundances, similar to other low-mass starburst galaxies. They also show Heii λ4686 emission with properties suggestive of a mix of excitation sources, with Wolf-Rayet stars excluded as the primary ones. Finally, in this contribution, we include a list of all Hii/BCD galaxies studied thus far with integral-field unit spectroscopy.


2016 ◽  
Vol 35 (10) ◽  
pp. 906-909 ◽  
Author(s):  
Brendon Hall

There has been much excitement recently about big data and the dire need for data scientists who can extract meaning from it. Geoscientists, meanwhile, have been doing science with voluminous data for years, without needing to brag about how big it is. But now that large, complex data sets are widely available, there has been a proliferation of tools and techniques for analyzing them. Many free and open-source packages now exist that provide powerful additions to the geoscientist's toolbox, many of which used to be available only in proprietary (and expensive) software platforms.


2018 ◽  
Vol 7 (1.7) ◽  
pp. 201
Author(s):  
K Jayanthi ◽  
C Mahesh

Machine learning enables computers to help humans analyse knowledge from large, complex data sets. Genetic and genomic data are one such complex domain, requiring computers to analyse a wide range of functions automatically. Machine-learning methods promise to make these data more useful for tasks such as gene prediction, gene expression analysis, gene ontology annotation, gene finding, and gene editing. The purpose of this study is to explore machine-learning applications and algorithms for genetic and genomic data. We conclude with the following topics: the classification of machine-learning problems into supervised, unsupervised and semi-supervised settings; which type of method is suitable for various problems in genomics; applications of machine learning; and future views of machine learning in genomics.
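As a concrete, hedged illustration of the supervised setting the abstract lists, a toy classifier can turn DNA sequences into k-mer count features and assign a new sequence to the nearest class centroid; the sequences and labels below are invented:

```python
from collections import Counter

# Supervised learning on sequence data, reduced to its simplest form:
# featurize each DNA sequence as 2-mer counts, average the counts per class
# to form centroids, and classify new sequences by nearest centroid.

def kmer_counts(seq, k=2):
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

train = [("ATATATAT", "AT-rich"), ("ATATATTA", "AT-rich"),
         ("GCGCGCGC", "GC-rich"), ("GCGGCGCC", "GC-rich")]

# Class centroids: average k-mer counts per label.
centroids = {}
for label in {"AT-rich", "GC-rich"}:
    seqs = [s for s, lbl in train if lbl == label]
    total = Counter()
    for s in seqs:
        total.update(kmer_counts(s))
    centroids[label] = {km: v / len(seqs) for km, v in total.items()}

def classify(seq):
    counts = kmer_counts(seq)
    def dist(centroid):
        keys = set(counts) | set(centroid)
        return sum((counts.get(km, 0) - centroid.get(km, 0)) ** 2
                   for km in keys)
    return min(centroids, key=lambda lbl: dist(centroids[lbl]))

print(classify("ATATTATA"))   # AT-rich
```

The unsupervised variant of the same sketch would drop the labels and cluster the k-mer count vectors instead; semi-supervised methods sit in between, using a few labels plus many unlabelled sequences.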


2020 ◽  
Vol 18 (3) ◽  
pp. 507-527
Author(s):  
M. Ghorbani ◽  
S. Swift ◽  
S. J. E. Taylor ◽  
A. M. Payne

Abstract The generation of a feature matrix is the first step in conducting machine-learning analyses on complex data sets such as those containing DNA, RNA or protein sequences. These matrices contain information for each object, which has to be identified by running complex algorithms to interrogate the data. They are normally generated by combining the results of running such algorithms across various datasets from different and distributed data sources. Thus, for non-computing experts, the generation of such matrices proves a barrier to employing machine-learning techniques. Further, since datasets are becoming larger, this barrier is augmented by the limitations of the single personal computer most often used by investigators to carry out such analyses. Here we propose a user-friendly system to generate feature matrices in a way that is flexible, scalable and extendable. Additionally, by making use of the Berkeley Open Infrastructure for Network Computing (BOINC) software, the process can be sped up using the distributed volunteer computing possible in most institutions. The system makes use of the Grid and Cloud User Support Environment (gUSE), combined with the Web Services Parallel Grid Runtime and Developer Environment Portal (WS-PGRADE), to create workflow-based science gateways that allow users to submit work to the distributed computing resources. This report demonstrates the use of our proposed WS-PGRADE/gUSE BOINC system to identify features to populate matrices from very large DNA sequence data repositories; however, we propose that this system could be used to analyse a wide variety of feature sets, including image, numerical and text data.
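Independently of the distributed BOINC machinery the abstract describes, the feature matrix itself has a simple shape: one row per object, one column per feature. A minimal single-machine sketch, using invented toy sequences and 3-mer counts as the features:

```python
import numpy as np

# One row per object (here, a DNA sequence), one column per feature (here,
# the count of each distinct 3-mer seen anywhere in the data set). The
# distributed system in the abstract parallelizes this step over volunteers;
# the resulting matrix layout is the same either way.

sequences = {                    # hypothetical toy repository
    "seq1": "ATGCGATG",
    "seq2": "ATGATGAT",
    "seq3": "GGGCGCGA",
}

k = 3
# Fix a shared column order so every row lines up on the same features.
vocab = sorted({s[i:i + k] for s in sequences.values()
                for i in range(len(s) - k + 1)})
col = {kmer: j for j, kmer in enumerate(vocab)}

matrix = np.zeros((len(sequences), len(vocab)), dtype=int)
for row, s in enumerate(sequences.values()):
    for i in range(len(s) - k + 1):
        matrix[row, col[s[i:i + k]]] += 1

print(matrix.shape)   # (3, number of distinct 3-mers)
```

Image, numerical or text data would swap in a different per-object feature extractor, but the objects-by-features layout consumed by downstream machine-learning tools is unchanged.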


2020 ◽  
Vol 25 (5) ◽  
pp. 379-390 ◽  
Author(s):  
Adam J. Russak ◽  
Farhan Chaudhry ◽  
Jessica K. De Freitas ◽  
Garrett Baron ◽  
Fayzan F. Chaudhry ◽  
...  

Despite substantial advances in the study, treatment, and prevention of cardiovascular disease, numerous challenges remain in optimally screening, diagnosing, and managing patients. Simultaneous improvements in computing power, data storage, and data analytics have led to the development of new techniques to address these challenges. One powerful tool to this end is machine learning (ML), which aims to algorithmically identify and represent structure within data. Machine learning’s ability to efficiently analyze large and highly complex data sets makes it a desirable investigative approach in modern biomedical research. Despite this potential and enormous public and private sector investment, few prospective studies have demonstrated improved clinical outcomes from this technology. This is particularly true in cardiology, despite its emphasis on objective, data-driven results. This threatens to stifle ML’s growth and use in mainstream medicine. We outline the current state of ML in cardiology and describe methods through which impactful and sustainable ML research can occur. Following these steps can ensure ML reaches its potential as a transformative technology in medicine.


2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Omar Isaac Asensio ◽  
Ximin Mi ◽  
Sameer Dharur

For a growing class of prediction problems, big data and machine learning (ML) analyses can greatly enhance our understanding of the effectiveness of public investments and public policy. However, the outputs of many ML models are often abstract and inaccessible to policy communities or the general public. In this article, we describe a hands-on teaching case that is suitable for use in a graduate or advanced undergraduate public policy, public affairs, or environmental studies classroom. Students engage with increasingly popular ML classification algorithms and cloud-based data visualization tools to support policy and planning on the theme of electric vehicle mobility and connected infrastructure. By using these tools, students critically evaluate and convert large and complex data sets into human-understandable visualizations for communication and decision making. The tools also give users the flexibility to engage creatively with streaming data sources, even with little technical background.

