Big Data in Astronomy: Surveys, Catalogs, Databases and Archives

Author(s):  
A. M. Mickaelian

We present the current situation in astronomy, where Big Data coming from the Universe pose new challenges for cataloguing, storing, archiving, analysing and using scientific information. The two major characteristics of modern astronomy are multiwavelength (MW) studies (from γ-ray to radio, extended by multi-messenger studies that also use neutrinos, gravitational waves, etc.) and Big Data (including data acquisition, storage and analysis). Present astronomical databases and archives contain billions of objects, both Galactic and extragalactic, observed at various wavelengths, and the vast amount of data on them enables new studies and discoveries. Astronomers deal with big numbers. Surveys are the main source for the discovery of astronomical objects and the accumulation of observational data for further analysis, interpretation, and scientific results. We review the main characteristics of astronomical surveys, compare the photographic and digital eras of astronomical studies (including the development of wide-field observations), give the present state of MW surveys, and discuss Big Data in astronomy and the related topics of Virtual Observatories and Computational Astrophysics. The review includes many numbers and data that can be compared to give an overall understanding of the studied Universe, cosmic numbers, and their relationship to modern computational capabilities.

Author(s):  
Jamie Farnes ◽  
Ben Mort ◽  
Fred Dulwich ◽  
Stef Salvini ◽  
Wes Armour

The Square Kilometre Array (SKA) will be both the largest radio telescope ever constructed and the largest Big Data project in the known Universe. The first phase of the project will generate on the order of 5 zettabytes of data per year. A critical task for the SKA will be its ability to process data for science, which will need to be conducted by science pipelines. Together with polarization data from the LOFAR Multifrequency Snapshot Sky Survey (MSSS), we have been developing a realistic SKA-like science pipeline that can handle the large data volumes generated by LOFAR at 150 MHz. The pipeline uses task-based parallelism to image, detect sources, and perform Faraday Tomography across the entire LOFAR sky. The project thereby provides a unique opportunity to contribute to the technological development of the SKA telescope, while simultaneously enabling cutting-edge scientific results. In this paper, we provide an update on current efforts to develop a science pipeline that can enable tight constraints on the magnetised large-scale structure of the Universe.
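To put the quoted figure of roughly 5 zettabytes per year in perspective, the short sketch below converts it into a mean sustained data rate. It assumes an SI zettabyte (10^21 bytes) and a Julian year of 365.25 days; neither convention is stated in the abstract, and the function name is purely illustrative.

```python
# Assumptions (not stated in the abstract): SI zettabyte, Julian year.
ZETTABYTE = 10**21                      # bytes
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # 31,557,600 s


def mean_rate_bytes_per_second(zettabytes_per_year: float) -> float:
    """Average data rate implied by a yearly data volume."""
    return zettabytes_per_year * ZETTABYTE / SECONDS_PER_YEAR


rate = mean_rate_bytes_per_second(5)
print(f"{rate / 1e12:.0f} TB/s")  # about 158 TB/s under these assumptions
```

Under these assumptions the yearly volume corresponds to an average of about 1.6 × 10^14 bytes per second, which illustrates why processing has to happen in automated science pipelines.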


2019 ◽  
Vol 15 (S367) ◽  
pp. 214-217
Author(s):  
A. M. Mickaelian ◽  
G. A. Mikayelyan

We review Big Data in Astronomy and its role in Astronomy Education. At present, all-sky and large-area astronomical surveys and their catalogued data span the whole range of the electromagnetic spectrum, from gamma-ray to radio, and include the most important surveys providing optical images, proper motions, variability and spectroscopic data. The most important astronomical databases and archives are presented as well. They are powerful sources for efficient, many-sided research in the Virtual Observatory (VO) environment. It is shown that the use and analysis of Big Data accumulated in astronomy lead to many new discoveries. Using these data gives a significant advantage for Astronomy Education because of their attractiveness and the great interest of the young generation in computer science and technologies. Computer science itself benefits from data coming from the Universe, and a new interdisciplinary science, Astroinformatics, has been created to manage these data.


2016 ◽  
Vol 25 (1) ◽  
Author(s):  
Areg M. Mickaelian

Recent all-sky and large-area astronomical surveys and their catalogued data over the whole range of the electromagnetic spectrum, from γ-rays to radio waves, are reviewed, including Fermi-GLAST and INTEGRAL in γ-ray, ROSAT, XMM and Chandra in X-ray, GALEX in UV, SDSS and several POSS I and POSS II-based catalogues (APM, MAPS, USNO, GSC) in the optical range, 2MASS in NIR, WISE and AKARI IRC in MIR, IRAS and AKARI FIS in FIR, NVSS and FIRST in radio range, and many others, as well as the most important surveys giving optical images (DSS I and II, SDSS, etc.), proper motions (Tycho, USNO, Gaia), variability (GCVS, NSVS, ASAS, Catalina, Pan-STARRS), and spectroscopic data (FBS, SBS, Case, HQS, HES, SDSS, CALIFA, GAMA). An overall understanding of the coverage along the whole wavelength range and comparisons between various surveys are given: galaxy redshift surveys, QSO/AGN, radio, Galactic structure, and Dark Energy surveys. Astronomy has entered the Big Data era, with Astrophysical Virtual Observatories and Computational Astrophysics playing an important role in using and analyzing big data for new discoveries.


Galaxies ◽  
2018 ◽  
Vol 6 (4) ◽  
pp. 120 ◽  
Author(s):  
Jamie Farnes ◽  
Ben Mort ◽  
Fred Dulwich ◽  
Stef Salvini ◽  
Wes Armour

The Square Kilometre Array (SKA) will be both the largest radio telescope ever constructed and the largest Big Data project in the known Universe. The first phase of the project will generate on the order of five zettabytes of data per year. A critical task for the SKA will be its ability to process data for science, which will need to be conducted by science pipelines. Together with polarization data from the LOFAR Multifrequency Snapshot Sky Survey (MSSS), we have been developing a realistic SKA-like science pipeline that can handle the large data volumes generated by LOFAR at 150 MHz. The pipeline uses task-based parallelism to image, detect sources and perform Faraday tomography across the entire LOFAR sky. The project thereby provides a unique opportunity to contribute to the technological development of the SKA telescope, while simultaneously enabling cutting-edge scientific results. In this paper, we provide an update on current efforts to develop a science pipeline that can enable tight constraints on the magnetised large-scale structure of the Universe.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
David S. Shiffman ◽  
Catherine C. Macdonald ◽  
S. Scott Wallace ◽  
Nicholas K. Dulvy

Many species of sharks are threatened with extinction, and there has been a longstanding debate in scientific and environmental circles over the most effective and appropriate strategy to conserve and protect them. Should we allow for sustainable fisheries exploitation of species which can withstand fishing pressure, or ban all fisheries for sharks and trade in shark products? In the developing world, exploitation of fisheries resources can be essential to food security and poverty alleviation, and global management efforts are typically focused on sustainably maximizing economic benefits. This approach aligns with traditional fisheries management and the perspectives of most surveyed scientific researchers who study sharks. However, in Europe and North America, sharks are increasingly venerated as wildlife to be preserved irrespective of conservation status, resulting in growing pressure to prohibit exploitation of sharks and trade in shark products. To understand the causes and significance of this divergence in goals, we surveyed 155 shark-conservation-focused environmental advocates from 78 environmental non-profits, and asked three key questions: (1) Where do advocates get scientific information? (2) Does all policy-relevant scientific information reach advocates? and (3) Do advocates work towards the same policy goals identified by scientific researchers? Findings suggest many environmental advocates are aware of key scientific results and use science-based arguments in their advocacy, but a small but vocal subset of advocates report that they never read the scientific literature or speak to scientists. Engagement with science appears to be a key predictor of whether advocates support sustainable management of shark fisheries or bans on shark fishing and trade in shark products. Conservation is a normative discipline, and this analysis more clearly articulates two distinct perspectives in shark conservation.
Most advocates support the same evidence-based policies as academic and government scientists, while a smaller percentage are driven more by moral and ethical beliefs and may not find scientific research relevant or persuasive. We also find possible evidence that a small group of non-profits may be misrepresenting the state of the science while claiming to use science-based arguments, a concern that has been raised by surveyed scientists about the environmental community. This analysis suggests possible alternative avenues for engaging diverse stakeholders in productive discussions about shark conservation.


Author(s):  
Gianfranco Bertone

The spectacular advances of modern astronomy have opened our horizon on an unexpected cosmos: a dark, mysterious Universe, populated by enigmatic entities we know very little about, like black holes, or nothing at all, like dark matter and dark energy. In this book, I discuss how the rise of a new discipline dubbed multimessenger astronomy is bringing about a revolution in our understanding of the cosmos, by combining the traditional approach based on the observation of light from celestial objects with a new one based on other ‘messengers’ (such as gravitational waves, neutrinos, and cosmic rays) that carry information from otherwise inaccessible corners of the Universe. Much has been written about the extraordinary potential of this new discipline, since the 2017 Nobel Prize in physics was awarded for the direct detection of gravitational waves. But here I will take a different angle and explore how gravitational waves and other messengers might help us break the stalemate that has been plaguing fundamental physics for four decades, and consolidate the foundations of modern cosmology.


Publications ◽  
2018 ◽  
Vol 6 (3) ◽  
pp. 38 ◽  
Author(s):  
Jan Friesen ◽  
John Van Stan ◽  
Skander Elleuche

Scientists are trained to tell stories, scientific stories. Training is also needed to comprehend and contextualize these highly nuanced and technical stories because they are designed to explicitly convey scientific results, delineate their limitations, and describe a reproducible “plot” so that any thorough reenactment can achieve a similar conclusion. Although a carefully constructed scientific story may be crystal clear to other scientists in the same discipline, it is often inaccessible to broader audiences. This is problematic as scientists are increasingly expected to communicate their work to broader audiences that range from specialists in other disciplines to the general public. In fact, science communication is of increasing importance to acquire funding and generate effective outreach, as well as introduce, and sometimes even justify, research to society. This paper suggests a simple and flexible framework to translate a complex scientific publication into a broadly accessible comic format. Examples are given for embedding scientific details into an easy-to-understand storyline. A background story is developed and panels are generated that convey scientific information via plain language coupled with recurring comic elements to maximize comprehension and memorability. This methodology is an attempt to alleviate the inherent limitations of interdisciplinary and public comprehension that result from standard scientific publication and dissemination practices. We also hope that this methodology will help colleagues enter into the field of science comics.


2021 ◽  
Vol 182 (2) ◽  
pp. 111-179
Author(s):  
Zaineb Chelly Dagdia ◽  
Christine Zarges

In the context of big data, granular computing has recently been implemented using mathematical tools, especially Rough Set Theory (RST). As a key topic of rough set theory, feature selection has been investigated to adapt the related granular concepts of RST to deal with large amounts of data, leading to the development of the distributed RST version. However, despite its scalability, the distributed RST version faces a key challenge tied to the partitioning of the feature search space in the distributed environment while guaranteeing data dependency. Therefore, in this manuscript, we propose a new distributed RST version based on Locality Sensitive Hashing (LSH), named LSH-dRST, for big data feature selection. LSH-dRST uses LSH to match similar features into the same bucket and maps the generated buckets into partitions to enable the splitting of the universe in a more efficient way. More precisely, in this paper, we perform a detailed analysis of the performance of LSH-dRST by comparing it to the standard distributed RST version, which is based on a random partitioning of the universe. We demonstrate that our LSH-dRST is scalable when dealing with large amounts of data. We also demonstrate that LSH-dRST ensures the partitioning of the high-dimensional feature search space in a more reliable way, thereby better preserving data dependency in the distributed environment and ensuring a lower computational cost.
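The bucketing idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' LSH-dRST implementation: the abstract does not say which LSH family is used, so sign random projection (random hyperplanes) is assumed here, and the function names, parameters, and the round-robin assignment of buckets to partitions are all hypothetical.

```python
import random
from collections import defaultdict


def lsh_signature(vector, hyperplanes):
    """Sign-random-projection signature: one bit per random hyperplane."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def partition_features(features, n_bits=4, n_partitions=2, seed=0):
    """Hash feature columns into LSH buckets, then map buckets to partitions.

    features: dict mapping a feature name to its list of values (one per object).
    """
    rng = random.Random(seed)
    dim = len(next(iter(features.values())))
    hyperplanes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

    # Similar feature columns tend to receive the same signature (bucket).
    buckets = defaultdict(list)
    for name, column in features.items():
        buckets[lsh_signature(column, hyperplanes)].append(name)

    # Assumed policy: assign whole buckets round-robin, so a bucket is never split.
    partitions = [[] for _ in range(n_partitions)]
    for i, signature in enumerate(sorted(buckets)):
        partitions[i % n_partitions].extend(buckets[signature])
    return dict(buckets), partitions


features = {
    "f1": [1.0, 2.0, 3.0, 4.0],
    "f2": [1.0, 2.0, 3.0, 4.0],      # identical to f1, so it shares f1's bucket
    "f3": [-1.0, -2.0, -3.0, -4.0],  # points in the opposite direction to f1
    "f4": [4.0, -1.0, 0.5, -2.0],
}
buckets, partitions = partition_features(features)
print(partitions)
```

Because whole buckets are assigned to partitions, features that hash together always land on the same partition, in contrast to a purely random split of the feature space.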

