The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch

2010 ◽  
Vol 2010 ◽  
pp. 1-16 ◽  
Author(s):  
Meyer Z. Pesenson ◽  
Isaac Z. Pesenson ◽  
Bruce McCollum

Recent and forthcoming advances in instrumentation, and giant new surveys, are creating astronomical data sets that are not amenable to the methods of analysis familiar to astronomers. Traditional methods are often inadequate not merely because of the size in bytes of the data sets, but also because of the complexity of modern data sets. Mathematical limitations of familiar algorithms and techniques in dealing with such data sets create a critical need for new paradigms for the representation, analysis and scientific visualization (as opposed to illustrative visualization) of heterogeneous, multiresolution data across application domains. Some of the problems presented by the new data sets have been addressed by other disciplines such as applied mathematics, statistics and machine learning and have been utilized by other sciences such as space-based geosciences. Unfortunately, valuable results pertaining to these problems are mostly to be found in publications outside of astronomy. Here we offer brief overviews of a number of concepts, techniques and developments that are vital to the analysis and visualization of complex datasets and images. One of the goals of this paper is to help bridge the gap between applied mathematics and artificial intelligence on the one side and astronomy on the other.

2016 ◽  
Vol 38 (4) ◽  
pp. B521-B538 ◽  
Author(s):  
Paris Perdikaris ◽  
Daniele Venturi ◽  
George Em Karniadakis

Author(s):  
DIANXUN SHUAI ◽  
XUE FANGLIANG

Data clustering is widely used in areas such as data mining, statistics and machine learning. A variety of clustering approaches have been proposed, but most cannot quickly cluster a large-scale, high-dimensional database. This paper presents a novel data clustering approach based on a generalized particle model (GPM). The GPM transforms the data clustering process into a stochastic process over the configuration space on a GPM array. The proposed approach is self-organizing and offers several advantages: insensitivity to noise, clustering quality that is robust to the data being clustered, suitability for high-dimensional and massive data sets, learning ability, openness and easier hardware implementation with VLSI systolic technology. Analysis and simulations show the effectiveness and good performance of the proposed GPM approach to data clustering.


Author(s):  
A Salman Avestimehr ◽  
Seyed Mohammadreza Mousavi Kalan ◽  
Mahdi Soltanolkotabi

Dealing with the sheer size and complexity of today’s massive data sets requires computational platforms that can analyze data in a parallelized and distributed fashion. A major bottleneck that arises in such modern distributed computing environments is that some of the worker nodes may run slowly. These nodes, known as stragglers, can significantly slow down computation because the slowest node may dictate the overall computational time. A recent computational framework, called encoded optimization, creates redundancy in the data to mitigate the effect of stragglers. In this paper, we develop a novel mathematical understanding of this framework, demonstrating its effectiveness in much broader settings than was previously understood. We also analyze the convergence behavior of iterative encoded optimization algorithms, allowing us to characterize fundamental trade-offs between convergence rate, size of data set, accuracy, computational load (or data redundancy) and straggler toleration in this framework.
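The idea of adding redundancy to the data so that a slow worker can simply be ignored can be illustrated with a small sketch. It is not the authors' construction: the cyclic replication code, the one-dimensional least-squares model, the one-shot (non-iterative) solve, the toy data and the function names `fit_slope`, `encode` and `fit_from_workers` are all assumptions chosen for brevity. Each data point is stored on two workers and scaled by 1/sqrt(2), so the encoded problem has the same solution as the original when every worker responds, and a nearby solution when one worker is dropped.

```python
import math

def fit_slope(rows):
    """Least-squares slope w for y ~ w * x (no intercept) from (x, y) pairs."""
    num = sum(x * y for x, y in rows)
    den = sum(x * x for x, _ in rows)
    return num / den

def encode(data):
    """Hypothetical cyclic replication code with one worker per data point.

    Worker j stores points j and j+1 (mod n), each scaled by 1/sqrt(2),
    so every point is held by exactly two workers with total weight 1.
    """
    scale = 1.0 / math.sqrt(2.0)
    n = len(data)
    shards = []
    for j in range(n):
        shard = [(scale * data[i][0], scale * data[i][1])
                 for i in (j, (j + 1) % n)]
        shards.append(shard)
    return shards

def fit_from_workers(shards, responding):
    """Solve the encoded problem using only the workers that responded."""
    rows = [row for j in responding for row in shards[j]]
    return fit_slope(rows)

# Toy data: y is roughly 2 * x with small noise (illustrative values only).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.05), (4.0, 7.95)]
shards = encode(data)

w_plain = fit_slope(data)                              # uncoded solution
w_all = fit_from_workers(shards, range(len(shards)))   # all workers respond
w_drop = fit_from_workers(shards, [0, 1, 2])           # worker 3 is a straggler

print(w_plain, w_all, w_drop)
```

With all workers present the encoded and uncoded slopes agree up to rounding. Dropping a worker halves the weight of the two points it held instead of losing them entirely, which is the price paid in extra storage and computation for straggler tolerance.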


2021 ◽  
Vol 48 (4) ◽  
pp. 307-328
Author(s):  
Dominic Farace ◽  
Hélène Prost ◽  
Antonella Zane ◽  
Birger Hjørland ◽  
◽  
...  

This article presents and discusses different kinds of data documents, including data sets, data studies, data papers and data journals. It provides descriptive and bibliometric data on different kinds of data documents and discusses the theoretical and philosophical problems raised by classifying documents according to the DIKW model (data documents, information documents, knowledge documents and wisdom documents). Data documents are, on the one hand, an established category today, even with its own data citation index (DCI). On the other hand, data documents have blurred boundaries in relation to other kinds of documents and seem sometimes to be understood from the problematic philosophical assumption that a datum can be understood as “a single, fixed truth, valid for everyone, everywhere, at all times”.


JOGED ◽  
2017 ◽  
Vol 7 (2) ◽  
Author(s):  
Dewi Sinta Fajawati

The Moon is the source of inspiration for this dance work. Scientifically, the Moon is a celestial body known as a satellite, the Earth's only natural one. Many theories describe how the Moon was formed, one of which is the Big Bang theory. In essence, the Moon is simply a large spherical body that emits no light of its own; the light we see at night is reflected sunlight. Its beauty is nonetheless undeniable, because it is the brightest object in the dark night sky. Its light is not always bright and its shape is not always round: sometimes only half of it is visible, and sometimes it appears as a crescent.
The choreographer uses the Moon, set in a very high place, as a metaphor for an aspiration to be reached. Children's songs that form part of the choreographer's auditory experience often treat the Moon as an object to be grasped, for example the song “Ambilkan Bulan Bu”. The core idea explored in the choreography, however, is the phases of the Moon. Starting from the visual stimulus of observing the Moon, the choreographer interprets its phases as the phases of life that a person passes through in pursuit of that aspiration. Falling and recovering, being passionate and sometimes giving up are all read from the moonlight. The full Moon, with its brightest and most complete light, is likened to a strong spirit; the crescent Moon, with its soft light, is interpreted as low spirits and a lack of confidence.
The choreography is a group work with two character types: eight female dancers symbolize the Moon, and one female dancer represents a person with an aspiration. Presented as a dramatic dance, it is divided into five scenes: the introduction, Big Bang; Scene 1, Moon Happen; Scene 2, Mengejar Impian (Chasing Dream); Scene 3, Dancing with Moon; and the ending, “Catch Your Dream”.


2010 ◽  
Vol 23 ◽  
pp. 113-117
Author(s):  
A. Orphanou ◽  
K. Nicolaides ◽  
D. Charalambous ◽  
P. Lingis ◽  
S. C. Michaelides

In the present study, the monthly statistical characteristics of the jetlet and the tropopause in relation to the development of thunderstorms over Cyprus are examined. For this purpose, the 12:00 UTC radiosonde data obtained from the Athalassa station (33.4° E, 35.1° N) for an 11-year period, from 1997 to 2007, were employed. On the basis of this dataset, the height and the temperature of the tropopause, as well as the height, wind direction and speed of the jetlet, were estimated. Additionally, the days in this period with observed thunderstorms were selected and the aforementioned characteristics of the jetlet and tropopause were noted. The two data sets were subsequently contrasted in an attempt to identify possible relations between thunderstorm development, on the one hand, and tropopause and jetlet characteristics, on the other.


Author(s):  
Sylvia L. Osborn

With the widespread use of online systems, there is an increasing focus on maintaining the privacy of individuals and information about them. This is often referred to as a need for privacy protection. The author briefly examines definitions of privacy in this context, roughly delineating between keeping facts private and statistical privacy that deals with what can be inferred from data sets. Many of the mechanisms used to implement what is commonly thought of as access control are the same ones used to protect privacy. This chapter explores when this is not the case and, in general, the interplay between privacy and access control on the one hand and, on the other hand, the separation of these models from mechanisms for their implementation.


2022 ◽  
pp. 41-67
Author(s):  
Vo Ngoc Phu ◽  
Vo Thi Ngoc Tran

Machine learning (ML), neural networks (NNs), evolutionary algorithms (EAs) and fuzzy systems (FSs), along with computer science more broadly, have been prominent and influential for many years. They have been applied in many different areas and have contributed to the development of many large-scale corporations and organizations. These corporations and organizations in turn generate large amounts of information and massive data sets (MDSs). Such big data sets (BDSs) pose challenges for many commercial applications and research efforts. Many algorithms from ML, NNs, EAs, FSs and computer science have therefore been developed to handle these massive data sets. To support this effort, the authors present NN algorithms for large-scale data sets (LSDSs) in this chapter. Finally, they present a novel NN model for BDSs in a sequential environment (SE) and a distributed network environment (DNE).

