Space-Time Analytics for Spatial Dynamics

Data Mining ◽  
2013 ◽  
pp. 2117-2131
Author(s):  
May Yuan ◽  
James Bothwell

The so-called Big Data Challenge poses issues not only with massive volumes of data, but also with the continuing data streams from multiple sources that monitor environmental processes or record social activities. Many statistical tools and data mining methods have been developed to reveal embedded patterns in large data sets. While patterns are critical to data analysis, deep insights will remain buried unless we develop means to associate spatiotemporal patterns with the dynamics of the spatial processes that drive the formation of those patterns in the data. This chapter reviews the literature on the conceptual foundations of space-time analytics for spatial processes, discusses the types of dynamics that have and have not been addressed in the literature, and identifies needs for new thinking that can systematically advance space-time analytics to reveal the dynamics of spatial processes. The discussion is supported by an example that highlights potential means of space-time analytics in response to the Big Data Challenge. The example shows the development of new space-time concepts and tools to analyze data from two common General Circulation Models for climate change predictions. Common approaches compare temperature changes at locations from the NCAR CCSM3 and from the CNRM CM3, or animate time series of temperature layers to visualize the climate prediction. Instead, the new space-time analytics methods shown here decipher the differences in the spatial dynamics of the predicted temperature change in the model outputs, and apply the concepts of change and movement to reveal warming, cooling, convergence, and divergence in temperature change across the globe.
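
As a rough illustration of the "change" concept only, the following minimal sketch labels each grid cell as warming, cooling, or stable by differencing two temperature layers. It is not the chapter's method: the function name `classify_change`, the tolerance parameter `tol`, and the tiny grids are all hypothetical, and convergence and divergence (which need a movement field) are not covered.

```python
def classify_change(before, after, tol=0.0):
    """Label each cell by the sign of its temperature change.

    `before` and `after` are equally shaped lists of rows (hypothetical
    temperature layers); `tol` is an assumed threshold below which a
    change is treated as "stable".
    """
    labels = []
    for row_before, row_after in zip(before, after):
        labels.append([
            "warming" if a - b > tol else "cooling" if b - a > tol else "stable"
            for b, a in zip(row_before, row_after)
        ])
    return labels


# Made-up 2x2 temperature layers at two time steps (degrees Celsius).
t0 = [[14.0, 15.0], [10.0, 9.0]]
t1 = [[15.5, 15.0], [9.0, 9.5]]
labels = classify_change(t0, t1)
print(labels)
```

Running the same classification on the outputs of two models would give two label grids that can be compared cell by cell, which is one simple way to contrast predicted dynamics rather than raw temperatures.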


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract: Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and of the uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous work. To overcome this problem, in the present work we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to compute nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.
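
The estimate-then-scale step can be sketched as follows. This is a minimal illustration, not the paper's DV-DVFS algorithm: it assumes a CPU-bound task whose processing time is inversely proportional to clock frequency, and the function name, the cycle count, and the list of frequency levels are hypothetical.

```python
from typing import Optional, Sequence


def lowest_frequency_for_deadline(
    cycles_needed: float,
    deadline_s: float,
    available_ghz: Sequence[float],
) -> Optional[float]:
    """Return the lowest frequency (GHz) whose estimated time meets the deadline.

    Assumption: time = cycles / frequency, i.e. the task is CPU-bound.
    Returns None when even the highest frequency misses the deadline.
    """
    for freq_ghz in sorted(available_ghz):
        estimated_time_s = cycles_needed / (freq_ghz * 1e9)
        if estimated_time_s <= deadline_s:
            return freq_ghz
    return None


# Hypothetical node with four frequency levels and a 2-second deadline.
levels_ghz = [1.2, 1.6, 2.0, 2.4]
chosen = lowest_frequency_for_deadline(3.0e9, 2.0, levels_ghz)
print(chosen)
```

Picking the lowest feasible level reflects the usual DVFS rationale that a lower voltage and frequency draws less power, so a block of data that needs less processing can run slower without missing its deadline.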


2020 ◽  
Vol 39 (10) ◽  
pp. 753-754
Author(s):  
Jiajia Sun ◽  
Daniele Colombo ◽  
Yaoguo Li ◽  
Jeffrey Shragge

Geophysicists seek to extract useful and potentially actionable information about the subsurface by interpreting various types of geophysical data together with prior geologic information. It is well recognized that reliable imaging, characterization, and monitoring of subsurface systems require integration of multiple sources of information from a multitude of geoscientific data sets. With increasing data volumes and computational power, new data types, constant development of inversion algorithms, and the advent of the big data era, Geophysics editors see multiphysics integration as an effective means of meeting some of the challenges arising from imaging subsurface systems with higher resolution and reliability as well as exploring geologically more complicated areas. To advance the field of multiphysics integration and to showcase its added value, Geophysics will introduce a new section “Multiphysics and Joint Inversion” in 2021. Submissions are accepted now.


2021 ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract: Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and of the uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous work. To overcome this problem, in the present work we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to compute nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.


2020 ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract: Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and of the uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous work. To overcome this problem, in the present work we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider two types of deadlines as our constraint. Before applying the DVFS technique to compute nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we use a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.


2020 ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Abstract: Data variety is one of the most important features of Big Data. Data variety is the result of aggregating data from multiple sources and of the uneven distribution of data. This feature of Big Data causes high variation in the consumption of processing resources such as CPU. In this paper, we use Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation. To this end, we consider a deadline as our constraint and, before applying the DVFS technique to compute nodes, we estimate the processing time and the frequency needed to meet the deadline. We use a set of data sets and applications in the evaluation phase. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets. Based on the experimental results in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.


Data volumes have been growing exponentially within many companies. When data at this scale are meta-tagged piecemeal, produced in real time, and arrive in continuous streams from multiple sources, analyzing them to spot patterns and extract useful information is harder still. The difficulties include the ever-changing landscape of data and their associated characteristics, evolving data analysis paradigms, challenges of computational infrastructure, data quality, complexity, and protection, as well as data sharing and access, and, crucially, our ability to integrate data sets and their analysis toward an improved understanding. In this context, this second chapter covers the issues and challenges hiding behind the 3Vs phenomenon. It builds on the first chapter and proceeds to different big data issues and challenges and how to tackle them in dynamic processes.


2014 ◽  
Author(s):  
Pankaj K. Agarwal ◽  
Thomas Moelhave
Keyword(s):  
Big Data ◽  

2020 ◽  
Vol 13 (4) ◽  
pp. 790-797
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh Dhiman

Background: In the current internet landscape, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner. It is argued that the Hadoop framework is not mature enough to deal with current cyberattacks on the data. Objective: The main objective of the proposed work is to provide a complete security approach comprising authorisation and authentication for the user and the Hadoop cluster nodes, and to secure the data at rest as well as in transit. Methods: The proposed algorithm uses the Kerberos network authentication protocol for authorisation and authentication and to validate the users and the cluster nodes. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is used for data at rest and data in transit. A user encrypts a file with their own set of attributes and stores it on the Hadoop Distributed File System. Only intended users with matching parameters can decrypt that file. Results: The proposed algorithm was implemented with data sets of different sizes. The data was processed with and without encryption. The results show little difference in processing time. Performance was affected in the range of 0.8% to 3.1%, which also includes the impact of other factors such as system configuration, the number of parallel jobs running, and the virtual environment. Conclusion: The solutions available for handling the big data security problems faced in the Hadoop framework are inefficient or incomplete. A complete security framework is proposed for the Hadoop environment. The solution is experimentally shown to have little effect on the performance of the system for datasets of different sizes.

