Demystifying Big Data in the Cloud

Author(s):  
Gebeyehu Belay Gebremeskel ◽  
Yi Chai ◽  
Zhongshi He

Big data in the cloud is an emerging paradigm for processing, storing, and distributing huge, federated data through deployed web applications. Scalability, elasticity, pay-per-use pricing, and advances in ICT that support large, dynamic applications and their performance are the major reasons for the success and widespread adoption of big data cloud infrastructures. Enterprise data in the cloud is 'no secret', which poses challenges for privacy and security. In this chapter, the authors discuss in depth and introduce novel approaches and methodologies that make the big data phenomenon and its technology easier to understand, with a focus on the privacy and security of data and web resources. In a nutshell, big data has powerful potential to predict cloud risks and thereby to support the development and deployment of corporate security strategies. The chapter's overall contribution is a meaningful insight into big data in the cloud and its applications, a pressing issue for today's businesses that seek to make proactive, knowledge-driven decisions.

2016 ◽  
pp. 2001-2031
Author(s):  
Gebeyehu Belay Gebremeskel ◽  
Yi Chai ◽  
Zhongshi He



Author(s):  
David Haynes ◽  
Philip Mitchell ◽  
Eric Shook

Technologies around the world produce and interact with geospatial data instantaneously, from mobile web applications to satellite imagery that is collected and processed across the globe daily. Big raster data allow researchers to integrate and uncover new knowledge about geospatial patterns and processes. However, we are also at a critical moment, as we have an ever-growing number of big data platforms that are being co-opted to support spatial analysis. A gap in the literature is the lack of a robust framework to assess the capabilities of geospatial analysis on big data platforms. This research begins to address this issue by establishing a geospatial benchmark that employs freely accessible datasets to provide a comprehensive comparison across big data platforms. The benchmark is critical for evaluating the performance of spatial operations on big data platforms. It provides a common framework to compare existing platforms as well as evaluate new platforms. The benchmark is applied to three big data platforms and reports computing times and performance bottlenecks so that GIScientists can make informed choices regarding the performance of each platform. Each platform is evaluated for five raster operations (pixel count, reclassification, raster add, focal averaging, and zonal statistics) using three different datasets.


Author(s):  
Yu-Che Chen ◽  
Tsui-Chuan Hsieh

“Big data” is one of the emerging and critical issues facing government in the digital age. This study first delineates the defining features of big data (volume, velocity, and variety) and proposes a big data typology that is suitable for the public sector. This study then examines the opportunities of big data in generating business analytics to promote better utilization of information and communication technology (ICT) resources and improved personalization of e-government services. Moreover, it discusses the big data management challenges in building an appropriate governance structure, integrating diverse data sources, managing digital privacy and security risks, and acquiring big data talent and tools. An effective big data management strategy to address these challenges should develop a stakeholder-focused and performance-oriented governance structure, build capacity for data management and business analytics, and leverage and prioritize big data assets for performance. In addition, this study illustrates the opportunities, challenges, and strategy for big service data in government with the E-housekeeper program in Taiwan. This brief case study offers insight into the implementation of big data for improving government information and services. This article concludes with the main findings and topics of future research in big data for public administration.


2021 ◽  
Vol 11 (15) ◽  
pp. 7033
Author(s):  
Oscar Ceballos ◽  
Carlos Alberto Ramírez Restrepo ◽  
María Constanza Pabón ◽  
Andres M. Castillo ◽  
Oscar Corcho

Existing SPARQL query engines and triple stores are continuously improved to handle more massive datasets. Several approaches have been developed in this context proposing the storage and querying of RDF data in a distributed fashion, mainly using the MapReduce programming model and Hadoop-based ecosystems. New trends in Big Data technologies have also emerged (e.g., Apache Spark, Apache Flink); they use distributed in-memory processing and promise to deliver higher data processing performance. In this paper, we present a formal interpretation of some PACT transformations implemented in the Apache Flink DataSet API. We use this formalization to provide a mapping to translate a SPARQL query to a Flink program. The mapping was implemented in a prototype used to determine the correctness and performance of the solution. The source code of the project is available on GitHub under the MIT license.
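As a rough illustration of the general idea of evaluating a SPARQL basic graph pattern with dataset-style transformations (filter, map, join), the following is a minimal sketch in plain Python. It does not use the Flink API and is not the mapping defined in the paper; the function names, the sample triples, and the nested-loop join are all assumptions made for illustration only.

```python
def is_var(term):
    """SPARQL variables are written with a leading '?'."""
    return term.startswith("?")


def match_pattern(triples, pattern):
    """Filter + map step: keep triples that agree with the pattern's
    constants and emit one variable binding (a dict) per match."""
    solutions = []
    for triple in triples:
        binding = {}
        matched = True
        for term, value in zip(pattern, triple):
            if is_var(term):
                # A variable repeated inside one pattern must bind consistently.
                if binding.get(term, value) != value:
                    matched = False
                    break
                binding[term] = value
            elif term != value:
                matched = False
                break
        if matched:
            solutions.append(binding)
    return solutions


def join(left, right):
    """Join step: merge bindings that agree on every shared variable.
    A naive nested loop, chosen for clarity rather than performance."""
    joined = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined


def evaluate_bgp(triples, patterns):
    """Evaluate a basic graph pattern as a chain of match + join steps."""
    solutions = [{}]  # one empty binding is the identity for join
    for pattern in patterns:
        solutions = join(solutions, match_pattern(triples, pattern))
    return solutions


# Hypothetical sample data (not from the paper).
triples = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:bob", "ex:name", '"Bob"'),
]

# Roughly: SELECT * WHERE { ?x ex:knows ?y . ?y ex:name ?n }
patterns = [("?x", "ex:knows", "?y"), ("?y", "ex:name", "?n")]
result = evaluate_bgp(triples, patterns)
```

In a distributed engine the per-pattern matching and the joins would run as parallel operators over partitioned data; the sketch only shows the logical shape of such a translation.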


2020 ◽  
Vol 9 (11) ◽  
pp. 690
Author(s):  
David Haynes ◽  
Philip Mitchell ◽  
Eric Shook

Technologies around the world produce and interact with geospatial data instantaneously, from mobile web applications to satellite imagery that is collected and processed across the globe daily. Big raster data allow researchers to integrate and uncover new knowledge about geospatial patterns and processes. However, we are at a critical moment, as we have an ever-growing number of big data platforms that are being co-opted to support spatial analysis. A gap in the literature is the lack of a robust assessment comparing the efficiency of raster data analysis on big data platforms. This research begins to address this issue by establishing a raster data benchmark that employs freely accessible datasets to provide a comprehensive performance evaluation and comparison of raster operations on big data platforms. The benchmark is critical for evaluating the performance of spatial operations on big data platforms. The benchmarking datasets and operations are applied to three big data platforms. We report computing times and performance bottlenecks so that GIScientists can make informed choices regarding the performance of each platform. Each platform is evaluated for five raster operations (pixel count, reclassification, raster add, focal averaging, and zonal statistics) using three different raster datasets.
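The abstract names the five raster operations without defining them. The sketch below shows one common reading of each on a tiny in-memory grid, in plain Python. The function names, the edge handling in the focal window, and the choice of the mean as the zonal statistic are assumptions for illustration, not the benchmark's actual implementation.

```python
from collections import defaultdict


def pixel_count(raster, value):
    """Count the cells whose value equals `value`."""
    return sum(cell == value for row in raster for cell in row)


def reclassify(raster, mapping, default=0):
    """Replace each cell by its class in `mapping`; unmapped values get `default`."""
    return [[mapping.get(cell, default) for cell in row] for row in raster]


def raster_add(a, b):
    """Cell-by-cell sum of two rasters of identical shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def focal_mean(raster, radius=1):
    """Mean over a square window around each cell, clipped at the raster edges."""
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                raster[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def zonal_mean(values, zones):
    """Mean of `values` within each zone id of `zones` (same shape as `values`)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            totals[zone] += value
            counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in totals}


# Hypothetical 3x3 sample rasters.
values = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
zones = [[0, 0, 1],
         [0, 1, 1],
         [2, 2, 2]]
```

The first three operations touch each cell independently, focal averaging needs neighbouring cells, and zonal statistics aggregate across the whole grid, so the five operations exercise different data access patterns.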


2015 ◽  
pp. 1394-1407
Author(s):  
Yu-Che Chen ◽  
Tsui-Chuan Hsieh



2019 ◽  
Vol 12 (1) ◽  
pp. 42
Author(s):  
Andrey I. Vlasov ◽  
Konstantin A. Muraviev ◽  
Alexandra A. Prudius ◽  
Demid A. Uzenkov
