Performance Analysis of Main Public Cloud Big Data Services Processing Brazilian Government Data

Author(s):  
Leonardo Rebouças de Carvalho ◽  
Marcelo Augusto da Cruz Motta ◽  
Aleteia Patricia Favacho de Araújo
2019 ◽  
Vol 8 (2S11) ◽  
pp. 3606-3611

Big data privacy has grown in importance as cloud computing has succeeded in providing a remote platform for sharing computing resources without geographical or time restrictions. However, privacy concerns about big data outsourced to public cloud storage still exist. Various anonymization and sanitization techniques have been developed to protect big data from privacy attacks. In our prior work, we proposed a misusability-probability-based metric that estimates the probable percentage of misusability. We also designed a system, based on this misusability probability, that suggests a level of sanitization before privacy protection is actually applied to big data. In this paper, we further evaluate our misuse-probability-based approach to big data sanitization by defining an algorithm that analyses the trade-offs between misuse probability and level of sanitization. The paper describes the proposed framework and the misusability measure, and evaluates the framework with an empirical study. The empirical study was carried out in a public cloud environment with Amazon EC2 (compute engine), S3 (storage service) and EMR (MapReduce framework). The experimental results revealed the dynamics of the trade-offs between misuse probability and level of sanitization. These insights help in making well-informed decisions when sanitizing big data, so that it is protected without losing the required utility.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Hadi Masoumi ◽  
Bahar Farahani ◽  
Fereidoon Shams Aliee

Purpose: Open government data (OGD) has emerged as a radical paradigm shift and endeavor among government administrations across the world, mainly due to its promises of transparency, accountability, public-private collaboration, civic participation, social innovation and data-driven value creation. Complexity, the cross-cutting nature and diversity of data sets, and interoperability and quality issues usually hamper unlocking the full potential value of data. To tackle these challenges, this paper aims to provide a novel solution using a top-down approach.
Design/methodology/approach: In this paper, the authors propose a systematic ontology-based approach combined with a novel architecture and its corresponding processes, enabling organizations to carry out all the steps in the OGD value chain. In addition, an OGD platform including a portal (www.iranopendata.ir) and a data management system (www.ogdms.iranopendata.ir) is developed to showcase the proposed solution.
Findings: The efficiency and applicability of the solution are evaluated with a real-life use case on the energy consumption of buildings in the city of Tehran, Iran. Finally, a comparison was made with existing solutions, and the results show that the proposed approach is able to address the existing gaps in the literature.
Originality/value: The results imply that modeling and designing the data model, as well as exploiting an ontology-based approach, are critical pillars for creating rich, relevant and well-described OGD data sets. Moreover, clarity on processes, roles and responsibilities is a key factor influencing the quality of the published data services. To the best of the authors' knowledge, this is the first study that exploits an ontology-based approach in a top-down manner to create OGD data sets.


2019 ◽  
pp. 346-375
Author(s):  
Jens Kohler ◽  
Christian Richard Lorenz ◽  
Markus Gumbel ◽  
Thomas Specht ◽  
Kiril Simov

In recent years, cloud computing has drastically changed IT architectures in enterprises across various industries and countries. Dynamically scalable capabilities such as CPUs, storage space and virtual networks promise cost savings, as huge initial infrastructure investments are no longer required. This development shows that cloud computing is also a promising technology driver for Big Data, as unstructured data without a concrete, defined data schema (variety) can be stored and managed with emerging NoSQL architectures. However, in order to fully exploit these advantages, the integration of a trustworthy third-party public cloud provider is necessary. This raises challenging questions concerning security, compliance, anonymization and privacy that are still unsolved. To address these challenges, this work presents, implements and evaluates a security-by-distribution approach for NoSQL document stores that distributes data across various cloud providers such that every provider only gets a small data chunk, which is worthless without the others.

