Trust, but Verify: Optimistic Visualizations of Approximate Queries for Exploring Big Data

Mapping Intimacies ◽

10.31219/osf.io/tfwqj ◽

2018 ◽

Cited By ~ 2

Author(s):

Dominik Moritz ◽

Danyel Fisher ◽

Bolin Ding ◽

Chi Wang

Keyword(s):

Big Data ◽

Case Studies ◽

User Experience ◽

Laboratory Study ◽

Exploratory Analysis ◽

Data Systems ◽

Preliminary Results ◽

Visualization Systems ◽

Big Data Systems ◽

The Cost

Analysts need interactive speed for exploratory analysis, but big data systems are often slow. With sampling, data systems can produce approximate answers fast enough for exploratory visualization, at the cost of accuracy and trust. We propose optimistic visualization, which approaches these issues from a user experience perspective. This method lets analysts explore approximate results interactively, and provides a way to detect and recover from errors later. Pangloss implements these ideas. We discuss design issues raised by optimistic visualization systems. We test this concept with five expert visualizers in a laboratory study and three case studies at Microsoft. Analysts reported that they felt more confident in their results, and used optimistic visualization to check that their preliminary results were correct.

“Image As Big Data” Systems: Some Case Studies

Pro Hadoop Data Analytics ◽

10.1007/978-1-4842-1910-2_14 ◽

2016 ◽

pp. 235-255

Author(s):

Kerry Koitzsch

Keyword(s):

Big Data ◽

Case Studies ◽

Data Systems ◽

Big Data Systems

The Heterogeneity Paradigm in Big Data Architectures

Advances in Data Mining and Database Management - Managing and Processing Big Data in Cloud Computing ◽

10.4018/978-1-4666-9767-6.ch015 ◽

2016 ◽

pp. 218-245 ◽

Cited By ~ 1

Author(s):

Todor Ivanov ◽

Sead Izberovic ◽

Nikolaos Korfiatis

Keyword(s):

Big Data ◽

Extended Model ◽

Data Systems ◽

Cost Factor ◽

Management Platform ◽

Architectural Patterns ◽

Hadoop Ecosystem ◽

Big Data Systems ◽

The Cost

This chapter introduces heterogeneity as a perspective on the architecture of big data systems targeted at both vertical and generic workloads, and discusses how it can be linked with the existing Hadoop ecosystem (as of 2015). The cost factor of a big data solution and its characteristics can influence the solution's architectural patterns and capabilities; as such, an extended model based on the 3V paradigm (Extended 3V) is introduced. The model is examined across a hierarchical set of four layers (Hardware, Management, Platform, and Application). A list of components is provided for each layer, along with a classification of their role in a big data solution.

Toward Efficient Ranked-key Algorithm for the Web notification of Big Data Systems

Proceedings of the 2nd international Conference on Big Data, Cloud and Applications ◽

10.1145/3090354.3090386 ◽

2017 ◽

Author(s):

Mohamedou Cheikh Tourad ◽

Abdelmounaim Abdali

Keyword(s):

Big Data ◽

Data Systems ◽

Big Data Systems ◽

The Web

Runtime Performance Challenges in Big Data Systems

Proceedings of the 2015 Workshop on Challenges in Performance Methods for Software Development - WOSP '15 ◽

10.1145/2693561.2693563 ◽

2015 ◽

Cited By ~ 8

Author(s):

John Klein ◽

Ian Gorton

Keyword(s):

Big Data ◽

Data Systems ◽

Runtime Performance ◽

Big Data Systems ◽

Performance Challenges

A standard for benchmarking big data systems

2014 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata.2014.7004472 ◽

2014 ◽

Cited By ~ 7

Author(s):

Raghunath Nambiar

Keyword(s):

Big Data ◽

Data Systems ◽

Big Data Systems

A reference architecture for big data systems in the national security domain

Proceedings of the 2nd International Workshop on BIG Data Software Engineering - BIGDSE '16 ◽

10.1145/2896825.2896834 ◽

2016 ◽

Cited By ~ 14

Author(s):

John Klein ◽

Ross Buglak ◽

David Blockow ◽

Troy Wuttke ◽

Brenton Cooper

Keyword(s):

Big Data ◽

National Security ◽

Reference Architecture ◽

Data Systems ◽

Big Data Systems

Remote Procedure Call Optimization of Big Data Systems Based on Data Awareness

2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom) ◽

10.1109/ispa-bdcloud-socialcom-sustaincom51426.2020.00081 ◽

2020 ◽

Author(s):

Jin Wang ◽

Yaqiong Yang ◽

Jingyu Zhang ◽

Lei Wang

Keyword(s):

Big Data ◽

Remote Procedure Call ◽

Data Systems ◽

Procedure Call ◽

Big Data Systems

Modern Method and Software Tool for Guaranteed Data Deletion in Advanced Big Data Systems

Advances in Artificial Systems for Medicine and Education II - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-030-12082-5_53 ◽

2019 ◽

pp. 581-590

Author(s):

Sergiy Gnatyuk ◽

Vasyl Kinzeryavyy ◽

Tetyana Sapozhnik ◽

Iryna Sopilko ◽

Nurgul Seilova ◽

...

Keyword(s):

Big Data ◽

Software Tool ◽

Modern Method ◽

Data Systems ◽

Big Data Systems

A Systematic Mapping of Software Engineering Approaches to Develop Big Data Systems

2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA) ◽

10.1109/seaa.2018.00079 ◽

2018 ◽

Cited By ~ 2

Author(s):

Rodrigo Laigner ◽

Marcos Kalinowski ◽

Sergio Lifschitz ◽

Rodrigo Salvador Monteiro ◽

Daniel de Oliveira

Keyword(s):

Big Data ◽

Software Engineering ◽

Data Systems ◽

Systematic Mapping ◽

Big Data Systems

Using a distributed deep learning algorithm for analyzing big data in smart cities

Smart and Sustainable Built Environment ◽

10.1108/sasbe-04-2019-0040 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Mohammed Anouar Naoui ◽

Brahim Lejdel ◽

Mouloud Ayad ◽

Abdelfattah Amamra ◽

Okba Kazar

Keyword(s):

Big Data ◽

Deep Learning ◽

Data Analysis ◽

Data Storage ◽

Smart City ◽

Smart Cities ◽

Smart Environment ◽

Data Systems ◽

Content Type ◽

Big Data Systems

Purpose: The purpose of this paper is to propose a distributed deep learning architecture for smart cities in big data systems. Design/methodology/approach: We propose a multilayer architecture to describe distributed deep learning for smart cities in big data systems. The components of our system are the Smart city layer, the big data layer, and the deep learning layer. The Smart city layer is responsible for the Smart city components, its Internet of things, sensors and effectors, and their integration in the system; the big data layer concerns data characteristics and their distribution over the system. The deep learning layer is the model of our system and is responsible for data analysis. Findings: We apply our proposed architecture to a Smart environment and Smart energy. In a Smart environment, we study toluene forecasting in the Madrid Smart city. For Smart energy, we study wind energy forecasting in Australia. Our proposed architecture can reduce execution time and improve deep learning models such as Long Short-Term Memory. Research limitations/implications: This research needs the application of other deep learning models, such as convolutional neural networks and autoencoders. Practical implications: The findings of the research will be helpful in Smart city architecture. They can provide a clear view into a Smart city, data storage, and data analysis. The toluene forecasting in a Smart environment can help decision-makers ensure environmental safety. The Smart energy application of our proposed model can give a clear prediction of power generation. Originality/value: The findings of this study are expected to help decision-makers better understand the key aspects of Smart city architecture and its relation to data storage, processing, and data analysis.
