An Approach to Standalone Provenance Systems for Big Social Provenance Data

Author(s):  
Yucel Tas ◽  
Mohamed Jehad Baeth ◽  
Mehmet S. Aktas
2015 ◽  
Vol 26 (2) ◽  
pp. 32-47 ◽  
Author(s):  
Salmin Sultana ◽  
Elisa Bertino

Existing provenance systems operate at a single layer of abstraction (workflow, process, or OS) at which they record and store provenance. However, provenance captured from different layers provides the highest benefit when integrated through a unified provenance framework. The first step toward such a framework is a comprehensive provenance model able to represent the provenance of data objects with various semantics and granularity. In this paper, the authors propose a provenance model able to represent the provenance of any data object captured at any abstraction layer and present an abstract schema of the model. The expressive nature of the model enables a wide range of provenance queries. The authors also illustrate the utility of their model in real-world data processing systems. They also introduce a distributed data provenance middleware system composed of several components and services that capture provenance according to their model and securely store it in a central repository. As part of this middleware, the authors present a thin stackable file system, called FiPS, for capturing local provenance in a portable manner. FiPS is able to capture provenance at various degrees of granularity, transform provenance records into secure information, and direct the resulting provenance data to various persistent storage systems.


2012 ◽  
Vol 23 (1) ◽  
pp. 70-92 ◽  
Author(s):  
Clarus J. Backes ◽  
David Cheetham ◽  
Hector Neff

Recent research and debates regarding the origin and spread of Olmec iconography during the Early Formative have centered on provenance and stylistic analyses of carved and incised pottery. Studies by instrumental neutron activation analysis (INAA) have indicated that Gulf Coast-style carved-incised pots were exported extensively from the area of the first Olmec capital, San Lorenzo, to several other regions of Mesoamerica. More recently, excavations at the Pacific Coast site of Cantón Corralito have shown that carved-incised pottery and other Olmec-style artifacts dominate strata contemporary with Early Olmec, suggesting the site may represent a settlement enclave of Gulf Olmec peoples. In this study we provide additional evidence of exchange between the Gulf Olmec and the Pacific Coast region by using laser ablation time-of-flight inductively coupled plasma mass spectrometry (LA-TOF-ICP-MS) to characterize hematite-based paints on Olmec-style pottery from Cantón Corralito, and to compare these paints to raw hematite recovered from Cantón Corralito and San Lorenzo. When examined in combination with sherd provenance data, the LA-TOF-ICP-MS data demonstrate that Olmec vessels were decorated in the San Lorenzo region before being exported to the Pacific Coast, and that Gulf Coast hematite was exported to Cantón Corralito, where it was used to enhance Olmec-style symbolism on locally produced vessels.


2020 ◽  
Vol 14 (3) ◽  
pp. 391-403
Author(s):  
Dimitris Palyvos-Giannas ◽  
Bastian Havers ◽  
Marina Papatriantafilou ◽  
Vincenzo Gulisano

Data streaming enables online monitoring of large and continuous event streams in Cyber-Physical Systems (CPSs). In such scenarios, fine-grained backward provenance tools can connect streaming query results to the source data producing them, allowing analysts to study the dependency/causality of CPS events. While CPS monitoring commonly produces many events, backward provenance does not help prioritize event inspection since it does not specify if an event's provenance could still contribute to future results. To cover this gap, we introduce Ananke, a framework to extend any fine-grained backward provenance tool and deliver a live bipartite graph of fine-grained forward provenance. With Ananke, analysts can prioritize the analysis of provenance data based on whether such data is still potentially being processed by the monitoring queries. We prove our solution is correct, discuss multiple implementations, including one leveraging streaming APIs for parallel analysis, and show that Ananke results in small overheads, close to those of existing tools for fine-grained backward provenance.
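
The relationship between backward and forward provenance described in this abstract can be illustrated with a minimal sketch. It is not Ananke's algorithm: the function names, the sample events, and the rule that a source stops contributing once a watermark passes its timestamp plus a fixed window are all assumptions made for illustration.

```python
from collections import defaultdict


def forward_provenance(backward):
    """Invert a backward mapping (result -> sources) into a forward
    mapping (source -> results). The two key sets form the two sides
    of a bipartite graph."""
    forward = defaultdict(set)
    for result, sources in backward.items():
        for source in sources:
            forward[source].add(result)
    return dict(forward)


def still_live(source_times, watermark, window):
    """Assumed expiry rule (hypothetical, not taken from the paper):
    a source can still contribute to future results while the
    watermark has not passed its timestamp plus the window size."""
    return {s: watermark < t + window for s, t in source_times.items()}


# Hypothetical backward provenance: each alert lists its source events.
backward = {"alert1": {"e1", "e2"}, "alert2": {"e2", "e3"}}
forward = forward_provenance(backward)
live = still_live({"e1": 0, "e2": 5, "e3": 9}, watermark=12, window=10)
```

Under these assumptions, an analyst could inspect sources whose `live` flag is false first, since their set of derived results can no longer grow.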


2018 ◽  
Vol 2 (CSCW) ◽  
pp. 1-25 ◽  
Author(s):  
Ella Tallyn ◽  
Larissa Pschetz ◽  
Rory Gianni ◽  
Chris Speed ◽  
Chris Elsden

Author(s):  
Navya Gouru ◽  
NagaLakshmi Vadlamani

The redesign of cloud storage through the amalgamation of cooperative cloud and blockchain, an immutable and unhackable distributed database, strives towards a strong CIA triad and secured data provenance. The conspiracy ideology associated with the traditional cloud has economized with cooperative cloud storage such as Storj and Sia, decentralized storage systems that allow users to rent out unused hard drive space and receive monetary compensation in cryptocurrency. In this article, the authors explain how confidentiality, integrity, and availability can be improved with cooperative cloud storage, along with tamper-proof data provenance management with Ethereum smart contracts using zero-knowledge proof (ZKP). A contemporary architecture is proposed for storing data on the cooperative cloud, collecting and verifying the provenance data from the cloud, and publishing the provenance data to the blockchain network as transactions.


2008 ◽  
Vol 16 (2-3) ◽  
pp. 205-216
Author(s):  
Bartosz Balis ◽  
Marian Bubak ◽  
Bartłomiej Łabno

Scientific workflows are a means of conducting in silico experiments in modern computing infrastructures for e-Science, often built on top of Grids. Monitoring of Grid scientific workflows is essential not only for performance analysis but also to collect provenance data and gather feedback useful in future decisions, e.g., related to optimization of resource usage. In this paper, basic problems related to monitoring of Grid scientific workflows are discussed. Being highly distributed, loosely coupled in space and time, heterogeneous, and heavily reliant on legacy codes, workflows are exceptionally challenging from the monitoring point of view. We propose a Grid monitoring architecture for scientific workflows. The monitoring data correlation problem is described, and an algorithm for on-line distributed collection of monitoring data is proposed. We demonstrate a prototype implementation of the proposed workflow monitoring architecture, the GEMINI monitoring system, and its use for monitoring of a real-life scientific workflow.


2020 ◽  
Author(s):  
Kiran Gadhave ◽  
Jochen Görtler ◽  
Oliver Deussen ◽  
Miriah Meyer ◽  
Jeff Phillips ◽  
...  

Being able to capture or predict a user's intent behind a brush in a visualization tool has important implications in two scenarios. First, predicting intents can be used to auto-complete a partial selection in a mixed-initiative approach, with potential benefits to selection speed, correctness, and confidence. Second, capturing the intent of a selection can be used to improve recall, reproducibility, and even re-use. Augmenting provenance logs with semi-automatically captured intents makes it possible to save the reasoning behind selections. In this paper, we introduce a method to infer intent for selections and brushes in scatterplots. We first introduce a taxonomy of types of patterns that users might specify, which we elicited in a formative study conducted with professional data analysts and scientists. Based on this, we identify algorithms that can classify these patterns, and introduce various approaches to score the match of each pattern to an analyst's selection of items. We introduce a system that implements these methods for scatterplots and ranks alternative patterns against each other. Analysts then can use these predictions to auto-complete partial selections, and to conveniently capture their intent and provide annotations, thus making a concise representation of that intent available to be stored as provenance data. We evaluate our approach using interviews with domain experts and in a quantitative crowd-sourced study, in which we show that using auto-complete leads to improved selection accuracy for most types of patterns.
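
The idea of scoring how well each candidate pattern matches an analyst's selection, then ranking the patterns, can be sketched as follows. The Jaccard index is used here as an assumed scoring function, and the pattern names and item ids are hypothetical; the abstract does not specify the scoring actually used.

```python
def jaccard(selection, pattern_items):
    """Jaccard index between the selected items and the items a
    pattern covers: |A & B| / |A | B|. Two empty sets score 1.0."""
    a, b = set(selection), set(pattern_items)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_patterns(selection, patterns):
    """Score every candidate pattern against the selection and
    return (name, score) pairs, best match first."""
    scored = [(name, jaccard(selection, items))
              for name, items in patterns.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical scatterplot item ids and candidate patterns.
selection = {1, 2, 3}
patterns = {
    "cluster": {1, 2, 3, 4},
    "outliers": {9, 10},
    "range": {2, 3},
}
ranking = rank_patterns(selection, patterns)
```

In this sketch the top-ranked pattern ("cluster") is the one that would be offered to auto-complete the partial selection, adding item 4.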

