Please use this identifier to cite or link to this item: http://hdl.handle.net/2122/16036
Authors: Spinuso, Alessandro* 
Atkinson, Malcolm* 
Magnoni, Federica* 
Title: S-ProvFlow. Storing and Exploring Lineage Data as a Service
Journal: Data Intelligence 
Series/Report no.: 2/4 (2022)
Publisher: MIT Press
Issue Date: 2022
DOI: 10.1162/dint_a_00128
Abstract: We present a set of configurable Web services and interactive tools, S-ProvFlow, for managing and exploiting records that track data lineage during workflow runs. It facilitates detailed analysis of single executions. It helps users manage complex tasks by exposing the relationships between the data, people, equipment and workflow runs that are intended to combine productively. Its logical model extends the PROV standard to precisely record parallel data-streaming applications. Its metadata handling encourages users to capture the application context by specifying how application attributes, often using standard vocabularies, should be added. These metadata records immediately help productivity as the interactive tools support their use in selection and bulk operations. Users rapidly appreciate the power of the encoded semantics as they reap the benefits. This improves the quality of provenance for users and management, which in turn facilitates analysis of collections of runs, enabling users to manage results and validate procedures. It fosters reuse of data and methods and facilitates diagnostic investigations and optimisations. We present S-ProvFlow’s use by scientists, research engineers and managers as part of the DARE hyper-platform as they create, validate and use their data-driven scientific workflows.
Appears in Collections: Article published / in press

Files in This Item:
File: 6-Alessandro Spinuso(1).pdf
Description: Open Access published article
Size: 1.42 MB
Format: Adobe PDF

