Enabling Dynamic and Intelligent Workflows for HPC, Data Analytics, and AI Convergence
Author(s)
Language
English
Specific Objective
8T. Real-time seismology and seismic and tsunami early warning
4V. Pre-eruptive processes
3IT. Scientific computing
Status
Published
JCR Journal
JCR Journal
Peer review journal
Yes
Title of the book
Issue/vol(year)
/134 (2022)
ISSN
0167-739X
Publisher
Elsevier
Pages (printed)
414-429
Issued date
April 20, 2022
Last version
https://arxiv.org/pdf/2204.09287.pdf
Abstract
The evolution of High-Performance Computing (HPC) platforms enables the
design and execution of progressively larger and more complex workflow
applications in these systems. The complexity comes not only from the number of
elements that compose the workflows but also from the type of computations they
perform. While traditional HPC workflows target simulations and modelling of
physical phenomena, current needs require in addition data analytics (DA) and
artificial intelligence (AI) tasks. However, the development of these workflows
is hampered by the lack of proper programming models and environments that
support the integration of HPC, DA, and AI, as well as the lack of tools to
easily deploy and execute the workflows in HPC systems. To progress in this
direction, this paper presents use cases where complex workflows are required
and investigates the main issues to be addressed for the HPC/DA/AI convergence.
Based on this study, the paper identifies the challenges of a new workflow
platform to manage complex workflows. Finally, it proposes a development
approach for such a workflow platform addressing these challenges in two
directions: first, by defining a software stack that provides the
functionalities to manage these complex workflows; and second, by proposing the
HPC Workflow as a Service (HPCWaaS) paradigm, which leverages the software
stack to facilitate the reusability of complex workflows in federated HPC
infrastructures. Proposals presented in this work are subject to study and
development as part of the EuroHPC eFlows4HPC project.
Type
article