Temporal Workflow Engine PoC
Proposers
- @arndtd Daniel Arndt (Canonical)
- @aticig Gülsüm Atici (Canonical)
- @beierlm Mark Beierl (Canonical)
- @calvinosanc1 Guillermo Calviño (Canonical)
- @faccind Dario Faccin (Canonical)
- @reinosop Patricia Reinoso (Canonical)
Description
OSM has task management that is used for performing operations such as deploying a Network Service, creating a VIM account, or deploying VM VDUs. Each of these tasks is managed differently: for example, LCM and RO have separate implementations, so the same bug might need to be fixed twice. The existing task management needs improvement in the following areas:
- Recovery from failure (pod crash)
- Timeouts
- Scale (only 1 unit is active right now)
- Coordination of tasks (LCM deploy calls RO to deploy VMs, but the two tasks have no coordination other than in the LCM's working memory)
- Cancellation of tasks
Rather than taking on the full responsibility for designing and developing a robust framework, this feature proposes the use of Temporal as a task management framework.
Temporal is a scalable and reliable runtime for Reentrant Processes called Temporal Workflow Executions. The Temporal Platform consists of a Temporal Cluster and Worker Processes. Together these components create a runtime for Workflow Executions. A Temporal Application is a set of Temporal Workflow Executions. Each Temporal Workflow Execution has exclusive access to its local state, executes concurrently with all other Workflow Executions, and communicates with other Workflow Executions and the environment via message passing.
To use Temporal, the upstream Temporal OCI image is deployed to the OSM namespace. It uses the existing MySQL container as its datastore, and the various OSM modules can register as clients or workers.
https://docs.temporal.io/temporal
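To make the idea of a reentrant process more concrete, the following is a minimal toy sketch in plain Python. It does not use the Temporal SDK and is not how Temporal is implemented; it only illustrates the general principle of recording each completed step in a persisted history so that a restarted execution resumes where it stopped. All names here (DurableRun, deploy_ns, create_vim_account, deploy_vdu, the JSON history file) are hypothetical and chosen only to echo the OSM operations mentioned above.

```python
import json
import os
import tempfile


class DurableRun:
    """Toy stand-in for a workflow execution backed by a persisted history."""

    def __init__(self, history_path):
        self.history_path = history_path
        self.history = {}
        # On restart, load the results of steps that already completed.
        if os.path.exists(history_path):
            with open(history_path) as f:
                self.history = json.load(f)

    def step(self, name, func, *args):
        # A completed step is answered from history instead of being re-run.
        if name in self.history:
            return self.history[name]
        result = func(*args)
        self.history[name] = result
        # Persist after every step so a crash loses at most the step in flight.
        with open(self.history_path, "w") as f:
            json.dump(self.history, f)
        return result


calls = []  # records which steps really executed (demonstration only)


def create_vim_account(name):
    # Hypothetical operation; a real one would talk to the VIM.
    calls.append("create_vim_account")
    return {"vim_id": f"vim-{name}"}


def deploy_vdu(vim_id, crash):
    # Hypothetical operation that can simulate a pod crash mid-task.
    calls.append("deploy_vdu")
    if crash:
        raise RuntimeError("simulated pod crash")
    return {"vdu_id": f"vdu-on-{vim_id}"}


def deploy_ns(history_path, crash=False):
    """Hypothetical 'deploy Network Service' task made of two steps."""
    run = DurableRun(history_path)
    vim = run.step("create_vim_account", create_vim_account, "demo")
    vdu = run.step("deploy_vdu", deploy_vdu, vim["vim_id"], crash)
    return vdu["vdu_id"]


def demo():
    calls.clear()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "history.json")
        try:
            deploy_ns(path, crash=True)   # first attempt dies in step two
        except RuntimeError:
            pass
        result = deploy_ns(path)          # restart resumes from the history
        return result, list(calls)


if __name__ == "__main__":
    print(demo())
```

In this sketch the restarted run does not execute create_vim_account a second time; only the step that failed is retried. A workflow engine provides this kind of recovery for every task, which is why the proposal prefers adopting one over maintaining separate mechanisms in LCM and RO.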
Demo or definition of done
Proof of Concept demonstrating some OSM operations as Workflows.
Design Documentation
Instead of Etherpad, we are going to use the GitLab wiki to track the designs.