Description
When we spawn a task, we currently check the DB to retrieve previous submissions. This is required for:
- Flow merge detection.
- Task re-run prevention (within the same flow).
- Preventing the accidental reincarnation of removed tasks.
Whilst we do check for previous submissions, we do not presently check task prerequisites. In some situations, task prerequisites which are satisfied in the DB are left unsatisfied in the task pool when tasks are spawned. This can happen when tasks are added to the pool by means other than natural prerequisite satisfaction (e.g. #5952).
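To make the missing check concrete, here is a minimal sketch of what reconciling a freshly spawned task against the DB might look like. It is not Cylc's implementation: `SpawnedTask`, `Output` and `apply_db_outputs` are hypothetical names invented for illustration, and the "DB" is stood in for by a plain set of completed outputs.

```python
from dataclasses import dataclass, field

# Hypothetical types for illustration only; these are not Cylc's real classes.
Output = tuple[str, str, str]  # (cycle point, task name, output label)


@dataclass
class SpawnedTask:
    cycle: str
    name: str
    # Maps each prerequisite to whether the pool currently sees it as satisfied.
    prerequisites: dict[Output, bool] = field(default_factory=dict)


def apply_db_outputs(task: SpawnedTask, completed_in_db: set[Output]) -> list[Output]:
    """Mark prerequisites satisfied where the DB records the matching output.

    Returns the prerequisites that were newly satisfied by this check.
    """
    newly_satisfied = []
    for prereq, satisfied in task.prerequisites.items():
        if not satisfied and prereq in completed_in_db:
            # Only values are updated, so mutating during iteration is safe.
            task.prerequisites[prereq] = True
            newly_satisfied.append(prereq)
    return newly_satisfied


# Example: "a" already succeeded according to the DB, "b" has not run.
completed = {("1", "a", "succeeded")}
task = SpawnedTask(
    "1",
    "c",
    {("1", "a", "succeeded"): False, ("1", "b", "succeeded"): False},
)
fixed = apply_db_outputs(task, completed)
```

Without a step like this, the task `c` above would sit in the pool showing both prerequisites as unsatisfied, even though `a` has already run.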
Easy way to replicate this:
- Empty the task pool.
- Put the tasks back.
- Run the workflow on.
Sadly, I don't think it is sufficient to check the DB only when such interventions are performed, because tasks downstream of the ones being added, which are yet to be spawned, may also be missing this state and end up with (erroneously) partially satisfied prerequisites.
This is a bad bug as it makes it look to the user like tasks which have run, haven't. It's hard to explain, especially as multiple Cylc interfaces will provide erroneous information. Moreover, it's very hard to recover from, as the consequences may last as long as the longest inter-cycle dependency in the workflow.
There are efficiency concerns over requesting task prerequisites from the DB. We haven't yet ascertained that this would be a problem, but DB processing is definitely a bottleneck. One way to reduce these overheads might be to request only satisfied prerequisites from the DB. Merging this request with the existing DB request would also improve performance, as would batching the requests where multiple tasks are spawned in the same main loop iteration.
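As a rough illustration of the "satisfied only" and batching ideas, the sketch below fetches satisfied prerequisites for several tasks in a single query using Python's `sqlite3` module. The table name `task_prerequisites`, its columns, and the function `fetch_satisfied_prereqs` are assumptions made for this example and may not match the real workflow DB schema.

```python
import sqlite3


def fetch_satisfied_prereqs(conn, tasks):
    """Fetch satisfied prerequisites for many tasks with one query.

    `tasks` is a list of (cycle, name) tuples.
    Returns {(cycle, name): {(prereq_cycle, prereq_name, prereq_output), ...}}.
    The table and column names are hypothetical.
    """
    result = {task: set() for task in tasks}
    if not tasks:
        return result
    # One "(cycle = ? AND name = ?)" clause per task, OR-ed together, so all
    # tasks spawned in the same main loop iteration share a single request.
    where = " OR ".join("(cycle = ? AND name = ?)" for _ in tasks)
    params = [value for task in tasks for value in task]
    query = (
        "SELECT cycle, name, prereq_cycle, prereq_name, prereq_output "
        "FROM task_prerequisites "
        f"WHERE satisfied = 1 AND ({where})"
    )
    for cycle, name, p_cycle, p_name, p_output in conn.execute(query, params):
        result[(cycle, name)].add((p_cycle, p_name, p_output))
    return result


# Example usage against an in-memory DB with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_prerequisites ("
    "cycle TEXT, name TEXT, prereq_cycle TEXT, prereq_name TEXT, "
    "prereq_output TEXT, satisfied INTEGER)"
)
conn.executemany(
    "INSERT INTO task_prerequisites VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2", "c", "2", "a", "succeeded", 1),
        ("2", "c", "1", "c", "succeeded", 0),
        ("2", "d", "2", "c", "succeeded", 1),
    ],
)
satisfied = fetch_satisfied_prereqs(conn, [("2", "c"), ("2", "d"), ("2", "e")])
```

Filtering on `satisfied = 1` keeps the result small, since unsatisfied prerequisites are the default state of a newly spawned task and need no update. A very large batch would need splitting to stay within SQLite's bound-parameter limit.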
This issue is relevant to workflow extension use cases (#5952) and to graph changes made by either reload or restart.