Make Vqueue item state transition dependent on InvocationStatus #4072

tillrohrmann · 2025-11-28T09:29:11Z

We cannot unconditionally try to move VQueue items to running via
VQueues::attempt_to_run because this updates the internal VQueueMeta's
num_waiting field. The num_waiting field needs to be consistent to say
whether a vqueue still has waiting items or not. Moreover, if we remove
items from waiting twice (as it could happen before when trying to run
a vqueue item just after it got canceled), then this can lead to panics
because of underflows.

To solve this problem, we only attempt to run a vqueue item if its
InvocationStatus indicates that it is still inboxed or scheduled.

This PR is based on #4071

The vqueue items table is intended to store the payload of a vqueue item (e.g. the invocation or the state mutation). The item is stored when it is being enqueued into the vqueue and removed once the vqueue item ends. The payload can be accessed by the tuple (qid, created_at, kind, entry_id). The creation timestamp is part of the key in order to establish the insertion order into the given vqueue.

This commit integrates vqueues with external state mutations. The way it works is by storing the external state mutation in the VQueueItems table until it gets run by the scheduler. When running the state mutation, we retrieve it from the VQueueItems table and apply the state mutation changes. The external state mutation EntryId is currently derived from the internal inbox_seq_number to generate unique EntryId's. In the future we might give external state mutations an explicit id which could be used to derive the EntryId. Currently, the DRR scheduler reserves a concurrency token for running state mutations. We might want to relax this condition if we see that applying state mutations does not need to be protected by the global concurrency limit.

Before, the DRR scheduler checked for eligibility if the current head item was removed. This was not enough since a vqueue might have been not eligible because it was running out of concurrency tokens. A removed item could have released the concurrency token and thereby it could make the vqueue eligible again if the vqueue contains more items. Hence, we should check on every removed item whether this event had an impact on the vqueue's eligibility.

Since we calculate the partition key for VOs and workflows only from the key, we need to include the VO/workflow name when creating the VQueueId (for generating an instance id which does not conflict with other VOs/ workflows). This is a bit unfortunate :-(

…ue meta When applying Action::Complete to the vqueue meta we need to consider what's the current stage to properly update the statistics and the metadata. Hence, we are now passing the stage instead of whether it was in the inbox or not because we need to handle `Stage::Run` and `Stage::Park` differently.

We cannot unconditionally try to move VQueue items to running via VQueues::attempt_to_run because this updates the internal VQueueMeta's num_waiting field. The num_waiting field needs to be consistent to say whether a vqueue still has waiting items or not. Moreover, if we remove items from waiting twice (as it could happen before when trying to run a vqueue item just after it got canceled), then this can lead to panics because of underflows. To solve this problem, we only attempt to run a vqueue item if its InvocationStatus indicates that it is still inboxed or scheduled.

When resuming an invocation using the vqueue scheduler we cannot go directly into the InvocationStatus::Invoked since we first need to get scheduled again. As part of the scheduling the system needs to go through a few state changes that depend on the InvocationStatus. Hence, this commit sets the InvocationStatus to Inboxed when moving an item back into the inbox stage.

tillrohrmann · 2025-11-28T19:39:14Z

The change to go back to InvocationStatus::Inboxed, has a few problems because we now need to handle appending completions when being inboxed. Also cancellation now needs to treat inboxed items differently based on whether the invocation was started or not. It is as if we need a new status (inboxed but executed before). I need to push a follow-up commit to address these problems.

…tion when using vqueues While it is correct that we should go from InvocationStatus::Invoked to ::Inboxed when resuming an invocation using vqueues, it is currently supported by the state machine. For it to be fully supported, we need to support handling journal completions when being ::Inboxed. Morever, we need to handle cancellations of ::Inboxed items depending on whether the invocation was running before or not. Additionally, we need to support the hotfix_apply_cancellation_after_deployment_is_pinned when going from ::Suspended to ::Inbox, for example. Until this is done, we stay in ::Invoked when resuming an invocation even when using vqueues.

tillrohrmann · 2025-11-29T21:09:50Z

I've pushed a temporary fix where we transition to InvocationStatus::Invoked instead of ::Inboxed when resuming an invocation (using vqueues). This should be changed once we properly support going from ::Suspended/Invoked to ::Inboxed and still support handling of journal completions and cancellations. With this fix, all of our sdk-test-suite e2e tests are passing.

tillrohrmann added 6 commits November 26, 2025 23:09

tillrohrmann requested a review from AhmedSoliman November 28, 2025 09:29

tillrohrmann force-pushed the vqueues/fix-meta-consistency branch from 2189738 to 1a8c5ad Compare November 28, 2025 11:32

tillrohrmann force-pushed the vqueues/fix-meta-consistency branch from 1a8c5ad to 86d4d05 Compare November 28, 2025 11:47

tillrohrmann mentioned this pull request Nov 29, 2025

Enable e2e and sdk tests for vqueues #4052

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Make Vqueue item state transition dependent on InvocationStatus #4072

Make Vqueue item state transition dependent on InvocationStatus #4072

Uh oh!

tillrohrmann commented Nov 28, 2025

Uh oh!

tillrohrmann commented Nov 28, 2025

Uh oh!

tillrohrmann commented Nov 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Make Vqueue item state transition dependent on InvocationStatus #4072

Are you sure you want to change the base?

Make Vqueue item state transition dependent on InvocationStatus #4072

Uh oh!

Conversation

tillrohrmann commented Nov 28, 2025

Uh oh!

tillrohrmann commented Nov 28, 2025

Uh oh!

tillrohrmann commented Nov 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant