-
Notifications
You must be signed in to change notification settings - Fork 112
Make Vqueue item state transition dependent on InvocationStatus #4072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Make Vqueue item state transition dependent on InvocationStatus #4072
Conversation
The vqueue items table is intended to store the payload of a vqueue item (e.g. the invocation or the state mutation). The item is stored when it is being enqueued into the vqueue and removed once the vqueue item ends. The payload can be accessed by the tuple (qid, created_at, kind, entry_id). The creation timestamp is part of the key in order to establish the insertion order into the given vqueue.
This commit integrates vqueues with external state mutations. The way it works is by storing the external state mutation in the VQueueItems table until it gets run by the scheduler. When running the state mutation, we retrieve it from the VQueueItems table and apply the state mutation changes. The external state mutation EntryId is currently derived from the internal inbox_seq_number to generate unique EntryId's. In the future we might give external state mutations an explicit id which could be used to derive the EntryId. Currently, the DRR scheduler reserves a concurrency token for running state mutations. We might want to relax this condition if we see that applying state mutations does not need to be protected by the global concurrency limit.
Before, the DRR scheduler checked for eligibility if the current head item was removed. This was not enough since a vqueue might have been not eligible because it was running out of concurrency tokens. A removed item could have released the concurrency token and thereby it could make the vqueue eligible again if the vqueue contains more items. Hence, we should check on every removed item whether this event had an impact on the vqueue's eligibility.
Since we calculate the partition key for VOs and workflows only from the key, we need to include the VO/workflow name when creating the VQueueId (for generating an instance id which does not conflict with other VOs/ workflows). This is a bit unfortunate :-(
…ue meta When applying Action::Complete to the vqueue meta we need to consider what's the current stage to properly update the statistics and the metadata. Hence, we are now passing the stage instead of whether it was in the inbox or not because we need to handle `Stage::Run` and `Stage::Park` differently.
We cannot unconditionally try to move VQueue items to running via VQueues::attempt_to_run because this updates the internal VQueueMeta's num_waiting field. The num_waiting field needs to be consistent to say whether a vqueue still has waiting items or not. Moreover, if we remove items from waiting twice (as it could happen before when trying to run a vqueue item just after it got canceled), then this can lead to panics because of underflows. To solve this problem, we only attempt to run a vqueue item if its InvocationStatus indicates that it is still inboxed or scheduled.
2189738 to
1a8c5ad
Compare
When resuming an invocation using the vqueue scheduler we cannot go directly into the InvocationStatus::Invoked since we first need to get scheduled again. As part of the scheduling the system needs to go through a few state changes that depend on the InvocationStatus. Hence, this commit sets the InvocationStatus to Inboxed when moving an item back into the inbox stage.
1a8c5ad to
86d4d05
Compare
|
The change to go back to |
…tion when using vqueues While it is correct that we should go from InvocationStatus::Invoked to ::Inboxed when resuming an invocation using vqueues, it is currently supported by the state machine. For it to be fully supported, we need to support handling journal completions when being ::Inboxed. Morever, we need to handle cancellations of ::Inboxed items depending on whether the invocation was running before or not. Additionally, we need to support the hotfix_apply_cancellation_after_deployment_is_pinned when going from ::Suspended to ::Inbox, for example. Until this is done, we stay in ::Invoked when resuming an invocation even when using vqueues.
|
I've pushed a temporary fix where we transition to |
We cannot unconditionally try to move VQueue items to running via
VQueues::attempt_to_run because this updates the internal VQueueMeta's
num_waiting field. The num_waiting field needs to be consistent to say
whether a vqueue still has waiting items or not. Moreover, if we remove
items from waiting twice (as it could happen before when trying to run
a vqueue item just after it got canceled), then this can lead to panics
because of underflows.
To solve this problem, we only attempt to run a vqueue item if its
InvocationStatus indicates that it is still inboxed or scheduled.
This PR is based on #4071