Retry task queue operations #69
Draft
Overview
When running and setting up demos, I noted that intermittent issues (e.g. temporary FOSSA 503 errors) would cause jobs to fail, which would then fail out the entire Broker runtime.
We ideally need a job-level retry mechanism, but that's more complicated; ticket here. Instead, here, we just manually use a retry function to retry the portions of broker run that aren't simply queue-based operations.
Note: this PR is not complete; I'm working on it ad hoc when I get extra time.
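For readers unfamiliar with the pattern, here is a minimal sketch of what a manual retry helper with exponential backoff could look like. It is not the code in this PR: the name retry, its parameters, and the synchronous std-only style are assumptions made for illustration, and Broker's real implementation may differ (for example, by being async).

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper (not Broker's actual API): runs `op` until it succeeds
/// or `max_attempts` calls have been made, doubling the delay after each failure.
/// `op` is always called at least once, even if `max_attempts` is 0.
pub fn retry<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Transient failure (e.g. a temporary 503): wait, then try again.
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

A caller would wrap only the fallible network step, such as an upload, in the closure, so that a temporary outage delays that one job instead of ending the whole run.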
Acceptance criteria
Broker can recover from temporary issues.
Testing plan
Unfortunately we don't have great tests here.
I set up echotraffic to proxy FOSSA. I then configured Broker to use it.
I then started running Broker, and killed the proxy.
I noted that Broker successfully attempts to retry uploads instead of just dying.
I then restarted the proxy, and noted that uploads work again.
Risks
This imposes more complexity in broker run.
References
No ticket; observed during demos and was quick to fix.
Checklist