Skip to content

Conversation

@jssblck
Copy link
Contributor

@jssblck jssblck commented Apr 20, 2023

Overview

When running and setting up demos, I noted that intermittent issues (e.g. temporary FOSSA 503 errors) would cause jobs to fail, which would then fail out the entire Broker runtime.

We ideally need a job-level retry mechanism, but that's more complicated; ticket here. Instead, here, we just manually use a retry function to retry the portions of broker run that aren't simply queue based operations.

Note: this PR is not complete; I kind of am working on it ad-hoc if I get extra time.

Acceptance criteria

Broker can recover from temporary issues.

Testing plan

Unfortunately we don't have great tests here.

I set up echotraffic to proxy FOSSA. I then configured Broker to use it.
I then started running Broker, and killed the proxy.
I noted that Broker successfully attempts to retry uploads instead of just dying.
I then restarted the proxy, and noted that uploads work again.

Risks

This imposes more complexity in broker run.

References

No ticket; observed during demos and was quick to fix.

Checklist

  • I added tests for this PR's change (or explained in the PR description why tests don't make sense).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant