Search before asking
- I had searched in the issues and found no similar feature requirement.
Description
TL;DR: during a zero-downtime RayService upgrade, allow a drain configuration so the old RayCluster continues serving in-flight requests up to a custom timeout.
Scenario:
- A RayService is running and serving traffic with image v1.
- An update request is sent to upgrade to image v2.
- This creates a separate RayCluster, rc-2, and points the RayService to rc-2 once rc-2 becomes ready.
- During this time the RayService continues to serve requests through the current RayCluster, rc-1.
- However, rc-1 is shut down after a fixed period, which fails any requests still in flight on rc-1.
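A hypothetical sketch of what the proposed configuration could look like on the RayService spec. The field name `gracefulDrainTimeoutS` is invented for illustration; KubeRay does not currently expose such a setting:

```yaml
# Hypothetical sketch only: "gracefulDrainTimeoutS" is an invented field name.
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: long-running-service
spec:
  # Proposed: keep the old RayCluster (rc-1) alive until its in-flight
  # requests finish, or until this timeout elapses, whichever comes first.
  gracefulDrainTimeoutS: 300
  rayClusterConfig:
    # ... existing cluster spec unchanged ...
```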
Repro:
- A Ray Serve app that exposes a FastAPI endpoint accepting a number of seconds and running a long-running task for that many seconds.
- A script that sends multiple requests, specifying a different duration for each: [30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 180, 200]
- Send the long-running requests to the running RayService.
- Update the RayService to use a different version of the test Docker image.
- This forces creation of a new RayCluster.
- Observe the ongoing requests: they should not fail. [They fail today, and there is no way to configure a timeout to keep rc-1 around.]
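The repro's request pattern can be sketched with the standard library alone. This is not the actual Ray Serve app: a local HTTP server stands in for the Serve endpoint and sleeps for the requested number of seconds, and the durations are scaled down so the script finishes quickly. The point is the shape of the client side: many concurrent long-running requests whose failures become visible as dropped connections.

```python
# Stdlib-only simulation of the repro pattern. A threaded HTTP server plays
# the role of the Ray Serve endpoint: GET /sleep/<seconds> blocks for that
# long before answering, like the issue's long-running task.
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class SleepHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Path is /sleep/<seconds>; simulate the long-running task.
        seconds = float(self.path.rsplit("/", 1)[-1])
        time.sleep(seconds)
        body = f"slept {seconds}".encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the output quiet

def send_requests(port, durations):
    """Send one request per duration in parallel; return per-request status."""
    def one(d):
        try:
            url = f"http://127.0.0.1:{port}/sleep/{d}"
            with urllib.request.urlopen(url, timeout=d + 5) as resp:
                return resp.status
        except OSError:
            return None  # a request dropped mid-flight shows up as None
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        return list(pool.map(one, durations))

if __name__ == "__main__":
    server = ThreadingHTTPServer(("127.0.0.1", 0), SleepHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    # Scaled-down stand-ins for the issue's durations (seconds).
    statuses = send_requests(server.server_address[1], [0.1, 0.2, 0.3])
    server.shutdown()
    print(statuses)  # all 200s means no request was dropped
```

Against a real RayService, pointing the same client at the Serve endpoint while the image-update rollout is in progress is what surfaces the failures on rc-1.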
Use case
This is applicable to scenarios where Ray Serve hosts applications with long-running requests [>30s or >60s].
Related issues
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!