[Feature] Drain parameters for RayService #4126

@bhumik145

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

TL;DR: During a zero-downtime RayService upgrade, allow a configurable drain timeout so that the old RayCluster keeps serving ongoing requests up to that timeout.

Scenario:

  • A RayService is running and serving traffic with image v1.
  • An update request is sent to upgrade to image v2.
  • This creates a separate RayCluster, rc-2, and the RayService is pointed at rc-2 once rc-2 becomes ready.
  • Until then, the RayService continues to serve requests through the current RayCluster, rc-1.
  • However, rc-1 has a fixed shutdown period, so any requests still running on rc-1 when that period ends fail.
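
The failure mode in this scenario can be illustrated with a small simulation. This is only a sketch, not KubeRay code: `drain_timeout_s` is a hypothetical setting of the kind this issue asks for, and the values in the example run are illustrative. The function models which in-flight requests on rc-1 would be cut off after traffic switches to rc-2.

```python
def split_by_drain_timeout(remaining_s, drain_timeout_s):
    """Partition the requests still in flight on the old cluster (rc-1).

    remaining_s: seconds each in-flight request still needs at the moment
        traffic is switched to the new cluster (rc-2).
    drain_timeout_s: hypothetical drain window before rc-1 is shut down.
    Returns a (completed, failed) pair of lists of remaining times.
    """
    completed = [r for r in remaining_s if r <= drain_timeout_s]
    failed = [r for r in remaining_s if r > drain_timeout_s]
    return completed, failed


if __name__ == "__main__":
    in_flight = [20, 45, 90, 150]  # illustrative remaining times, in seconds
    for timeout in (60, 180):      # illustrative drain windows, in seconds
        done, cut_off = split_by_drain_timeout(in_flight, timeout)
        print(f"drain_timeout_s={timeout}: completed={done}, failed={cut_off}")
```

With a fixed window, every request that needs longer than the window fails. A configurable window lets the operator size it to the longest expected request.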

Repro:

  • Deploy a Ray Serve application that exposes a FastAPI endpoint which accepts a number of seconds and runs a long-running task for that many seconds.
  • Prepare a script that sends multiple requests, each specifying a different duration in seconds: [30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 180, 200]
  • Run the script to send the long-running requests to the running RayService.
  • Update the RayService to use a different version of the test Docker image. This forces a new RayCluster to be created.
  • Observe the ongoing requests. They should not fail. [Currently they fail, and there is no way to configure a timeout to keep rc-1 around.]
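
A minimal sketch of the client script from the repro steps, using only the Python standard library. The URL, the `/sleep` path, and the `seconds` query parameter are assumptions made for illustration; the issue does not give the actual endpoint, so adjust them to match the deployed application.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: the Serve app is reachable at this address and reads ?seconds=<n>.
URL = "http://localhost:8000/sleep"
DURATIONS = [30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 180, 200]


def call_endpoint(seconds, url=URL):
    """Send one long-running request and return (seconds, succeeded)."""
    try:
        # Allow some slack beyond the requested duration before giving up.
        with urllib.request.urlopen(f"{url}?seconds={seconds}", timeout=seconds + 30) as resp:
            return seconds, resp.status == 200
    except OSError:
        # Covers connection resets, refused connections, timeouts, and HTTP
        # error statuses (urllib's URLError and HTTPError subclass OSError).
        return seconds, False


def run_all(durations, send=call_endpoint):
    """Issue all requests concurrently; return {seconds: succeeded}."""
    with ThreadPoolExecutor(max_workers=max(1, len(durations))) as pool:
        return dict(pool.map(send, durations))
```

Calling `run_all(DURATIONS)` just before updating the RayService image, then listing the keys whose value is `False`, shows which durations were cut off when rc-1 was shut down.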

Use case

This applies to scenarios where Ray Serve hosts applications with long-running requests [>30s or >60s].

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
