ServingRuntime autoscaling monitoring GPU utilization #372

@andreapairon

Hi all,

I noticed in the scaling doc page (https://github.com/kserve/modelmesh-serving/blob/main/docs/production-use/scaling.md) that it is now possible to configure ServingRuntime autoscaling with HPA, but only with metrics based on CPU utilization.
Is it possible to scale the ServingRuntime using GPU metrics instead, such as GPU utilization?

Thanks in advance

Labels: question