Hi,
I am looking into reducing the size of the Docker image for the Triton server. I have built an image containing only the backend I use (TensorFlow) with the compose script, as described here. This already helps quite a bit.
The base image used by the generated Dockerfile is nvcr.io/nvidia/tritonserver:25.02-py3-min. I noticed that this image is already 10 GB, most of which comes from CUDA. Among other things, it installs the TensorRT runtime, which my setup does not require, so I was wondering whether there is a way to rebuild the image without it.
Is the Dockerfile for the nvcr.io/nvidia/tritonserver:25.02-py3-min image available somewhere so that I can patch it for this purpose myself, or is there some other recommended way to accomplish the same?
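For anyone who wants to reproduce the size breakdown, here is a minimal sketch of one way to see which layers of the base image account for most of the 10 GB. It assumes the `docker` CLI is installed and the image has already been pulled. The helper names (`parse_size`, `largest_layers`, `docker_history`) are made up for illustration and are not part of any Triton tooling.

```python
import subprocess

# Decimal units as printed by `docker history` (e.g. "0B", "12.3kB", "1.05GB").
_UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> float:
    """Convert a human-readable Docker size string to bytes."""
    text = text.strip()
    # Try two-letter suffixes first so "GB" is not mistaken for plain "B".
    for suffix in sorted(_UNITS, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * _UNITS[suffix]
    raise ValueError(f"unrecognized size: {text!r}")


def largest_layers(history_output: str, top: int = 5):
    """Return the `top` largest layers as (bytes, creating command) pairs.

    Expects one layer per line in the form "<size>\\t<created by>".
    """
    layers = []
    for line in history_output.splitlines():
        if not line.strip():
            continue
        size, _, created_by = line.partition("\t")
        layers.append((parse_size(size), created_by.strip()))
    return sorted(layers, key=lambda layer: layer[0], reverse=True)[:top]


def docker_history(image: str) -> str:
    """Run `docker history` for a locally available image (assumption: it is pulled)."""
    result = subprocess.run(
        ["docker", "history", "--no-trunc",
         "--format", "{{.Size}}\t{{.CreatedBy}}", image],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    image = "nvcr.io/nvidia/tritonserver:25.02-py3-min"
    for size, command in largest_layers(docker_history(image)):
        print(f"{size / 1e9:6.2f} GB  {command[:100]}")
```

The output only shows which build steps produced the large layers. It does not recover the original Dockerfile, so it is a diagnostic aid rather than a way to rebuild the image.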