Releases: dstackai/dstack
0.19.40
0.19.39
This release includes several important bug fixes, performance improvements, and updated documentation.
Documentation
Contributing
External contributors can now build the dstack documentation locally!
Previously, this wasn’t possible because dstack relied on the premium MkDocs Material Insiders theme. Since the maintainers of MkDocs Material recently made the full edition free, we've updated the repository accordingly.
To help contributors get started, we've added a new guide: contributing/DOCS.md.
We welcome and encourage community contributions - whether that's reporting issues or submitting pull requests.
/llms.txt
As more users rely on LLMs to interact with tools, we've added support for /llms.txt to help guide models that work with dstack.
You can now use:
- /llms.txt, a concise LLM-friendly index of the dstack documentation
- /llms-full.txt, the full documentation in a single file
These files help LLMs better understand how dstack works and enable them to generate more accurate commands, configurations, and answers.
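If you want to feed these files to a model yourself, you can fetch them directly. A minimal sketch, assuming the files are served from the dstack.ai documentation site:
# Fetch the concise LLM-friendly index
curl -s https://dstack.ai/llms.txt
# Fetch the full documentation as a single file
curl -s https://dstack.ai/llms-full.txt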
AGENTS.md
We’ve also added AGENTS.md to help modern AI agents automatically understand how to interact with the repository and tooling.
Warning
Be sure to update to 0.19.40, which includes an additional important fix.
What's changed
- [Experimental] Added LEGACY_REPO_DIR feature flag (to drop the legacy repo dir) by @un-def in #3305
- [UX] Minor improvement for dstack server by @peterschmidt85 in #3308
- [Bug] Fix LEGACY_REPO_DIR_DISABLED feature flag by @un-def in #3314
- [Docs] Add sglang_router details in examples, gateway and refs by @Bihan in #3313
- [Docs] Add llms.txt and llms-full.txt by @peterschmidt85 in #3312
- Assert not authenticated 401 by @r4victor in #3318
- [Performance] Optimize per fleet offers by @r4victor in #3316
- [Docs] Add AGENTS.md by @r4victor in #3319
- [Docs] Generate external links CSS decoration automatically by @peterschmidt85 in #3320
- [Blog] SGLang router integration and disaggregated inference roadmap by @peterschmidt85 in #3323
- [Bug] Implement users soft-deletion by @r4victor in #3326
- [Bug] Fix gateway router field backward compatibility by @Bihan in #3327
Full changelog: 0.19.38...0.19.39
0.19.38
Gateways
Routers
dstack gateways now integrate with SGLang Model Gateway, enabling inference request routing with policies such as cache_aware, power_of_two, round_robin, and random. You can enable it by setting the router property in your gateway configuration to sglang and selecting any of the available routing policies.
Example configuration:
type: gateway
name: sglang-gateway
backend: aws
region: eu-west-1
domain: example.com
router:
type: sglang
policy: cache_aware
Read how the new router property works in the documentation.
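With this gateway in place, requests to a replicated service are distributed across its replicas according to the selected policy. Below is a minimal service sketch for illustration; the image, model, and launch command are hypothetical examples, not part of this release:
type: service
name: llm-service
image: lmsysorg/sglang:latest
commands:
  - python3 -m sglang.launch_server --model-path Qwen/Qwen2.5-7B-Instruct --port 8000
port: 8000
replicas: 2
resources:
  gpu: 24GB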
Fleets
Run plan
Since the 0.19.26 release, dstack has been provisioning instances according to configured fleets, but run plan offers didn't reflect that, meaning you might not have seen the actual offers used for provisioning.
This has now been fixed, and the run plan shows offers that respect the configured fleets.
For example, you can create a fleet for provisioning spot GPU instances on AWS:
type: fleet
name: cloud-fleet
nodes: 0..
backends: [aws]
spot_policy: spot
resources:
gpu: 1..
The run plan for submitted runs now shows offers that match the fleet configuration:
✗ dstack apply
...
# BACKEND RESOURCES INSTANCE TYPE PRICE
1 aws (us-east-1) cpu=4 mem=16GB disk=100GB T4:16GB:1 g4dn.xlarge $0.526
2 aws (us-east-2) cpu=4 mem=16GB disk=100GB T4:16GB:1 g4dn.xlarge $0.526
3 aws (us-west-2) cpu=4 mem=16GB disk=100GB T4:16GB:1 g4dn.xlarge $0.526
...
Shown 3 of 309 offers, $71.552 max
What's changed
- [Docs] Update to the latest mkdocs-material and add the contributing/DOCS.md by @peterschmidt85 in #3286
- [Docs] Describe some gateway options on Concepts/Gateways page by @un-def in #3287
- Expand max_duration reference by @r4victor in #3292
- [Docker] Fix ssh zombie processes issue by @un-def in #3295
- [Docs] Fix incorrect URLs by @peterschmidt85 in #3297
- [Blog] NVIDIA DGX Spark by @peterschmidt85 in #3298
- Return plan offers wrt fleets by @r4victor in #3300
- Show task nodes in run plan by @r4victor in #3301
- Log non-zero exit status in SSHTunnel.close/aclose by @un-def in #3296
- Fix in-place update when files are used by @un-def in #3289
- [Runpod] Require CUDA 12.8+ on the host by @peterschmidt85 in #3304
- Fix SSHAttach.detach() by @un-def in #3306
- Add SGLang Router Support by @Bihan in #3267
Full changelog: 0.19.37...0.19.38
0.19.37
CLI
dstack attach --logs --since
The dstack attach --logs command now supports a --since argument to show only the recent logs before following real-time logs. You can specify either a relative duration or an absolute timestamp:
# Show logs from a specific timestamp and follow real-time logs
> dstack attach my-task --logs --since 2025-11-05T08:54:15Z
# Show logs from the last 5 minutes and follow real-time logs
> dstack attach my-task --logs --since 5m
This is especially helpful for long-running services and training jobs when you don't need to load the entire log history.
Fleets
Placement groups for elastic fleets
Previously, dstack set up interconnected clusters with placement groups only for fleets with a static number of instances, such as nodes: 8. Now instances provisioned in fleets with placement: cluster always use placement groups, so you can use elastic fleets and still get the best connectivity:
type: fleet
name: cloud-fleet
placement: cluster
nodes: 0..
Important change: Multi-node tasks can now run only on fleets with placement: cluster.
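For example, a multi-node task targeting the cluster fleet above might look like this (a minimal sketch; the fleets property selects the fleet by name, and the command is illustrative):
type: task
name: train
nodes: 2
fleets: [cloud-fleet]
commands:
  - echo "Node $DSTACK_NODE_RANK of $DSTACK_NODES_NUM"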
Backends
Gateways on Kubernetes
Previously, gateway support on Kubernetes was limited to managed Kubernetes offerings with DNS-based load balancers, such as EKS. Support now extends to IP-based load balancers, such as those used by GKE and Nebius' mk8s.
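Gateway configuration stays the same as for cloud backends. A minimal sketch, assuming a kubernetes backend is configured in the project and example.com is a placeholder domain:
type: gateway
name: k8s-gateway
backend: kubernetes
domain: example.com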
What's Changed
- Support --since arg for dstack attach --logs command by @r4victor in #3268
- Fix instances disk in resources description by @r4victor in #3271
- [Internal] Add kubernetes type stubs by @un-def in #3272
- Forbid running multinode tasks on non-cluster fleets by @r4victor in #3277
- [Feature]: Make proxy service HTTP timeout configurable by @earandap in #3275
- [runner] Allow to compile runner on macOS (for development purposes only) by @peterschmidt85 in #3276
- Support placement groups for elastic fleets by @r4victor in #3282
- Kubernetes: change jump pod image, tune sshd options by @un-def in #3273
- Kubernetes: support IP based load balancers by @un-def in #3283
- Improved remote repo support by @peterschmidt85 in #3279
- Mention runpod in Clusters guide by @r4victor in #3284
- Improved remote repo support (#3279) by @peterschmidt85 in #3285
New Contributors
- @earandap made their first contribution in #3275
Full Changelog: 0.19.36...0.19.37
0.19.36
CLI
dstack ps
The output of dstack ps has been revamped to include colored statuses and a more compact resource view. Full resource details are still available in dstack ps --verbose.
dstack logs --since
The dstack logs command now supports a --since argument to show only recent logs. You can specify either a relative duration or an absolute timestamp:
# Show logs from a specific timestamp
> dstack logs logs-task --since 2025-11-05T08:54:15Z
# Show logs from the last 5 minutes
> dstack logs logs-task --since 5m
Kubernetes
Improved GPU allocation
The kubernetes backend now requests all available GPU types when scheduling jobs, instead of limiting to just the first available type. This enables more flexible scheduling in heterogeneous Kubernetes clusters with multiple GPU types.
Offers
Optional GPU requirements
When specifying GPU requirements with a lower bound of 0 (e.g., gpu: 0..8:24GB), dstack now includes non-GPU offers in addition to GPU instances. This allows for more flexible resource selection when GPU access is optional.
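For example, a run configuration using such a range might look like this (a minimal sketch; the name and command are illustrative):
type: task
name: flexible-task
commands:
  - python train.py
resources:
  gpu: 0..8:24GB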
What's changed
- Fix ComputeGroupModel migration table lock order by @r4victor in #3244
- Add DSTACK_FF_AUTOCREATED_FLEETS_DISABLED by @r4victor in #3251
- [Nebius] Pre-build a Docker image with nebius CLI bundled by @peterschmidt85 in #3248
- Fleet-first docs by @r4victor in #3242
- [Docs] Fix typo in fleets.md comment by @antoniojtorres in #3255
- [CLI] Improve the output of dstack ps by @peterschmidt85 in #3253
- [chore]: Drop temporary python-dxf patch by @jvstme in #3245
- Support --since arg for dstack logs command by @r4victor in #3258
- Exclude requirements.multinode for client backward compatibility by @r4victor in #3262
- Kubernetes: request all suitable GPUs by @un-def in #3259
- [Bug]: Using "files" directive with an SSH Fleet will have dstack-runner consume all ram and hang by @peterschmidt85 in #3263
- Include non-GPU offers on gpu: 0.. by @jvstme in #3264
New Contributors
- @antoniojtorres made their first contribution in #3255
Full Changelog: 0.19.35...0.19.36
0.19.35
Runpod
Instant Clusters
dstack now supports Runpod Instant Clusters, enabling multi-node tasks on Runpod:
✗ dstack apply -f nccl-tests.dstack.yaml -b runpod
Project main
User admin
Configuration .dstack/confs/nccl-tests-simple.yaml
Type task
Resources cpu=2.. mem=8GB.. disk=100GB.. gpu:1..8
Spot policy auto
Max price -
Retry policy -
Creation policy reuse-or-create
Idle duration 5m
Max duration -
Reservation -
# BACKEND RESOURCES INSTANCE TYPE PRICE
1 runpod (US-KS-2) cpu=128 mem=2008GB disk=100GB NVIDIA A100-SXM… $16.7…
A100:80GB:8
2 runpod (US-MO-1) cpu=128 mem=2008GB disk=100GB NVIDIA A100-SXM… $16.7…
A100:80GB:8
3 runpod cpu=160 mem=1504GB disk=100GB NVIDIA H100 80G… $25.8…
(CA-MTL-1) H100:80GB:8
...
Shown 3 of 5 offers, $34.464 max
Submit the run nccl-tests? [y/n]:
Runpod offers clusters of 2 to 8 nodes with H200, B200, H100, and A100 GPUs and InfiniBand networking up to 3200 Gbps.
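To provision such a cluster explicitly, you can define a fleet with placement: cluster on the runpod backend. A minimal sketch (the fleet name and GPU spec are illustrative):
type: fleet
name: runpod-cluster
nodes: 2
placement: cluster
backends: [runpod]
resources:
  gpu: H100:80GB:8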
What's Changed
- Fix postgres migrations deadlocks by @r4victor in #3220
- Detect nvidia inside WSL2 by @r4victor in #3221
- Fix examples link in contributing doc by @matiasinsaurralde in #3228
- Fix context usage in internal metrics package by @matiasinsaurralde in #3226
- Fix working_dir compatibility with pre-0.19.27 clients by @un-def in #3231
- Fix autocreated fleets warning by @r4victor in #3233
- Improve Go code error handling by @r4victor in #3230
- Support Runpod Instant Clusters by @r4victor in #3214
- Switch to nebius sdk 0.3 by @r4victor in #3222
- Do not terminate fleet instances on idle_duration at nodes.min by @r4victor in #3235
- [shim] Log successful API calls with trace level by @un-def in #3237
- Drop hardcoded Nebius InfiniBand fabrics by @jvstme in #3234
- [runner] Clone repo before working dir is created by @un-def in #3240
New Contributors
- @matiasinsaurralde made their first contribution in #3228
Full Changelog: 0.19.34...0.19.35
0.19.34
UI
Scheduled runs
The Run details page now shows Schedule and Next time for scheduled runs.
Finished times
The Run and Job list and details pages now display Finished times.
Backends
GCP
GCP G4 instances with NVIDIA RTX PRO 6000 GPUs are now generally available:
> dstack offer -b gcp --gpu RTXPRO6000
# BACKEND RESOURCES INSTANCE TYPE PRICE
1 gcp (us-central1) cpu=48 mem=180GB disk=100GB RTXPRO6000:96GB:1 g4-standard-48 $4.5001
2 gcp (us-central1) cpu=96 mem=360GB disk=100GB RTXPRO6000:96GB:2 g4-standard-96 $9.0002
3 gcp (us-central1) cpu=192 mem=720GB disk=100GB RTXPRO6000:96GB:4 g4-standard-192 $18.0003
4 gcp (us-central1) cpu=384 mem=1440GB disk=100GB RTXPRO6000:96GB:8 g4-standard-384 $36.0006
Also, GCP A4 instances are supported via reservations.
Runs
SSH keys
dstack now uses server-managed user SSH keys when starting new runs. This allows users to attach to the run from different machines, since the SSH key is automatically replicated to all clients. User-supplied SSH keys are still used if specified explicitly.
Docs
Kubernetes
The Kubernetes backend docs now include a list of required permissions.
What's Changed
- Support getting job metrics by run_id by @r4victor in #3201
- Document Kubernetes required permissions by @r4victor in #3202
- Show Schedule and Next run on Run page by @r4victor in #3203
- Fix next_triggered_at extra fields not permitted by @r4victor in #3207
- Add AI Assistance Notice by @r4victor in #3208
- Add InstanceOffer.backend_data + reserved GCP A4 by @jvstme in #3209
- Force uppercase Runpod volume regions by @r4victor in #3217
- Show Finished for runs and jobs in the UI by @r4victor in #3218
- Use server-managed user SSH keys for new runs by @jvstme in #3216
- [GCP] Support G4 instance type GA by @peterschmidt85 in #3213
Full Changelog: 0.19.33...0.19.34
0.19.33
Kubernetes
AMD
The kubernetes backend now allows you to run workloads on AMD GPU-enabled Kubernetes clusters.
UI
Dev environments
You can now configure and provision dev environments directly from the user interface.
(Demo video: create-dev-environment-2.mp4)
Note
CLI version 0.19.33 or later is required to attach to runs created from the UI.
GCP
Reservations
You can now configure specifically targeted GCP reservations in fleet configurations to leverage reserved compute capacity:
type: fleet
nodes: 4
placement: cluster
backends: [gcp]
reservation: my-reservation
For reservations shared between projects, use the full syntax:
type: fleet
nodes: 4
placement: cluster
backends: [gcp]
reservation: projects/my-proj/reservations/my-reservation
dstack will automatically locate the specified reservation, match offers to the reservation's properties, and provision instances within the reservation. If there are multiple reservations with the specified name, all of them will be considered for provisioning.
Note
Using reservations requires the compute.reservations.list permission in the project that owns the reservation.
G4 preview
If your GCP project has access to the preview G4 instance type, you can now try it out with dstack.
> dstack offer -b gcp --gpu RTXPRO6000
# BACKEND RESOURCES INSTANCE TYPE PRICE
1 gcp (us-central1) cpu=48 mem=180GB disk=100GB RTXPRO6000:96GB:1 g4-standard-48 $0
To use G4, enable its preview in the backend settings.
projects:
- name: main
backends:
- type: gcp
project_id: my-project
creds:
type: default
preview_features: [g4]
Hot Aisle
The hotaisle backend now supports 8x MI300X instances too.
> dstack offer -b hotaisle --gpu 8:MI300X
# BACKEND RESOURCES INSTANCE TYPE PRICE
1 hotaisle (us-michigan-1) cpu=104 mem=1792GB disk=12288GB MI300X:192GB:8 8x MI300X 104x Xeon Platinum 8470 $15.92
Docker
Default image
The default Docker image now uses CUDA 12.8 (updated from 12.1).
What's changed
- Fix version descriptions on Sky index page by @jvstme in #3153
- [Internal] Minor just improvements by @peterschmidt85 in #3173
- [Docker] Update the CUDA version in the default Docker image to 12.8 (from 12.1) by @peterschmidt85 in #3166
- [HotAisle] 4x MI300X 52x Xeon Platinum 8470 is skipped by @peterschmidt85 in #3175
- [Internal] Updated amd-smi CI/CD by @peterschmidt85 in #3179
- Make kubeconfig filename optional in server/config.yml by @r4victor in #3189
- [HotAisle] Support 8xMI300X instances by @peterschmidt85 in #3188
- [GCP] Support G4 preview instance type by @peterschmidt85 in #3181
- [UX] Improved UX of the project settings CLI section by @peterschmidt85 in #3183
- [UI] A prototype of the "Connect" section to show on the running dev environment page (WIP) by @peterschmidt85 in #3184
- Support GCP reservations by @jvstme in #3186
- [Feature]: Store user SSH key on the server by @peterschmidt85 in #3176
- [Internal] Extend _CREATE_USER_HOOKS with an optional config by @peterschmidt85 in #3192
- Run wizzard by @olgenn in #3191
- fix(VastAICompute): filter region before offer by @DragonStuff in #3193
- Kubernetes: add AMD GPU support by @un-def in #3178
- Fix CLI incompatibility with older /get_my_user by @jvstme in #3198
Full changelog: 0.19.32...0.19.33
0.19.32
Fleets
Nodes
Maximum number of nodes
The fleet nodes.max property is now respected, allowing you to limit the maximum number of instances in a fleet. For example, to allow at most 10 instances in the fleet:
type: fleet
name: cloud-fleet
nodes: 0..10
A fleet will be considered for a run only if the run can fit into the fleet without violating nodes.max. If you don't need to enforce an upper limit, you can omit it:
type: fleet
name: cloud-fleet
nodes: 0..
Backends
Nebius
Tags
The nebius backend now supports backend- and resource-level tags for tagging cloud resources provisioned via dstack:
type: nebius
creds:
type: service_account
# ...
tags:
team: my_team
user: jake
Credentials file
It's also possible to configure the nebius backend using a credentials file generated by the nebius CLI:
nebius iam auth-public-key generate \
--service-account-id <service account ID> \
--output ~/.nebius/sa-credentials.json
projects:
- name: main
backends:
- type: nebius
creds:
type: service_account
filename: ~/.nebius/sa-credentials.json
Hot Aisle
The hotaisle backend now supports multi-GPU VMs such as 2x MI300X and 4x MI300X:
dstack apply -f .local/.dstack.yml --gpu amd:2
The working_dir is not set — using legacy default "/workflow". Future versions will default to the image's working directory.
# BACKEND RESOURCES INSTANCE TYPE PRICE
1 hotaisle cpu=26 mem=448GB disk=12288GB 2x MI300X 26x Xeon… $3.98
(us-michigan-1) MI300X:192GB:2
What's changed
- Fix CLI compatibility with server 0.19.11 by @jvstme in #3145
- [Feature]: Nebius switch to using nebius iam auth-public-key generate by @peterschmidt85 in #3147
- [Docs] Move Plugins to Reference | Python API by @peterschmidt85 in #3148
- 404 error on GIT url by @robinnarsinghranabhat in #3149
- Fix idle duration: off and forbid negative durations by @r4victor in #3151
- [Docs]: GCP A4 cluster example by @jvstme in #3152
- Consider multinode replica inactive only if all jobs done by @r4victor in #3157
- Kubernetes: add NVIDIA GPU toleration by @un-def in #3160
- [Nebius] Support tags by @peterschmidt85 in #3158
- [Hot Aisle] Support multi-GPU VMs by @peterschmidt85 in #3154
- feat(docker): upgrade litestream to v0.5.0 by @DragonStuff in #3165
- [Blog] Orchestrating GPU workloads on Kubernetes by @peterschmidt85 in #3161
- Respect fleet nodes.max by @r4victor in #3164
- Fix kubeconfig via data reference by @r4victor in #3170
- [Docs] Fix kubernetes typos by @svanzoest in #3169
New contributors
- @robinnarsinghranabhat made their first contribution in #3149
- @DragonStuff made their first contribution in #3165
- @svanzoest made their first contribution in #3169
Full changelog: 0.19.31...0.19.32
0.19.31
Kubernetes
The kubernetes backend introduces many significant improvements and has now graduated from alpha to beta. It is much more stable and can be reliably used on GPU clusters for all kinds of workloads, including distributed tasks.
Here's what changed:
- Resource allocation now fully respects the user’s resources specification. Previously, it ignored certain aspects, especially the proper selection of GPU labels according to the specified gpu spec.
- Distributed tasks now fully work on Kubernetes clusters with fast interconnect enabled. Previously, this caused many issues.
- Added support for the privileged property (see the sketch after this list).
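For reference, here is a minimal sketch of requesting a privileged container in a run configuration (the name and command are illustrative):
type: task
name: privileged-task
privileged: true
commands:
  - ls /dev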
We’ve also published a dedicated guide on how to get started with dstack on Kubernetes, highlighting important nuances.
Warning
Be aware of breaking changes if you used the kubernetes backend before. The following properties in the Kubernetes backend configuration have been renamed:
- networking → proxy_jump
- ssh_host → hostname
- ssh_port → port
Additionally, the "proxy jump" pod and service names now include a dstack- prefix.
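After upgrading, a kubernetes backend configuration using the new property names might look like this (a minimal sketch; the kubeconfig path and proxy jump address are placeholders):
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    proxy_jump:
      hostname: 192.0.2.10
      port: 32000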
GCP
A4 spot instances with B200 GPUs
The gcp backend now supports A4 spot instances equipped with B200 GPUs. This includes provisioning both standalone A4 instances and A4 clusters with high-performance RoCE networking.
To use A4 clusters with high-performance networking, you must configure multiple VPCs in your backend settings (~/.dstack/server/config.yml):
projects:
- name: main
backends:
- type: gcp
project_id: my-project
creds:
type: default
vpc_name: my-vpc-0 # regular, 1 subnet
extra_vpcs:
- my-vpc-1 # regular, 1 subnet
roce_vpcs:
- my-vpc-mrdma # RoCE profile, 8 subnets
Then, provision a cluster using a fleet configuration:
type: fleet
nodes: 2
placement: cluster
availability_zones: [us-west2-c]
backends: [gcp]
spot_policy: spot
resources:
gpu: B200:8
Each instance in the cluster will have 10 network interfaces: 1 regular interface in the main VPC, 1 regular interface in the extra VPC, and 8 RDMA interfaces in the RoCE VPC.
Note
Currently, the gcp backend only supports A4 spot instances. Support for other options, such as flex and calendar scheduling via Dynamic Workload Scheduler, is coming soon.
CLI
dstack project is now faster
The USER column in dstack project list is now shown only when the --verbose flag is used.
This significantly improves performance for users with many configured projects, reducing execution time from ~20 seconds to as little as 2 seconds in some cases.
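For example:
# Fast listing, no USER column
> dstack project list
# Full listing, including the USER column
> dstack project list --verbose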
What's changed
- [Kubernetes] Request resources according to RequirementsSpec by @un-def in #3127
- [GCP] Support A4 spot instances with the B200 GPU by @jvstme in #3100
- [CLI] Move USER to dstack project list --verbose by @jvstme in #3134
- [Kubernetes] Configure /dev/shm if requested by @un-def in #3135
- [Backward incompatible] Rename properties in Kubernetes backend config by @un-def in #3137
- Support GCP A4 clusters by @jvstme in #3142
- Kubernetes: add multi-node support by @un-def in #3141
- Fix duplicate server log messages by @jvstme in #3143
- [Docs] Improve Kubernetes documentation by @peterschmidt85 in #3138
Full changelog: 0.19.30...0.19.31