Conversation

@jeffbolznv
Collaborator

@jeffbolznv jeffbolznv commented Jan 20, 2026

We had a bug where a set_tensor_async (using transfer_ctx) didn't get submitted before the graph_compute (using compute_ctx) that came after it. To avoid this sort of issue, just do everything in compute_ctx.

Remove transfer_cmd_pool, which was already unused.

Fixes #18786.

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner January 20, 2026 02:47
@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Jan 20, 2026
@jeffbolznv
Collaborator Author

The llama-embedding test is failing in CI; I assume this means the output is corrupt somehow. I'll take a closer look tomorrow.

@0cc4m
Collaborator

0cc4m commented Jan 20, 2026

Isn't there an advantage on certain hardware to using a transfer-specific queue? Maybe there's a way to solve this with synchronization between the queues.

@jeffbolznv
Collaborator Author

If the transfer queue can run in parallel with the compute work, that's potentially an advantage. But if we'd need to synchronize the queues against each other with semaphores to preserve the ordering that ggml expects, we'd lose that parallelism anyway. Also, these operations were already on the compute queue (their command buffers were created from compute_cmd_pool), so that hasn't changed. The commands in the ggml_backend_buffer_i interface still use the transfer queue.

@jeffbolznv jeffbolznv marked this pull request as draft January 20, 2026 14:57
@jeffbolznv
Collaborator Author

Hmm, I'm seeing a crash somewhere, set to draft for now.

@jeffbolznv jeffbolznv marked this pull request as ready for review January 20, 2026 15:06
@jeffbolznv
Collaborator Author

The crash was with the perf_logger enabled, fixed now.




Successfully merging this pull request may close these issues.

Eval bug: GGML_ASSERT(id >= 0 && id < n_expert) when running GPT-OSS 20B at any quantization and with any flags
