Conversation

@jeffbolznv
Collaborator

@jeffbolznv jeffbolznv commented Jan 20, 2026

We had a bug where a set_tensor_async (using transfer_ctx) didn't get submitted before the graph_compute (using compute_ctx) that came after it. To avoid this sort of issue, just do everything in compute_ctx.

Remove transfer_cmd_pool, which was already unused.

Fixes #18786.

@jeffbolznv jeffbolznv requested a review from 0cc4m as a code owner January 20, 2026 02:47
@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Jan 20, 2026
@jeffbolznv
Collaborator Author

The llama-embedding test is failing in CI; I assume this means the output is corrupt somehow. I'll take a closer look tomorrow.

@0cc4m
Collaborator

0cc4m commented Jan 20, 2026

Isn't there an advantage on certain hardware to using a transfer-specific queue? Maybe there's a way to solve this with synchronization between the queues.

@jeffbolznv
Collaborator Author

If the transfer queue can run in parallel with the compute work, that's potentially an advantage. But if we'd need to synchronize the queues against each other with semaphores to preserve the ordering that ggml expects, we'd lose that parallelism anyway. Also, these operations were already on the compute queue (their command buffers were created from compute_cmd_pool), so that hasn't changed. The commands in the ggml_backend_buffer_i interface still use the transfer queue.

@jeffbolznv jeffbolznv marked this pull request as draft January 20, 2026 14:57
@jeffbolznv
Collaborator Author

Hmm, I'm seeing a crash somewhere, set to draft for now.

@jeffbolznv jeffbolznv marked this pull request as ready for review January 20, 2026 15:06
@jeffbolznv
Collaborator Author

The crash was with the perf_logger enabled, fixed now.




Successfully merging this pull request may close these issues.

Eval bug: GGML_ASSERT(id >= 0 && id < n_expert) when running GPT-OSS 20B at any quantization and with any flags
