
Conversation

@sredman
Contributor

@sredman sredman commented Dec 16, 2025

Description

This PR implements the ability to select which version of ROCm to use when building llama.cpp, in a similar fashion to how we are able to select which version of CUDA to use.
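
At a high level, the idea is to expose the ROCm version through build arguments, mirroring the existing CUDA_MAJOR_VERSION / CUDA_MINOR_VERSION pattern. A minimal sketch of the intended usage (the default values and the build command below are only illustrative):

```Dockerfile
# Sketch only: mirrors the CUDA_MAJOR_VERSION / CUDA_MINOR_VERSION pattern;
# the defaults here are examples, not necessarily the values in the diff.
ARG ROCM_MAJOR_VERSION=6
ARG ROCM_MINOR_VERSION=4.3
# The two parts form the version path used by AMD's apt repo
# (https://repo.radeon.com/rocm/apt/), e.g. 6.4.3.
# To target a different ROCm release, override at build time, for example:
#   docker build --build-arg ROCM_MAJOR_VERSION=7 --build-arg ROCM_MINOR_VERSION=1.1 ...
```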

Notes for Reviewers

The motivation is that perfectly usable AMD cards are unsupported by newer ROCm versions. The Vulkan build works excellently on these cards, but it is not a good fit for distributed inference: according to the docs and my testing, it cannot process layers in parallel, so layers end up being processed sequentially. Some say it is also less performant.

I have some questions about how to finish this PR, and I have not yet been able to test the actual backends in my setup, so I am leaving this PR as a draft for now.

I have built with both version 6.3.3 and 7.1.1 (latest). I will only be able to test with 6.3.3, since I only have Vega 10 GPUs available.

Some questions for maintainers:

  • Should I include these flags in the base-level Dockerfile? Since backends are now built/shipped separately, I assume the usage in the base-level Dockerfile is no longer necessary.
  • Should I make similar changes to the other backend Dockerfiles? I likely will only be able to test llama-cpp.
  • Would you accept changing the github pipelines to build a v6 and v7 version of ROCm, similar to how we have a v11 and v12 build for CUDA?
    • Assuming the answer to the above question is "yes", would it be OK to make a breaking change to the ROCm backend name/tag, to better match the CUDA build? So we'd have something like *-gpu-amd-rocm-6-llama-cpp and *-gpu-amd-rocm-7-llama-cpp.

Signed commits

  • Yes, I signed my commits.

@netlify

netlify bot commented Dec 16, 2025

Deploy Preview for localai ready!

🔨 Latest commit: 4941f13
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/694865145d40560008c91d7c
😎 Deploy Preview: https://deploy-preview-7615--localai.netlify.app

@sredman sredman force-pushed the work/sredman/rocm-version-control branch from 55c9137 to 3540ea3 on December 16, 2025 19:28
@sredman
Contributor Author

sredman commented Dec 17, 2025

Unfortunately my testing for this has been unsuccessful. LocalAI loads the model, then at the point where it says it's doing a dry run to warm the caches it gives a generic error saying it has failed. I will debug some more, but any tips are welcome!

@sredman
Contributor Author

sredman commented Dec 17, 2025

Unfortunately my testing for this has been unsuccessful. LocalAI loads the model, then at the point where it says it's doing a dry run to warm the caches it gives a generic error saying it has failed. I will debug some more, but any tips are welcome!

The same shape of crash happens even with the current release of the ROCm backend! Curious. It looks like the release version is actually using a v6 version of ROCm so should regardless be compatible with my GPUs. I'll see if I can figure out what is going wrong.

@sredman sredman force-pushed the work/sredman/rocm-version-control branch 2 times, most recently from 9004896 to aa2a232 on December 18, 2025 15:09
@sredman
Contributor Author

sredman commented Dec 19, 2025

Responding to myself:

Should I include these flags in the base-level Dockerfile? Since backends are now built/shipped separately, I assume the usage in the base-level Dockerfile is no longer necessary.

Yes. The base-level Dockerfile needs to contain all the runtime libraries.
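
To make that concrete, here is a minimal sketch of what I mean (illustrative only; the package names and repo setup are my assumptions, not the actual LocalAI base image):

```Dockerfile
# Illustrative only, not the actual LocalAI base image.
FROM ubuntu:22.04
ARG ROCM_MAJOR_VERSION=6
ARG ROCM_MINOR_VERSION=4.3
# hipblas/rocblas are not in the stock Ubuntu repos, so the runtime libraries
# have to come from AMD's repo for the selected version (a real setup would
# verify AMD's signing key instead of using trusted=yes).
RUN echo "deb [trusted=yes] https://repo.radeon.com/rocm/apt/${ROCM_MAJOR_VERSION}.${ROCM_MINOR_VERSION} jammy main" \
        > /etc/apt/sources.list.d/rocm.list && \
    apt-get update && \
    apt-get install -y --no-install-recommends hipblas rocblas && \
    rm -rf /var/lib/apt/lists/*
```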

... ROCm 6.3.3 is the last release which supports GCN5.0 (Vega 10) cards ...

I don't know where I read this, but it is apparently false. I don't know which release was the last to support GCN5.0, but it was quite old (prior to ROCm 5.0.0, released in 2022).

Given the lack of support, it seems likely that a llama.cpp change is trying to use a feature which my cards do not support. LocalAI v2.29.0 works fine with the ROCm backend, while the LocalAI ROCm backend tagged v3.2.0 throws an error indicating "MUL_MAT" is not supported. I will experiment a bit more, but I think the best move for me is to switch to Vulkan.

Dockerfile Outdated
ARG CUDA_MAJOR_VERSION=12
ARG CUDA_MINOR_VERSION=0
ARG ROCM_MAJOR_VERSION=5
ARG ROCM_MINOR_VERSION=5.1 # ROCm version to append to the major version, in the format of their apt repo (https://repo.radeon.com/rocm/apt/). Like `0_alpha` or `3.4`.
Contributor Author

The goal is to update the ROCM_MAJOR_VERSION and ROCM_MINOR_VERSION to match whatever is deployed in LocalAI today. I haven't 100% figured this out. I believe it is v5.5.1, as that is the version in the Ubuntu 22.04 repo.

Contributor Author

AMD has tricked me again :)
hipblas-dev and rocblas-dev are not in the Ubuntu repos. This Dockerfile gets them because the build overrides BASE_IMAGE with BASE_IMAGE="rocm/dev-ubuntu-22.04:{tag}", which determines which version of the ROCm libraries becomes available.
As of this writing, v3.8.0 of LocalAI was built with v6.4.3 of ROCm. That's what I'll use for the defaults here.
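
In other words, for this backend the ROCm version is effectively selected by the base image tag. A minimal sketch of the mechanism (6.4.3 follows the note above; 7.1.1 is just an example of a newer release):

```Dockerfile
# Sketch: the ROCm libraries come from the base image, so the ROCm version is
# chosen by overriding BASE_IMAGE at build time.
ARG BASE_IMAGE=rocm/dev-ubuntu-22.04:6.4.3
FROM ${BASE_IMAGE}
# e.g. to build against ROCm 7.1.1 instead:
#   docker build --build-arg BASE_IMAGE=rocm/dev-ubuntu-22.04:7.1.1 .
```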

@sredman
Contributor Author

sredman commented Dec 19, 2025

Vega 10 cards aside, I think this PR still has value. I have tested it with the latest release (7.1.1) on my laptop with an integrated 780M (gfx1103). It fails because it is the primary GPU for my system, so ROCm ends up fighting with my display over memory, but it does appear to basically work, so I suspect this PR would enable support for other newer GPUs as well.

@sredman sredman force-pushed the work/sredman/rocm-version-control branch from a7f7e9c to 2c45d57 on December 19, 2025 15:32
@mudler mudler marked this pull request as ready for review December 19, 2025 20:31
@mudler mudler marked this pull request as draft December 19, 2025 20:31
@mudler
Owner

mudler commented Dec 19, 2025

Some questions for maintainers:

* Should I include these flags in the base-level Dockerfile? Since backends are now built/shipped separately, I assume the usage in the base-level Dockerfile is no longer necessary.

Correct, unless there are runtime deps that are really required.

* Should I make similar changes to the other backend Dockerfiles? I likely will only be able to test llama-cpp.

I would suggest just making the changes that you are able to test. We can take a phased approach for the other backends as necessary.

* Would you accept changing the github pipelines to build a v6 and v7 version of ROCm, similar to how we have a v11 and v12 build for CUDA?

Yes absolutely!

  * Assuming the answer to the above question is "yes", would it be OK to make a breaking change to the ROCm backend name/tag, to better match the CUDA build? So we'd have something like `*-gpu-amd-rocm-6-llama-cpp` and `*-gpu-amd-rocm-7-llama-cpp`.

Yes, sounds good. We are actually really close to the next release and there are already a few changes to the images (introducing CUDA 13, for example), so this fits perfectly in line with that.

@sredman sredman force-pushed the work/sredman/rocm-version-control branch from 238e1a8 to 9bc5fba on December 20, 2025 01:34
@sredman sredman force-pushed the work/sredman/rocm-version-control branch from 9bc5fba to 104376a on December 20, 2025 01:34
@sredman
Copy link
Contributor Author

sredman commented Dec 20, 2025

Slight commit history disaster due to the merge 😄

I will mark this PR as non-draft now. I have made the CI changes and the breaking changes to the image names. While I cannot reliably test this change, I believe it will at least do no harm: the v6 images should be identical to before, and if the v7 images are in some way broken, consumers can choose not to use them.

@sredman sredman marked this pull request as ready for review December 20, 2025 01:36
@sredman sredman force-pushed the work/sredman/rocm-version-control branch from 104376a to e8f17b7 on December 20, 2025 01:38
context: "./backend"
ubuntu-version: '2204'
- build-type: 'hipblas'
cuda-major-version: ""
Owner

Contributor Author

I am not sure I understand this comment correctly. Do you mean simply add rocm-*-version around L15 of backend_build.yaml? I have done that. If you mean something else, please guide me 😄

@sredman sredman force-pushed the work/sredman/rocm-version-control branch from 4ce161b to cfeeff8 on December 21, 2025 21:08
@github-actions github-actions bot added the kind/documentation (Improvements or additions to documentation) label on Dec 21, 2025