
Conversation

@junjunjd commented Dec 2, 2025

Add Gguf wrapper in quantized Gemma3 loader to fix missing metadata lookups.

Fixes #3215.

@ivarflakstad (Member)

@junjunjd Thanks for this!
I tried running the embedding-gemma model mentioned in the issue and unfortunately this doesn't fix the issue. Luckily the problem isn't as complex as it may seem.
Default gemma3 uses the prefix gemma3.*, while the embedding-gemma model uses gemma-embedding.* (ref).
ModelWeights in quantized_gemma3.rs is hardcoded to use gemma3.
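To confirm which prefix a given file actually uses, you can dump the GGUF metadata keys. A quick sketch (the file path is a placeholder):

use candle::quantized::gguf_file;

fn main() -> candle::Result<()> {
    // Parse the file's metadata (tensor data is not loaded here).
    let mut file = std::fs::File::open("embeddinggemma-300m.gguf")?;
    let content = gguf_file::Content::read(&mut file)?;
    for key in content.metadata.keys() {
        // embedding-gemma prints keys like "gemma-embedding.attention.head_count".
        println!("{key}");
    }
    Ok(())
}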
I'll post the findings in the issue as well.

@clocksmith (Contributor) commented Dec 30, 2025

@junjunjd Thanks for taking a crack at this! The GGUF wrapper is a nice cleanup for the tensor loading calls.

Unfortunately this doesn't quite fix the root issue. The problem is that md_get("gemma3.attention.head_count") still has the prefix hardcoded, and the wrapper passes through to the same metadata without transforming the keys.

As @ivarflakstad mentioned, if the actual prefix can be detected before the lookups happen, the problem can be addressed there. Luckily it can!

The fix

Right before this existing md_get closure on line 266 (currently not as robust):

let md_get = |s: &str| match ct.metadata.get(s) {
    None => candle::bail!("cannot find {s} in metadata"),
    Some(v) => Ok(v),
};

This can be added, along with a tweak to the md_get block above:

// NEW: detect which metadata prefix this GGUF file actually uses.
let prefix = ["gemma3", "gemma2", "gemma", "gemma-embedding"]
    .iter()
    .find(|p| ct.metadata.contains_key(&format!("{p}.attention.head_count")))
    .copied()
    .unwrap_or("gemma3");

// TWEAKED: md_get now prepends the detected prefix to every lookup.
let md_get = |s: &str| {
    let key = format!("{prefix}.{s}");
    match ct.metadata.get(&key) {
        None => candle::bail!("cannot find {key} in metadata"),
        Some(v) => Ok(v),
    }
};

Then replace all instances of md_get("gemma3.attention.head_count") with md_get("attention.head_count"), and likewise for the other prefixed lookups on lines 271-285 that follow.
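For illustration, a typical call site would change like this (a sketch; the exact lines in quantized_gemma3.rs may read slightly differently):

// Before: the prefix is baked into every lookup.
let head_count = md_get("gemma3.attention.head_count")?.to_u32()? as usize;

// After: md_get prepends the detected prefix itself.
let head_count = md_get("attention.head_count")?.to_u32()? as usize;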

Notes

BONUS: This proposed solution also adds robustness: future Gemma variants (gemma4, gemma-vision, etc.) can be supported by adding a single entry to the prefix list, with no other code changes.

NOTE: The Gguf wrapper from this PR could still be useful alongside this fix for a cleaner tensor API - they're complementary changes!

ALTS: I hit a similar issue in another project, where my solution was not as tight: https://github.com/clocksmith/gamma/blob/83d0565c938b18a363026070b526885682450c79/src/core/hardware/gguf_parser.py#L129. That also works, but the hardcoded list of names is probably better for a library of this maturity.
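For comparison, a fully dynamic variant in Rust (a sketch, not a port of the linked parser) would infer the prefix from whatever key carries the expected suffix, at the cost of silently accepting unknown model families:

// Sketch: take any metadata key ending in ".attention.head_count" and treat
// everything before the suffix as the prefix; fall back to "gemma3".
let prefix = ct
    .metadata
    .keys()
    .find_map(|k| k.strip_suffix(".attention.head_count"))
    .unwrap_or("gemma3")
    .to_string();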

Happy to help further if you'd like to update the PR.

@ivarflakstad (Member)

Closing since the root issue has been fixed. If you want to contribute with the intention of improving the gguf API, feel free :)
