feat(converter): Add support for Gemma 3N models #6120
Fixes #6049
Description
This pull request adds support for the new Gemma 3N family of models to the `SafetensorsCkptLoader`, enabling developers to convert these powerful, on-device multimodal models to the MediaPipe `.task` format.

Gemma 3N is specifically designed for high-performance on-device use. Among its architectural changes, the one relevant to conversion is that its checkpoints nest the language-model weights under a `language_model.` prefix. This PR updates the converter to correctly handle this nested structure, making these state-of-the-art models accessible to the MediaPipe community.
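To make the nesting concrete, here is a minimal sketch of the naming difference. The tensor names are illustrative, not copied from a real checkpoint:

```python
# Illustrative only: Gemma 3N checkpoints nest language-model tensors under
# a "language_model." prefix, while earlier flat Gemma checkpoints do not.
NESTED_PREFIX = "language_model."

def flatten_name(tensor_name: str) -> str:
    """Strip the nesting prefix so downstream mapping sees flat names."""
    if tensor_name.startswith(NESTED_PREFIX):
        return tensor_name[len(NESTED_PREFIX):]
    return tensor_name

# Nested (Gemma 3N) name -> flat name the converter already understands.
print(flatten_name("language_model.model.embed_tokens.weight"))
# model.embed_tokens.weight

# Flat names from earlier Gemma checkpoints pass through unchanged.
print(flatten_name("model.norm.weight"))
# model.norm.weight
```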
Changes Implemented
The implementation was carefully revised to ensure no regressions were introduced for previously supported models.
`mediapipe/tasks/python/genai/converter/safetensors_converter.py`
- Introduced an `is_nested` flag in the `GemmaMapper` constructor. This replaces a more model-specific approach and is designed to handle any Gemma model with the nested `language_model.` prefix.
- The `update_target_name` method now uses the `is_nested` flag to conditionally strip the prefix. This is a more robust solution that avoids breaking support for the existing `Gemma3-4B` model, which also uses the nested structure.
- `SafetensorsCkptLoader` has been updated to recognize the new special model names for the Gemma 3N series (e.g., `GEMMA3N_4B`, `GEMMA_3N_E2B_IT`, etc.). It passes the `is_nested=True` flag for both the new Gemma 3N models and the pre-existing `Gemma3-4B`, ensuring both are handled correctly.
- Fixed a naming inconsistency (`raw` vs. `raw_tensor`) in the `read_tensor_as_numpy` function.

`mediapipe/tasks/python/genai/converter/safetensors_converter_test.py`
- A new parameterized test (`testNestedGemmaConversion`) has been added. It covers both the new `GEMMA3N_4B` model and the existing nested `Gemma3-4B` model, and verifies that:
  - the `language_model.` prefix is correctly stripped from all relevant tensor names;
  - non-language components (`vision_tower`, `multi_modal_projector`) are correctly identified and skipped.

A Note on Testing
As detailed in the original issue, setting up the full local build environment for Python changes on an Apple Silicon Mac proved to be exceptionally challenging. Therefore, while the core logic has been thoroughly validated with the new parameterized unit tests, a full, end-to-end conversion could not be performed locally.
I would be grateful if the maintainers could rely on the project's CI pipeline and their own established environments for final validation. This contribution should enable many developers to bring the latest on-device models to their mobile applications.
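For reviewers, the shape of the new parameterized test can be approximated with a small stand-in. The model names, the `is_nested` flag, and the skipped components come from this PR's description; `convert_tensor_name` below is a hypothetical simplification of the converter's real mapping logic, not the actual `GemmaMapper` code:

```python
# Hypothetical stand-in for the converter's name mapping; the real logic
# lives in GemmaMapper.update_target_name in safetensors_converter.py.
_PREFIX = "language_model."
_SKIPPED_COMPONENTS = ("vision_tower", "multi_modal_projector")

def convert_tensor_name(name: str, is_nested: bool):
    """Return the mapped tensor name, or None if the tensor is skipped."""
    if any(name.startswith(c) for c in _SKIPPED_COMPONENTS):
        return None  # vision/multimodal tensors are not converted
    if is_nested and name.startswith(_PREFIX):
        return name[len(_PREFIX):]
    return name

# Parameterized over both nested models, mirroring testNestedGemmaConversion.
for model_name in ("GEMMA3N_4B", "Gemma3-4B"):
    # The language_model. prefix is stripped from relevant tensor names.
    assert convert_tensor_name(
        "language_model.model.embed_tokens.weight", is_nested=True
    ) == "model.embed_tokens.weight"
    # Vision and projector tensors are identified and skipped.
    assert convert_tensor_name("vision_tower.encoder.weight", is_nested=True) is None
    assert convert_tensor_name("multi_modal_projector.weight", is_nested=True) is None

print("all checks passed")
```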
Thank you for your consideration.