Open
Labels
module: training (Issues related to training models on edge devices)
Description
🚀 The feature, motivation and pitch
I’m working on a research project to fine-tune a model on Android devices. I am exploring ExecuTorch with the QNN or XNNPACK backend for inference acceleration, but I need the backend to support dynamic modification of weights (i.e., after initialization, allow updating the weights/biases and then running forward again).
What I found
- In executorch/extension/training/examples/XOR/train.cpp, training code based on ExecuTorch is provided, but it does not say which backends it supports.
- The official XNNPACK backend documentation describes that, during runtime initialization, weights/biases are “packed” (weight packing) into XNNPACK’s internal data structures, and the original preprocessed blob’s data is freed. This seems to imply that, from the perspective of the backend’s execution graph, the weights become static/immutable.
- I did not find any mechanism described in the docs or the runtime API to “unlock” or “update” those packed weights at runtime.
- There is an existing issue (PT2E Dynamic Quantization with XNNPACK Backend for Android fails Loading of method forward in runtime #11355) reporting that even dynamic quantization + XNNPACK on Android may fail to load the “forward” method, which suggests that non-static quantization / dynamic behavior is fragile or unsupported.
- For QNN backend, I saw open / triaged issues about compilation or binary loading, but none that explicitly mention support for runtime weight update.
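To make the weight-packing point above concrete, here is a small self-contained sketch in plain Python (this is illustrative only, not ExecuTorch or XNNPACK code): once a backend copies weights into an internal packed buffer at init time, later mutations of the source weights are invisible to inference.

```python
class PackedLinear:
    """Toy stand-in for a backend that packs weights at init time.

    Mirrors the behaviour described above: the constructor copies
    ("packs") the weights into an internal buffer, and the original
    data is never consulted again afterwards.
    """

    def __init__(self, weights):
        # "Weight packing": snapshot the values into an internal layout.
        self._packed = list(weights)

    def forward(self, xs):
        # Inference reads only the packed copy.
        return sum(w * x for w, x in zip(self._packed, xs))


weights = [1.0, 2.0]
layer = PackedLinear(weights)
before = layer.forward([3.0, 4.0])   # 1*3 + 2*4 = 11.0

# A fine-tuning step mutates the "original" weights...
weights[0] = 100.0
after = layer.forward([3.0, 4.0])    # ...but the packed copy is unchanged

assert before == after == 11.0
```

This is exactly the behavior a fine-tuning loop cannot work with: the update between two forward calls never reaches the executed graph.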
My questions
- Does ExecuTorch (with any of its backends: QNN, XNNPACK, Vulkan, etc.) currently support runtime in-place weight updates, i.e., treating model weights as mutable parameters that can be updated between forward calls, as required for fine-tuning / training / zeroth-order optimization?
- If not supported, is there a recommended workflow / workaround for on-device fine-tuning with ExecuTorch? Or is this explicitly out of scope?
- If it’s not currently supported, would the maintainers be open to considering such a feature in the future (e.g., a “mutable weight” delegate, or a mechanism to reload new weights into the backend graph)?
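For context on why per-step weight updates are the core requirement here, below is a minimal zeroth-order (SPSA-style) fine-tuning loop in plain Python. The names (`spsa_step`, `loss_fn`) are purely illustrative and not tied to any ExecuTorch API; `loss_fn(weights)` stands in for “load these weights into the backend, run forward, compute a loss”, which is the operation that needs the backend to accept new weights between calls.

```python
import random

random.seed(0)  # for reproducibility of this sketch

def spsa_step(weights, loss_fn, lr=0.01, eps=1e-3):
    """One SPSA update: estimate a directional gradient from two
    forward passes with perturbed weights, then update the weights.

    Note that each step needs *two* forward passes with different
    weights plus one in-place update -- three weight loads per step.
    """
    # Random +/-1 perturbation direction.
    delta = [random.choice((-1.0, 1.0)) for _ in weights]
    plus  = [w + eps * d for w, d in zip(weights, delta)]
    minus = [w - eps * d for w, d in zip(weights, delta)]
    # Two forward passes, each with different weights.
    g = (loss_fn(plus) - loss_fn(minus)) / (2 * eps)
    # Gradient-free update along the perturbation direction.
    return [w - lr * g * d for w, d in zip(weights, delta)]

# Toy objective: fit the weights to a target vector.
target = [0.5, -1.0]
loss = lambda ws: sum((w - t) ** 2 for w, t in zip(ws, target))

ws = [0.0, 0.0]
for _ in range(500):
    ws = spsa_step(ws, loss)

assert loss(ws) < loss([0.0, 0.0])  # loss decreased from the start
```

With only gradient-free forward passes, this style of optimizer could in principle run on an inference-only backend, but only if that backend exposes some way to swap or mutate the weights between calls.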
Thank you for your time and for developing ExecuTorch — it is a great tool for on-device inference / deployment, and I hope it can support on-device fine-tuning in the future.
Alternatives
No response
Additional context
No response
RFC (Optional)
No response
Status: To triage