Open
Labels
kind:chore (Internal improvements), note:upstream (The issue must be tackled upstream)
Description
Many models have an option to share parameters between the input embedding layer and an output dense layer. We need a solution for that in Axon, but I am opening this issue since we already have a lot of TODOs in the code for this specific problem.
The reason loading currently works is that the PyTorch .bin export includes entries for both layers, and both point to a tensor. In the case of safetensors, only one of the layers may be present, so this issue is a prerequisite for defaulting to safetensors (#255).
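To make the loading problem concrete, here is a minimal sketch in plain Python (not Axon or Bumblebee code). The parameter names, the `resolve_tied_params` helper, and the list of tied pairs are all hypothetical, and the sketch ignores any transposition that may be needed between an embedding kernel and a dense kernel. It only illustrates one possible approach: when the tied layer is missing from the checkpoint, fall back to the tensor of the layer it is tied to.

```python
# Hypothetical parameter names, used only for illustration.
EMBEDDING_KEY = "embedder.token_embedding.weight"
OUTPUT_KEY = "language_modeling_head.output.weight"


def resolve_tied_params(checkpoint, tied_pairs):
    """Return a copy of `checkpoint` in which every missing tied target
    is filled with the same tensor object as its source."""
    params = dict(checkpoint)
    for source, target in tied_pairs:
        if target not in params and source in params:
            params[target] = params[source]  # share the tensor, do not copy it
    return params


# Stand-in for a real tensor.
tensor = [[0.1, 0.2], [0.3, 0.4]]

# PyTorch .bin style export: both layer names are present.
bin_checkpoint = {EMBEDDING_KEY: tensor, OUTPUT_KEY: tensor}

# safetensors style export: only one of the layer names may be present.
safetensors_checkpoint = {EMBEDDING_KEY: tensor}

tied = [(EMBEDDING_KEY, OUTPUT_KEY)]
loaded_bin = resolve_tied_params(bin_checkpoint, tied)
loaded_st = resolve_tied_params(safetensors_checkpoint, tied)
```

With the .bin style input nothing changes, which matches why loading works today. With the safetensors style input the output layer is filled from the embedding, which is the step that currently has no proper home without explicit support for shared parameters.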
For additional discussion see #263.