[WIP] [tx] Fix loss_fn always defaulting to cross_entropy in SkyRL backend #1053
Draft
tyler-griggs wants to merge 2 commits into main
The SkyRL backend's forward_backward() had a loss_fn parameter defaulting to "cross_entropy" that was never passed by the engine, so all requests used cross_entropy regardless of what the user specified. Now derives loss_fn from prepared_batch.all_loss_fn_types, matching the AbstractBackend signature.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename all_loss_fn_types (list[int]) to all_loss_fns (list[str]). The int codes were a JAX-specific detail leaking into the shared batch interface. Now the engine stores the string names directly, the SkyRL backend uses them as-is, and the JAX backend converts to ints internally where needed for jax.lax.switch.

Also fixes the SkyRL backend always using "cross_entropy" regardless of what the user specified; it now reads loss_fn from the batch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
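The rename described in this commit can be pictured with a minimal sketch: the shared batch carries loss names as strings, and only the JAX-side code turns them into integer branch indices of the kind jax.lax.switch expects. Everything named here is an assumption for illustration (the `PreparedBatch` stand-in, `LOSS_FN_NAMES`, its ordering, and `loss_fn_indices`); none of it is taken from the actual diff.

```python
from dataclasses import dataclass

# Hypothetical registry: the real set of loss names and their order
# are not shown in this PR.
LOSS_FN_NAMES = ("cross_entropy", "importance_sampling", "ppo")
LOSS_FN_TO_INDEX = {name: i for i, name in enumerate(LOSS_FN_NAMES)}


@dataclass
class PreparedBatch:
    """Illustrative stand-in for the shared batch type."""
    all_loss_fns: list[str]  # string names, backend-agnostic


def loss_fn_indices(batch: PreparedBatch) -> list[int]:
    """Convert loss names to integer branch indices.

    This conversion would live inside the JAX backend only, so the
    int codes no longer leak into the shared batch interface.
    """
    try:
        return [LOSS_FN_TO_INDEX[name] for name in batch.all_loss_fns]
    except KeyError as exc:
        raise ValueError(f"unknown loss_fn: {exc.args[0]!r}") from exc


batch = PreparedBatch(all_loss_fns=["ppo", "cross_entropy", "ppo"])
indices = loss_fn_indices(batch)
```

Keeping the lookup table next to the branch list means the string names stay the single source of truth, and an unknown name fails loudly instead of silently selecting a default.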
Summary
- `forward_backward()` had a `loss_fn` parameter defaulting to `"cross_entropy"` that was never passed by the engine, so all requests silently used cross-entropy regardless of what the user specified (e.g. `importance_sampling`, `ppo`).
- Derives `loss_fn` from `prepared_batch.all_loss_fn_types` using a reverse mapping, matching how the JAX backend already handles this correctly.
- Removes the `loss_fn` parameter to match the `AbstractBackend` signature.
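The bug pattern and the fix summarized above can be sketched as follows. This is an illustration under stated assumptions, not the code from the diff: the integer codes in `LOSS_FN_TYPES`, the `PreparedBatch` stand-in, and the rule that a batch must carry a single loss type are all hypothetical, and the functions return the chosen loss name instead of running a training step.

```python
from dataclasses import dataclass

# Hypothetical codes: the real mapping is not shown in this PR.
LOSS_FN_TYPES = {"cross_entropy": 0, "importance_sampling": 1, "ppo": 2}
# Reverse mapping from int code back to loss name.
LOSS_FN_NAMES = {code: name for name, code in LOSS_FN_TYPES.items()}


@dataclass
class PreparedBatch:
    """Illustrative stand-in for the engine's prepared batch."""
    all_loss_fn_types: list[int]


def forward_backward_before(batch: PreparedBatch, loss_fn: str = "cross_entropy") -> str:
    # Bug pattern: the engine calls forward_backward(batch) without
    # loss_fn, so the default always wins and the batch is ignored.
    return loss_fn


def forward_backward_after(batch: PreparedBatch) -> str:
    # Assumption: every example in the batch shares one loss type.
    codes = set(batch.all_loss_fn_types)
    if len(codes) != 1:
        raise ValueError(f"expected one loss type per batch, got {sorted(codes)}")
    return LOSS_FN_NAMES[codes.pop()]


batch = PreparedBatch(all_loss_fn_types=[2, 2])
before = forward_backward_before(batch)  # engine never passes loss_fn
after = forward_backward_after(batch)    # loss is read from the batch
```

With the parameter gone, there is no default left to fall back on, so a caller cannot end up with cross-entropy by omission.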