Skip to content

[WIP] [tx] Fix loss_fn always defaulting to cross_entropy in SkyRL backend#1053

Draft
tyler-griggs wants to merge 2 commits intomainfrom
tyler/fix-loss-fn-dispatch
Draft

[WIP] [tx] Fix loss_fn always defaulting to cross_entropy in SkyRL backend#1053
tyler-griggs wants to merge 2 commits intomainfrom
tyler/fix-loss-fn-dispatch

Conversation

@tyler-griggs
Copy link
Member

@tyler-griggs tyler-griggs commented Feb 8, 2026

Summary

  • The SkyRL backend's forward_backward() had a loss_fn parameter defaulting to "cross_entropy" that was never passed by the engine, so all requests silently used cross-entropy regardless of what the user specified (e.g. importance_sampling, ppo).
  • Now derives loss_fn from prepared_batch.all_loss_fn_types using a reverse mapping, matching how the JAX backend already handles this correctly.
  • Also removes the extra loss_fn parameter to match the AbstractBackend signature.

The SkyRL backend's forward_backward() had a loss_fn parameter defaulting
to "cross_entropy" that was never passed by the engine, so all requests
used cross_entropy regardless of what the user specified. Now derives
loss_fn from prepared_batch.all_loss_fn_types, matching the AbstractBackend
signature.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename all_loss_fn_types (list[int]) to all_loss_fns (list[str]). The int
codes were a JAX-specific detail leaking into the shared batch interface.
Now the engine stores the string names directly, the SkyRL backend uses
them as-is, and the JAX backend converts to ints internally where needed
for jax.lax.switch.

Also fixes the SkyRL backend always using "cross_entropy" regardless of
what the user specified — it now reads loss_fn from the batch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@tyler-griggs tyler-griggs changed the title [tx] Fix loss_fn always defaulting to cross_entropy in SkyRL backend [WIP] [tx] Fix loss_fn always defaulting to cross_entropy in SkyRL backend Feb 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant