Fixes device decoupling of RL Games training vs. sim device #3997
Greptile Overview
Greptile Summary
This PR enables device decoupling for RL Games, allowing simulation to run on one device (e.g., CPU) while training runs on another (e.g., GPU). The key changes are:
Core Implementation:
How It Works:
Testing:
The implementation is clean, well-documented, and follows the existing architecture. Device transfers use
Confidence Score: 5/5
Important Files Changed
File Analysis
Sequence Diagram
sequenceDiagram
participant Policy as RL Policy<br/>(rl_device)
participant Wrapper as RlGamesVecEnvWrapper
participant Env as Isaac Environment<br/>(sim_device)
Note over Policy,Env: Training Step Flow
Policy->>Wrapper: step(actions)<br/>[on rl_device]
Wrapper->>Wrapper: actions.to(sim_device)<br/>Transfer to simulation
Wrapper->>Env: step(actions)<br/>[on sim_device]
Env-->>Wrapper: obs, rew, term, trunc<br/>[on sim_device]
Wrapper->>Wrapper: _process_obs(obs_dict)<br/>Transfer obs to rl_device
Wrapper->>Wrapper: rew.to(rl_device)<br/>dones.to(rl_device)<br/>extras.to(rl_device)
Wrapper-->>Policy: obs, rew, dones, extras<br/>[on rl_device]
Note over Policy,Env: Reset Flow
Policy->>Wrapper: reset()
Wrapper->>Env: reset()
Env-->>Wrapper: obs_dict<br/>[on sim_device]
Wrapper->>Wrapper: _process_obs(obs_dict)<br/>Transfer to rl_device
Wrapper-->>Policy: obs<br/>[on rl_device]
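The step and reset flows in the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code from this PR: `FakeTensor`, `ToyEnv`, and `DeviceDecouplingWrapper` are hypothetical names invented for the example, and `FakeTensor` is a stand-in for `torch.Tensor` that only tracks a device tag so the sketch runs without PyTorch or a GPU. Only `_process_obs`, `rl_device`, and `sim_device` are taken from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor: data plus a device tag."""

    data: tuple
    device: str

    def to(self, device: str) -> "FakeTensor":
        # Mirror torch.Tensor.to: no copy when already on the target device.
        return self if device == self.device else FakeTensor(self.data, device)


class ToyEnv:
    """Hypothetical environment whose tensors all live on sim_device."""

    def __init__(self, sim_device: str):
        self.device = sim_device

    def reset(self):
        return {"policy": FakeTensor((0.0,), self.device)}

    def step(self, actions):
        # The simulation only accepts actions that are already on its device.
        assert actions.device == self.device, "actions must be on sim_device"
        obs = {"policy": FakeTensor(actions.data, self.device)}
        rew = FakeTensor((1.0,), self.device)
        terminated = FakeTensor((False,), self.device)
        truncated = FakeTensor((False,), self.device)
        extras = {"episode_len": FakeTensor((1,), self.device)}
        return obs, rew, terminated, truncated, extras


class DeviceDecouplingWrapper:
    """Sketch of the wrapper role in the diagram (assumed structure)."""

    def __init__(self, env, rl_device: str):
        self.env = env
        self.sim_device = env.device
        self.rl_device = rl_device

    def _process_obs(self, obs_dict):
        # Move every observation group from sim_device to rl_device.
        return {key: value.to(self.rl_device) for key, value in obs_dict.items()}

    def reset(self):
        return self._process_obs(self.env.reset())

    def step(self, actions):
        # rl_device -> sim_device before stepping the simulation.
        actions = actions.to(self.sim_device)
        obs, rew, terminated, truncated, extras = self.env.step(actions)
        # An episode is done if it terminated or was truncated.
        dones = FakeTensor(
            tuple(t or u for t, u in zip(terminated.data, truncated.data)),
            terminated.device,
        )
        # sim_device -> rl_device for everything handed back to the policy.
        extras = {key: value.to(self.rl_device) for key, value in extras.items()}
        return (
            self._process_obs(obs),
            rew.to(self.rl_device),
            dones.to(self.rl_device),
            extras,
        )


# Usage: simulate on "cpu", train on "cuda:0" (device strings are only tags here).
wrapper = DeviceDecouplingWrapper(ToyEnv("cpu"), rl_device="cuda:0")
first_obs = wrapper.reset()
next_obs, rew, dones, extras = wrapper.step(FakeTensor((0.5,), "cuda:0"))
```

The point of the design is that the policy never sees a tensor on `sim_device` and the environment never sees one on `rl_device`; all transfers happen at the wrapper boundary. With real tensors, the same `.to(device)` calls are no-ops when both devices are identical.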
6 files reviewed, no comments
ooctipus
left a comment
The test is a bit long to read and feels like it could be refactored to be a bit nicer, but everything else looks pretty good.
It feels like this test file could be refactored a bit :D. That way steps 1-5 may be clearer and easier to follow in the future.
Description
Allows RL Games to decouple the devices used for simulation and training. This should allow running simulation on the CPU and training on the GPU.
Type of change
Checklist
pre-commit checks with ./isaaclab.sh --format
config/extension.toml file
CONTRIBUTORS.md or my name already exists there