Open
Description
I have been trying to run the NewsVendor environment with the RL and environment configurations copied from the example notebooks, but the reward stays stuck at around -20,000 after roughly 500 iterations.
Which configuration settings should I adjust to see the reward improve?
