`truncated_generalized_advantage_estimation` should default `stop_target_gradients` to `True`
https://github.com/deepmind/rlax/blob/383f93bc8b33c3d1bc28f15e1e07fc5104c790ea/rlax/_src/multistep.py#L279
The `False` case applies only to the meta-gradient use case, which is rare in common agents. This option should default to `True` to avoid usage bugs.
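To illustrate the kind of bug I mean, here is a minimal sketch. `gae_sketch` is a hypothetical stand-in I wrote for this issue, not rlax's actual implementation; it only reuses the argument names. `pg_loss` and the toy arrays are likewise made up. It shows that without stopping gradients, a policy-gradient-style loss also backpropagates into the value estimates through the advantages.

```python
import jax
import jax.numpy as jnp


def gae_sketch(r_t, discount_t, lambda_, values, stop_target_gradients):
    """Hypothetical stand-in for truncated GAE (not rlax's code)."""
    # One-step TD errors: delta_t = r_t + discount_t * v_{t+1} - v_t.
    delta_t = r_t + discount_t * values[1:] - values[:-1]

    # Backward recursion: A_t = delta_t + discount_t * lambda_ * A_{t+1}.
    def step(next_adv, inputs):
        delta, discount = inputs
        adv = delta + discount * lambda_ * next_adv
        return adv, adv

    _, advantages = jax.lax.scan(
        step, jnp.zeros(()), (delta_t, discount_t), reverse=True)
    if stop_target_gradients:
        # Treat the advantages as constants with respect to `values`.
        advantages = jax.lax.stop_gradient(advantages)
    return advantages


# Toy trajectory of length 3 (values has one extra bootstrap entry).
r_t = jnp.array([1.0, 0.0, 1.0])
discount_t = jnp.array([0.9, 0.9, 0.9])
values = jnp.array([0.5, 0.4, 0.3, 0.2])
log_probs = jnp.array([-0.1, -0.2, -0.3])  # stand-in for policy log-probs


def pg_loss(values, stop_target_gradients):
    adv = gae_sketch(r_t, discount_t, 0.95, values, stop_target_gradients)
    return -jnp.mean(log_probs * adv)


# Gradient of the policy loss with respect to the value estimates.
grad_stopped = jax.grad(pg_loss)(values, True)
grad_leaky = jax.grad(pg_loss)(values, False)
print("stop_target_gradients=True: ", grad_stopped)
print("stop_target_gradients=False:", grad_leaky)
```

With `True` the policy loss contributes no gradient to the values. With `False` it does, which is what a meta-gradient setup wants but is easy to trigger by accident in an ordinary actor-critic agent.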
WDYT?