Description
I recently cloned the repository and tried to run the default example following the README instructions, but the training process fails to converge. It does not converge when using QMix.
Modifications
I have not modified any core logic.
The only change made was adding TensorBoard logging for visualization.
Steps to Reproduce
Clone the repository.
Install dependencies.
Run the following command:
python benchmarl/run.py algorithm=mappo task=vmas/balance
Expected Behavior
The model should show learning progress and converge on the vmas/balance task, with the distance to the goal decreasing steadily over training.
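A minimal sketch of how "learning progress" could be checked numerically rather than by eye. It assumes the distance-to-goal values have been exported from the logs as a plain list of positive floats. The function name `is_improving`, the window size, and the 10% threshold are hypothetical choices for illustration and are not part of the repository.

```python
from statistics import mean


def is_improving(distances, window=10, min_drop=0.1):
    """Return True if the mean of the last `window` values is at least
    `min_drop` (as a fraction) below the mean of the first `window` values.

    Assumes `distances` holds positive distance-to-goal values in
    training order (an assumption about how the metric is logged).
    """
    if len(distances) < 2 * window:
        raise ValueError("need at least 2 * window logged values")
    start = mean(distances[:window])
    end = mean(distances[-window:])
    return end <= start * (1.0 - min_drop)


# Synthetic values for illustration only; these are not real training results.
decreasing = [1.0 - 0.02 * i for i in range(40)]
flat = [1.0] * 40

print(is_improving(decreasing))
print(is_improving(flat))
```

Comparing window means instead of single points keeps the check from being dominated by noise in individual evaluation steps.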
Actual Behavior
Training does not converge. Please see the attached TensorBoard screenshots below:
