rlbook

Code for my walkthrough of Reinforcement Learning: An Introduction by Richard Sutton and Andrew Barto.

Installation

Install uv (the command below is for Linux):

curl -LsSf https://astral.sh/uv/install.sh | sh

Link to instructions for other operating systems

Install the rlbook environment via uv:

uv sync --extra gpu

Run commands by first activating the rlbook venv (this is my preferred workflow):

source .venv/bin/activate

Documentation

Available at https://joseph-jnl.github.io/rlbook/.

Quickstart

Algorithm implementations are located in the /src directory, while the scaffolding code and notebooks for recreating and exploring the examples in Sutton & Barto are organized under the experiments/ directory.

For example, to recreate Figure 2.2, navigate to /experiments/ch2_bandits/ and run:

python run.py -m run.steps=1000 run.n_runs=2000 +bandit.epsilon=0,0.01,0.1 +bandit.random_argmax=true experiment.tag=fig2.2 experiment.upload=true

Figure 2.2 (rlbook): The +bandit.random_argmax=true flag switches to an argmax implementation that breaks ties randomly, rather than returning the first occurrence as the default numpy implementation does, to better align with the original example. Link to wandb artifact
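To illustrate the tie-breaking difference described above, here is a minimal standard-library sketch. It is not the code in /src: the function names first_argmax and random_argmax and the q_estimates list are hypothetical, chosen only for this example. first_argmax mimics the first-occurrence rule that numpy's argmax follows, and random_argmax picks uniformly among all maximal entries.

```python
import random


def first_argmax(values):
    """Index of the first maximal entry (the tie-breaking rule np.argmax follows)."""
    # Python's max() returns the first maximal item it encounters.
    return max(range(len(values)), key=values.__getitem__)


def random_argmax(values, rng):
    """Index of a maximal entry, with ties broken uniformly at random."""
    best = max(values)
    ties = [i for i, v in enumerate(values) if v == best]
    return rng.choice(ties)


rng = random.Random(0)       # seeded so the sketch is reproducible
q_estimates = [0.0] * 10     # hypothetical: all ten arms start with equal estimates

first_picks = {first_argmax(q_estimates) for _ in range(1000)}
random_picks = {random_argmax(q_estimates, rng) for _ in range(1000)}

print(first_picks)           # only arm 0 is ever selected
print(sorted(random_picks))  # several different arms are selected
```

With all estimates tied, as they are at the start of a bandit run with zero-initialized values, the first-occurrence rule always selects arm 0, whereas random tie-breaking spreads the early greedy choices across arms.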

Further details on the experimental setup and results can be found in the corresponding chapter READMEs.

Chapter Links

Chapter 2: Multi-armed Bandits
