Reward functions you can run on your videos:
- TOPReward — from TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics by @jcoleharrison, @chinsengi, UW
- Robometer — from Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons by @ygtkorkmaz, @aliang8, USC
- RoboReward — from RoboReward: General-Purpose Vision-Language Reward Model for Robotics by @teetone, @ajwagenmaker, @kpertsch, Stanford & Berkeley
- Generative Value Learning (GVL) — from Vision Language Models are In-Context Value Learners, Google DeepMind
- Brute Force — at each frame, sends the video up to that point to the VLM and asks for a progress score between 0.0 and 1.0
...and easy to add more!
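The Brute Force baseline above can be sketched roughly as follows. This is an illustrative sketch only: `bruteforce_progress` and the `query_vlm` callable (which stands in for the actual VLM request) are hypothetical names, not part of this repo.

```python
def bruteforce_progress(frames, instruction, query_vlm):
    """At each frame, send the video-so-far plus the task instruction to a
    VLM and parse its reply as a progress score in [0.0, 1.0]."""
    prompt = (
        f"Task: {instruction}\n"
        "Given the video so far, reply with one number between 0.0 and 1.0 "
        "indicating how much of the task is complete."
    )
    scores = []
    for t in range(1, len(frames) + 1):
        reply = query_vlm(frames[:t], prompt)  # frames up to and including t
        try:
            # Clamp the reply into the valid [0.0, 1.0] range.
            score = min(1.0, max(0.0, float(reply.strip())))
        except ValueError:
            # Unparseable reply: carry the last valid score forward.
            score = scores[-1] if scores else 0.0
        scores.append(score)
    return scores
```

Note the cost profile: the video prefix is re-sent at every frame, so token usage grows quadratically with video length — one reason the learned reward models above can be preferable.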
- Create an MP4 video of robot manipulation (example). For efficiency, please downsize the file to 480p, as image pixels are passed as tokens.
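A minimal sketch of one way to do the 480p downsizing, assuming `ffmpeg` is installed (the helper name is illustrative, not part of this repo):

```python
import subprocess

def downsize_to_480p(src: str, dst: str, run=subprocess.run):
    """Build and run an ffmpeg command that scales `src` to 480p.

    `scale=-2:480` requests a 480-px height and lets ffmpeg pick an even
    width that preserves the aspect ratio (many encoders require even
    dimensions). `run` is injectable so the command can be tested.
    """
    cmd = ["ffmpeg", "-y", "-i", src, "-vf", "scale=-2:480", dst]
    run(cmd, check=True)
    return cmd
```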
- Install prerequisites:

  ```shell
  virtualenv venv
  . venv/bin/activate
  pip install -r requirements.txt
  pip install torch torchvision transformers accelerate qwen-vl-utils
  ```
- Run the reward algorithms on your video:

  A. Run `topreward`, `roboreward`, `gvl`, and/or `bruteforce_vlm`

  Run the script to calculate reward functions on your video:

  ```shell
  python run_rewards.py --method topreward,roboreward,gvl,bruteforce_vlm --video <myvideo.mp4> --instruction <instructions, e.g. "create a tower of 5 cubes">
  ```

  Notes:
  - If you are running `gvl` or `bruteforce_vlm`, you must include an OpenAI API key: `--openai-api-key <your key>`
  - If you are running `topreward` and/or `roboreward`, you'll need at least 16 GB of unified/GPU memory.
  B. Run `robometer`

  Compute the reward using the custom script here.
- View the results in your browser:

  ```shell
  ./run_viewer.sh
  ```
