This codebase generates the in-domain QA data items used for VLM pretraining. It is built on open-pi-zero; refer to open-pi-zero for more details on the codebase.
Create a conda virtual environment and activate it:

```shell
conda create -n vlaser python=3.10
conda activate vlaser
```
Install the dependencies using requirements.txt:

```shell
pip install -r requirements.txt
```

If you run into problems installing from requirements.txt, you can instead install the dependencies from InternVL and open-pi-zero step by step.
To generate data, run:

```shell
cd data-pipeline
bash slurm/data_generator.sh [bridge|fractal] [general|spatial|grounding]
```

This script supports data generation for WidowX (bridge) and Google Robot (fractal), covering three data types: general QA, spatial intelligence QA, and grounding QA.
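The generator writes QA items in jsonl format (one JSON object per line). As a minimal sketch, the items can be loaded and inspected like this; note that the field names `image`, `question`, and `answer` are assumptions for illustration, not the pipeline's documented schema:

```python
import json
from pathlib import Path

def load_qa_items(jsonl_path):
    """Read one QA item (a JSON object) per line from a jsonl file."""
    items = []
    with open(jsonl_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                items.append(json.loads(line))
    return items

# Demo with a hypothetical item; the real fields may differ.
sample = {"image": "bridge/ep0/frame0.jpg",
          "question": "What object is on the table?",
          "answer": "A red cup."}
path = Path("sample_qa.jsonl")
path.write_text(json.dumps(sample) + "\n", encoding="utf-8")
print(len(load_qa_items(path)))  # → 1
```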
For data quality filtering, please refer to:

```shell
cd data-pipeline
python src/agent/filter.py \
    --input_folder xxx \
    --image_root xxx \
    --output_root xxx
```

- `--input_folder`: path to the jsonl files containing the data generated above
- `--image_root`: path to the image root
- `--output_root`: output path for the data items after filtering

For RoboTwin 2.0, run:

```shell
cd data-pipeline/RoboTwin-QA
python [GeneralQA|GroundingQA|SpatialQA].py
```

This script supports data generation for RoboTwin 2.0, covering the same three data types: general QA, spatial intelligence QA, and grounding QA.
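The actual criteria used by `filter.py` are not documented here. Purely as an illustration of the keep/drop pattern over jsonl items, a sketch might look like the following; the `answer` field name and the length threshold are made-up assumptions, not the real filtering logic:

```python
import json

def keep_item(item, min_answer_words=3):
    """Illustrative quality check: drop items with empty or very short answers.
    NOTE: this criterion is hypothetical, not filter.py's actual logic."""
    answer = item.get("answer", "").strip()
    return len(answer.split()) >= min_answer_words

def filter_jsonl(in_path, out_path):
    """Copy items that pass keep_item from in_path to out_path; return kept count."""
    kept = 0
    with open(in_path, encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            line = line.strip()
            if not line:
                continue
            item = json.loads(line)
            if keep_item(item):
                fout.write(json.dumps(item) + "\n")
                kept += 1
    return kept
```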
This codebase also supports a third-party implementation of $\pi_{0}$; you can find the details here.
This codebase is based on open-pi-zero. Thank you for the great work!