Michele Brienza1, Francesco Argenziano1, Vincenzo Suriani2, Domenico D. Bloisi3, Daniele Nardi1

1 Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
2 School of Engineering, University of Basilicata, Potenza, Italy
3 International University of Rome UNINT, Rome, Italy
This project uses the G-PlanET dataset, which must be downloaded from Hugging Face:
Dataset: yuchenlin/G-PlanET
You can download the dataset using the Hugging Face `datasets` library:

```python
from datasets import load_dataset

dataset = load_dataset("yuchenlin/G-PlanET")
```

Or using the Hugging Face CLI:

```bash
huggingface-cli download yuchenlin/G-PlanET
```

The trails (ID and image) used in this project are available on the project website:

Website: https://lab-rococo-sapienza.github.io/map-vlm/
```bash
pip install -r requirements.txt
```

To evaluate the generated plans, you need to install the PG2S metric library:

```bash
pip install pg2s
```

Or install from source:

```bash
git clone https://github.com/Lab-RoCoCo-Sapienza/pg2s
cd pg2s
pip install .
```

The main script `test.py` processes JSONL entries and generates planning outputs using both table-based and vision-based agents.
```bash
python test.py
```

Arguments:

- `--jsonl`: Path to the input JSONL file (default: `example.jsonl`)
- `--limit`: Maximum number of records to process (default: `1`)
- `--id`: Process only the record with the matching ID (optional)
- `--image`: Path to the image file for vision planning (default: `4.jpg`)
- `--model-table`: OpenAI model for table planning (default: `gpt-4o`)
- `--model-vision`: OpenAI model for vision planning (default: `gpt-4o`)
- `--output-dir`: Directory to save the generated plans (default: `output_plans`)
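The record selection implied by `--jsonl`, `--limit`, and `--id` can be sketched as follows. This is a minimal illustration, not the actual `test.py` code, and the `id` field name in each JSONL entry is an assumption:

```python
import json

def select_records(jsonl_path, limit=1, record_id=None):
    """Read a JSONL file and return up to `limit` records, or only the
    record whose (assumed) 'id' field matches `record_id` when given."""
    selected = []
    with open(jsonl_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if record_id is not None:
                # --id mode: return only the matching record
                if record.get("id") == record_id:
                    return [record]
                continue
            # --limit mode: stop once enough records are collected
            selected.append(record)
            if len(selected) >= limit:
                break
    return selected
```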
Process a single record:

```bash
python test.py --jsonl example.jsonl --limit 1
```

Process multiple records with a custom output directory:

```bash
python test.py --jsonl example.jsonl --limit 5 --output-dir ./results
```

Process a specific record by ID:

```bash
python test.py --jsonl example.jsonl --id "record_123"
```

For each processed record, the script creates a subdirectory `record_{id}` containing:
- `input_table.txt` - Input table in markdown format
- `single_agent_table.txt` - Plan generated by the single agent with table
- `multi_agent_table_env.txt` - Environment summary from the multi-agent with table
- `multi_agent_table_plan.txt` - Plan generated by the multi-agent with table
- `single_agent_vision.txt` - Plan generated by the single agent with vision
- `multi_agent_vision.txt` - Plan generated by the multi-agent with vision
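To post-process the results, the per-record output files can be gathered back into a dictionary. This is a sketch of one possible approach, assuming the `record_{id}` layout described above:

```python
from pathlib import Path

def collect_plans(output_dir="output_plans"):
    """Map each record ID to {file_stem: file_contents} for the .txt
    files saved in its record_{id} subdirectory."""
    results = {}
    for record_dir in sorted(Path(output_dir).glob("record_*")):
        record_id = record_dir.name[len("record_"):]
        results[record_id] = {
            f.stem: f.read_text() for f in record_dir.glob("*.txt")
        }
    return results
```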
This project uses the PG2S metric to evaluate the quality of generated plans.
```python
from pg2s.metric import pg2s_score

plans = {
    "task-1": {
        'truth': [
            'Turn around and walk to the sink.',
            'Take the left glass out of the sink.',
            'Turn around and walk to the microwave.',
            'Heat the glass in the microwave.',
            'Turn around and face the counter.',
            'Place the glass in the left top cabinet.'
        ],
        'predict': [
            'Walk to the sink.',
            'Pick up the glass from the sink.',
            'Go to the microwave.',
            'Heat the glass.',
            'Walk to the counter.',
            'Put the glass in the cabinet.'
        ]
    },
}

# Calculate the similarity score with a custom alpha value
# alpha controls the balance between goal-wise and sentence-wise similarity
score = pg2s_score(plans, alpha=0.7)
print(f"PG2S Score: {score}")
```

Parameters:

- `plans`: Dictionary containing tasks with ground truth and predicted action sequences
- `alpha`: Hyperparameter (default: 0.5) that balances:
  - Goal-wise similarity
  - Sentence-wise similarity
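As an illustration of how `alpha` weights the two components, the balancing can be sketched as a convex combination. This is not the PG2S implementation itself, only the weighting scheme it describes:

```python
def weighted_score(goal_similarity, sentence_similarity, alpha=0.5):
    """Blend the two similarity terms: alpha weights goal-wise
    similarity, (1 - alpha) weights sentence-wise similarity."""
    return alpha * goal_similarity + (1 - alpha) * sentence_similarity

# alpha = 1.0 -> only goal-wise similarity contributes
# alpha = 0.0 -> only sentence-wise similarity contributes
```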
For more information, see the PG2S repository.