143 commits
9326c83
sparsedrive baseline
sandropapais Feb 12, 2026
0db2dd9
added detection anchor propagation based on motion prediction
sandropapais Feb 18, 2026
149e940
update git ignore
sandropapais Feb 18, 2026
d4a2489
Merge sd_anchorprop into sd
sandropapais Feb 18, 2026
ff18c99
reverted anchor_prop changes for baseline model
sandropapais Feb 18, 2026
2273e28
added partial occ mask
sandropapais Feb 19, 2026
a9d6870
Added sparse4d configs
sandropapais Feb 19, 2026
764cfb7
added test logging
sandropapais Feb 19, 2026
ce994af
added multiple prediction refinements experiment
sandropapais Feb 19, 2026
9acbaac
removed unused configs
sandropapais Feb 20, 2026
8965682
fixed bug with loading results in eval
sandropapais Feb 20, 2026
e890261
remove breakpoint
sandropapais Feb 20, 2026
e063251
fix typo
sandropapais Feb 20, 2026
8117b80
partial occluded evaluator using existing labels
sandropapais Feb 20, 2026
12f7812
added occlusion counter printouts
sandropapais Feb 21, 2026
a8f9822
added config
sandropapais Feb 22, 2026
6901121
flash attention patch for sparse4d
sandropapais Feb 22, 2026
5075710
updated sparse4d configs
sandropapais Feb 22, 2026
589ee0c
fixed occluded mask use_valid_flag interaction
sandropapais Feb 22, 2026
fd9a8fa
fixed denominator for occluded/obj_box_col metric
sandropapais Feb 22, 2026
9f8b0a4
added val/all metrics
sandropapais Feb 22, 2026
59d90b2
added rotaug
sandropapais Feb 23, 2026
97fec71
updated occp configs
sandropapais Feb 23, 2026
2d5de23
removed old config
sandropapais Feb 23, 2026
1e6eb73
added dn config
sandropapais Feb 23, 2026
cda8426
config typo
sandropapais Feb 23, 2026
9c5a489
initial pred only setup
sandropapais Feb 23, 2026
ce0755b
freeze det head
sandropapais Feb 23, 2026
f53d824
Added CTRA motion model and fixed eval
sandropapais Feb 24, 2026
7fc5dea
fix ca and ctr
sandropapais Feb 24, 2026
4e2d4ea
bug fix ca
sandropapais Feb 24, 2026
15f9514
fixed ca and ctr bug
sandropapais Feb 24, 2026
34e3d5f
added deformable model
sandropapais Feb 24, 2026
b923ae8
added traj refinement
sandropapais Feb 24, 2026
cf8ee7c
added rand init stage2 pred
sandropapais Feb 24, 2026
e3c99b3
added brier_FDE and top1_FDE metrics
sandropapais Feb 24, 2026
79df1d2
Merge branch 'sd_predonly' into sd
sandropapais Feb 24, 2026
9e9f277
initial sephead design
sandropapais Feb 24, 2026
083f4df
find unused
sandropapais Feb 24, 2026
dfbcd89
separated params fix
sandropapais Feb 24, 2026
280d4fe
always enable first frame
sandropapais Feb 24, 2026
377203e
remove ffn
sandropapais Feb 24, 2026
445f40f
removed refine layer
sandropapais Feb 24, 2026
8c2be37
added refiner
sandropapais Feb 24, 2026
e6e5f40
aux loss for sep head
sandropapais Feb 25, 2026
1c1a530
updated cc_run
sandropapais Feb 25, 2026
1ec07ea
fix for first sample
sandropapais Feb 25, 2026
c77aaaf
update
sandropapais Feb 25, 2026
e40310a
Merge pull request #3 from TRAILab/sd_evaloccp
sandropapais Feb 25, 2026
ed583e2
Merge pull request #4 from TRAILab/sd
sandropapais Feb 25, 2026
1b59339
added full occ data gen and viz
sandropapais Feb 25, 2026
831e496
Merge branch 'sd' into sd_sephead
sandropapais Feb 25, 2026
8c88b36
Merge branch 'sd_sephead' of github.com:TRAILab/ForeSight into sd_sep…
sandropapais Feb 25, 2026
607cd66
new configs
sandropapais Feb 25, 2026
50799a1
typo fix
sandropapais Feb 25, 2026
9727f28
new config
sandropapais Feb 25, 2026
77b5fa6
fixed occluded eval to filter visible preds
sandropapais Feb 26, 2026
66520b7
added visibility head
sandropapais Feb 26, 2026
81657c0
fixed vis head issue
sandropapais Feb 26, 2026
d4691d1
cache fix
sandropapais Feb 26, 2026
7caae13
fix map compatibility with vis head
sandropapais Feb 26, 2026
19c4c7b
remove cache failed fix
sandropapais Feb 26, 2026
f5056b3
fix vis head
sandropapais Feb 26, 2026
9aae31a
typo fix
sandropapais Feb 26, 2026
236416c
added val/vis/ metrics
sandropapais Feb 26, 2026
761b643
fixed eval bug with filter too aggressive
sandropapais Feb 27, 2026
110c56e
updated cc runfile
sandropapais Feb 27, 2026
5994e26
debug setup
sandropapais Feb 27, 2026
85de643
data gen notes
sandropapais Mar 4, 2026
9c88ad5
new config
sandropapais Mar 4, 2026
0e6e7fc
fixed typo
sandropapais Mar 4, 2026
0bae2ca
Merge branch 'sd_sephead' into sd
sandropapais Mar 5, 2026
c83870c
updated occf configs
sandropapais Mar 5, 2026
7d5e76a
typo fix
sandropapais Mar 5, 2026
9b1f8ff
Merge branch 'sd' of github.com:TRAILab/ForeSight into sd
sandropapais Mar 5, 2026
c675e42
new sephead config
sandropapais Mar 5, 2026
25183e3
new config
sandropapais Mar 5, 2026
00b17ef
add ffn to sephead
sandropapais Mar 5, 2026
c44cfb7
added quality_estimation to sephead
sandropapais Mar 5, 2026
48d9a69
added sephead ffn
sandropapais Mar 5, 2026
e9c0095
new dataset converter
sandropapais Mar 5, 2026
dc2c89d
fixed gt_visibility range filtering
sandropapais Mar 5, 2026
9314bc6
removed debug configs
sandropapais Mar 5, 2026
880af8e
new sephead implementation with single frame decoder isolation and vi…
sandropapais Mar 5, 2026
9915329
new pretrained checkpoint config
sandropapais Mar 5, 2026
35e1dfc
config update
sandropapais Mar 5, 2026
3c4eede
added occeval config
sandropapais Mar 5, 2026
b2c3d3c
added adaptive mAPocc
sandropapais Mar 5, 2026
4a885f0
fixed vishead configs loss
sandropapais Mar 5, 2026
9de9d3c
fix vis loss
sandropapais Mar 5, 2026
d843c81
visloss error
sandropapais Mar 5, 2026
c6fb0a2
update configs
sandropapais Mar 8, 2026
230f2ee
updated configs
sandropapais Mar 10, 2026
9b35b67
update configs
sandropapais Mar 10, 2026
7d35824
removed unused configs
sandropapais Mar 11, 2026
7ac0de2
added r101 s2 config
sandropapais Mar 11, 2026
4627a91
added new predonly configs
sandropapais Mar 11, 2026
c12d15f
updated r101 configs
sandropapais Mar 12, 2026
66c51d0
fixed config typos
sandropapais Mar 13, 2026
318d834
typo fix
sandropapais Mar 13, 2026
ebd435a
new config
sandropapais Mar 13, 2026
6dd248a
update predonly configs
sandropapais Mar 13, 2026
dbbb847
removed planning detach
sandropapais Mar 14, 2026
3567075
add config
sandropapais Mar 14, 2026
dcb1d32
detach fix
sandropapais Mar 15, 2026
ef63f7f
remove refine3 configs and revert associated code changes
sandropapais Mar 18, 2026
33b7c49
fix for occ eval without motion/planning
sandropapais Mar 18, 2026
05d77b6
autoresearch mar25 exp-001: plan_loss_up
sandropapais Mar 25, 2026
b148638
autoresearch mar25: pre-stage exp002 and exp003 configs
sandropapais Mar 25, 2026
32d4489
autoresearch mar25: disable WandB hook in all exp configs
sandropapais Mar 25, 2026
9699bc0
autoresearch mar25: restore WandB hook in all exp configs
sandropapais Mar 25, 2026
3c5c271
fix: source ~/.bashrc in dgx_run.sh for WANDB_API_KEY
Mar 25, 2026
dc755f0
autoresearchv1
sandropapais Mar 25, 2026
5446891
autoresearchv1
sandropapais Mar 25, 2026
4be2835
autoresearch mar25 exp-001: log results (keep)
sandropapais Mar 26, 2026
d2d1217
autoresearch mar25 exp-002: log results (discard) + stage exp004
sandropapais Mar 26, 2026
a1fecb2
updated autoresearch files
sandropapais Mar 26, 2026
7aa5406
autoresearch mar25 exp-003: log results (keep, new best L2)
sandropapais Mar 26, 2026
6f994b4
autoresearch mar25 exp-004: log results (keep, new best col) + exp005…
sandropapais Mar 26, 2026
74b5c85
autoresearch mar25 exp-005: log results (discard) + Conclusions
sandropapais Mar 26, 2026
266b920
autoresearch mar25: consolidate research files into autoresearch/
sandropapais Mar 26, 2026
21572f6
merge autoresearch/mar25: 5 experiments, queue_length=6 best L2 (0.57…
sandropapais Mar 26, 2026
c6ded2b
add sparsedrive_r50_stage2_4gpu_nomap_queue6 config
sandropapais Mar 26, 2026
f091551
autoresearch: always run exp000 baseline + fix sbatch template + add …
sandropapais Mar 26, 2026
a1998d9
autoresearch mar26 exp-000: baseline
sandropapais Mar 26, 2026
1e50489
autoresearch mar26: pre-create exp001-005 configs
sandropapais Mar 26, 2026
e6e8521
autoresearch mar26 exp-000: log baseline results
sandropapais Mar 27, 2026
06d245d
autoresearch mar26 exp-001: log results (keep)
sandropapais Mar 27, 2026
bd7a91a
autoresearch mar26 exp-002: log results (keep)
sandropapais Mar 27, 2026
49995e7
autoresearch mar26 exp-003: log results (keep) + update exp005
sandropapais Mar 27, 2026
67278f0
autoresearch mar26 exp-004: log results (discard)
sandropapais Mar 27, 2026
ae7f902
autoresearch mar26: final session summary + research_review update
sandropapais Mar 27, 2026
cca6623
autoresearch mar27: init session + exp000-005 configs
sandropapais Mar 28, 2026
eb576cb
autoresearch mar27 exp-000: log baseline results
sandropapais Mar 28, 2026
7fafeff
autoresearch mar27 exp-001: log results (keep)
sandropapais Mar 28, 2026
130d246
autoresearch mar27 exp-002: log results (discard)
sandropapais Mar 29, 2026
0610b9c
fix cross_gnn guard when with_map=False in motion_planning_head
sandropapais Mar 29, 2026
e6643ad
autoresearch mar27 exp-003: add find_unused_parameters=True for nomap…
sandropapais Mar 29, 2026
c70a50e
autoresearch mar27 exp-003: log crash (×3) + pivot to plan_loss_up only
sandropapais Mar 29, 2026
36eb2f5
autoresearch mar27 exp-003b: log results (discard)
sandropapais Mar 29, 2026
ab3d42d
autoresearch mar27 exp-004: log results (discard) + update exp005
sandropapais Mar 29, 2026
35a947b
autoresearch mar27: final session summary + research_review update
sandropapais Mar 30, 2026
1eaa11b
update notes
sandropapais Mar 31, 2026
235 changes: 235 additions & 0 deletions .claude/commands/autoresearch.md
@@ -0,0 +1,235 @@
You are running an autonomous ML research loop for the ForeSight autonomous driving project.

## Setup

Arguments: $ARGUMENTS
Parse:
- `--goal` (required): research objective
- `--base-config` (default: `projects/configs/sparsedrive_r50_stage2_4gpu.py`)
- `--max-experiments` (default: 5)
- `--poll` (default: 30m): how often to check job status (e.g. `30m`, `1h`)
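
The `--poll` value can be turned into seconds with a small helper. This is an illustrative sketch, not part of the tooling; it assumes only the `m`/`h` suffixes shown above are ever used:

```python
import re

def parse_poll(value: str) -> int:
    """Convert a poll interval like '30m' or '1h' to seconds.

    Hypothetical helper; assumes only minute/hour suffixes.
    """
    m = re.fullmatch(r"(\d+)([mh])", value.strip())
    if not m:
        raise ValueError(f"unrecognized poll interval: {value!r}")
    count, unit = int(m.group(1)), m.group(2)
    return count * (60 if unit == "m" else 3600)

# parse_poll("30m") -> 1800; parse_poll("1h") -> 3600
```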

**Agree on a run tag** based on today's date (e.g. `mar25`). The branch `autoresearch/<tag>` must not already exist.

```bash
git checkout -b autoresearch/<tag>
```

Read the base config for full context before proposing anything:
```bash
head -90 <base-config> && echo "---" && tail -50 <base-config>
```

**Initialize `results.tsv`** if it doesn't exist (tab-separated, NOT comma-separated):
```
commit val_L2 val_col% car_ade NDS status description
```

**Run the baseline** — always submit the base config as exp-000 before any experiments. This gives a reproducible reference on the same hardware and code version.

Config stem: `auto_<tag>_exp000_baseline`

Create the config (copy base config, only update the WandB name):
```python
# === autoresearch overrides (auto_<tag>_exp000_baseline) ===
log_config['hooks'][1]['init_kwargs']['name'] = 'auto_<tag>_exp000_baseline'
```

Submit and wait for it to finish exactly as in Steps 3–6 below. Record results as the first `results.tsv` row with status `baseline`. This run does NOT count against `--max-experiments`.

**Initialize `research_log.md`** if it doesn't exist. If it already exists, read it to catch up on prior experiments before proposing.

**Read the project research review** to understand the full experimental landscape, confirmed wins, and known-bad ideas before proposing anything:
```bash
cat autoresearch/research_review.md
```
Pay particular attention to:
- **Section 3** (Experimental Findings): what has already been tried and what the results were — do NOT re-run experiments that are already documented here
- **Section 7** (Prioritized Action Plan): the "Confirmed Wins" table and "Negative/Null Evidence" table — use the confirmed wins as a starting point and never propose ideas from the null/negative list
- **Section 9** (Open Questions): unresolved questions worth answering

## Architecture
- SparseDrive: ResNet → FPN → SparseDriveHead (detection + map + motion/planning)
- Configs are Python files exec()'d by mmdet3d — appending lines at the end overrides earlier values
- Training: 4 GPUs on DGX server (ssh host: `trail_dgx`), repo at `/raid/home/spapais/ForeSight`
- Each experiment takes ~4 hours
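
The append-to-override mechanic can be seen in miniature. This self-contained sketch (not ForeSight code) shows why lines appended at the end of an exec()'d config replace earlier values:

```python
# Configs are plain Python, so a later assignment simply rebinds the name
# and a later dict mutation edits the earlier dict in place.
base = "queue_length = 4\noptimizer = dict(lr=3e-4)\n"
override = (
    "\n# === autoresearch overrides ===\n"
    "queue_length = 6\n"
    "optimizer['lr'] = 2e-4\n"
)

cfg = {}
exec(base + override, cfg)
# Later lines win: cfg['queue_length'] is now 6, cfg['optimizer']['lr'] is 2e-4.
```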

## Key Metrics (nuScenes val)
- **L2**: ego planning L2 error in meters (lower = better) ← primary metric
- **obj_box_col**: planning collision rate % (lower = better) ← primary metric
- **car_ade / ped_ade**: agent motion ADE in meters (lower = better)
- **car_epa / ped_epa**: motion end-point accuracy (higher = better)
- **NDS**: nuScenes detection score (higher = better)
- **mAP**: detection mean AP (higher = better)
- **mAP_normal**: map prediction mAP (higher = better)

## Tunable Parameters

**Top-level variables** (simple reassignment appended to config):
```python
num_decoder = 6 # transformer decoder layers (2–8)
num_single_frame_decoder = 1 # decoder layers run without temporal fusion
embed_dims = 256 # feature embedding dim (128/256)
num_groups = 8 # attention heads
drop_out = 0.1 # dropout (0–0.3)
num_epochs = 10
queue_length = 4 # temporal history frames (1–6)
fut_ts = 12 # motion future timesteps
ego_fut_ts = 6 # planning future timesteps
temporal = True
decouple_attn_motion = True
```

**Nested params** (dict mutation appended to config):
```python
optimizer['lr'] = 3e-4
model['depth_branch']['loss_weight'] = 0.2
model['head']['motion_plan_head']['motion_loss_cls']['loss_weight'] = 0.2
model['head']['motion_plan_head']['motion_loss_reg']['loss_weight'] = 0.2
model['head']['motion_plan_head']['plan_loss_cls']['loss_weight'] = 0.5
model['head']['motion_plan_head']['plan_loss_reg']['loss_weight'] = 1.0
model['head']['motion_plan_head']['plan_loss_status']['loss_weight'] = 1.0
model['head']['det_head']['loss_cls']['loss_weight'] = 2.0
model['head']['det_head']['loss_reg']['loss_box']['loss_weight'] = 0.25
```

## Experiment Loop

**NEVER STOP.** Once the loop begins, do NOT pause to ask whether to continue. The user may be away or asleep and expects you to run until manually stopped or max-experiments is reached. If you run out of obvious ideas, think harder — re-read prior results, try combining near-misses, try more radical changes. Keep going.

Each iteration:

### Step 1 — Propose
Based on the goal and all prior results in `research_log.md`, `results.tsv`, and `autoresearch/research_review.md`, decide what to change. Test ONE hypothesis per experiment (1–3 parameter changes). Explicitly state:
- What you're changing
- Why (what mechanism should improve the metric)
- What improvement you expect
- That this has NOT already been tried (cross-check `autoresearch/research_review.md` Section 3 and `research_log.md`)

**Hard constraints from prior work (never propose these — results are already in):**
- Do NOT remove map from both stages — planning catastrophically fails (L2: 0.600→6.61)
- Do NOT use pretrainv3/v4-style prediction pretraining — degrades all metrics
- Do NOT use separate head (sephead) — slightly worse across the board
- Do NOT reduce map learning rate — map_mAP collapses to ~0.07
- Do NOT add map head to stage2 when loaded from DN stage1 pretrain — L2 worsens to 0.700
- Do NOT increase motion_loss_reg or motion_loss_cls above 0.2 — large regression on both L2 and obj_box_col (mar25 exp002)
- Do NOT use rotation augmentation (rot3d_range) — hurts both L2 and obj_box_col (prior work)

### Step 2 — Create config
Config stem format: `auto_<tag>_exp{NNN}_{short_suffix}` (suffix: alphanumeric+underscore, ≤20 chars)
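
A quick check can enforce the stem format before creating files. The regex below is a hypothetical validator matching the rules stated above (three-digit exp number, alphanumeric+underscore suffix of at most 20 chars), not something the pipeline runs:

```python
import re

# auto_<tag>_exp{NNN}_{short_suffix}: tag is lowercase alphanumeric,
# NNN is exactly three digits, suffix is [A-Za-z0-9_]{1,20}.
STEM_RE = re.compile(r"^auto_[a-z0-9]+_exp\d{3}_[A-Za-z0-9_]{1,20}$")

def valid_stem(stem: str) -> bool:
    return STEM_RE.fullmatch(stem) is not None
```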

```bash
git checkout autoresearch/<tag> # ensure we're on the session branch
```

Use the **Write tool** to create `projects/configs/<config_stem>.py`:
- Copy the full base config content
- Append at the end:
```python

# === autoresearch overrides (<config_stem>) ===
log_config['hooks'][1]['init_kwargs']['name'] = '<config_stem>'
<your parameter changes>
```

### Step 3 — Commit, push, sync DGX
```bash
git add projects/configs/<config_stem>.py
git commit -m "autoresearch <tag> exp-NNN: <suffix>

<reason>"
git push -u origin autoresearch/<tag>
ssh trail_dgx "cd /raid/home/spapais/ForeSight && git fetch origin autoresearch/<tag> && git checkout autoresearch/<tag>"
```

### Step 4 — Submit
```bash
ssh trail_dgx "cd /raid/home/spapais/ForeSight && sbatch --export=ALL,WANDB_API_KEY=1cb0a37040ca089569cecda1c31722a24d56d3a4 scripts/dgx_run.sh bash ./tools/dist_train.sh projects/configs/<config_stem>.py 4 --deterministic"
```
Parse job ID from `Submitted batch job <ID>`. Record it immediately in `research_log.md`.
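
Extracting the job ID is a one-line regex. A sketch, assuming the standard `sbatch` acknowledgment format:

```python
import re

def parse_job_id(sbatch_output: str) -> str:
    """Pull the numeric ID out of 'Submitted batch job <ID>'."""
    m = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if not m:
        raise RuntimeError(f"no job ID found in: {sbatch_output!r}")
    return m.group(1)
```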

### Step 5 — Wait
Use the `/loop` skill to poll for job completion at the `--poll` interval:
```
/loop <poll> Check if SLURM job <JOB_ID> is done: ssh trail_dgx "squeue -j <JOB_ID> -h -o %T 2>/dev/null" — if the output is empty the job has finished; when done, continue the autoresearch loop by parsing metrics for config <config_stem> (tag <tag>, exp-NNN)
```
The loop will wake Claude every `--poll` interval. When the job leaves the queue, Claude continues automatically to Step 6.

### Step 6 — Parse metrics
```bash
LOG=$(ssh trail_dgx "ls -t /raid/home/spapais/ForeSight/work_dirs/<config_stem>/*.log 2>/dev/null | head -1")
ssh trail_dgx "cat $LOG" | grep -E "NDS|mAP|ade=|epa=|L2|obj_box_col|mAP_normal" | tail -30
```
If no work_dir log, fall back to SLURM log: `/raid/home/spapais/ForeSight/logs/foresight-<JOB_ID>.log`
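
The grep'd lines can be reduced to a metrics dict with a small parser. The exact log format is an assumption here (metric name followed by `:` or `=` and a number); adjust the pattern to the real training output, and keep the last occurrence since later lines come from later epochs:

```python
import re

METRIC_RE = re.compile(r"(L2|obj_box_col|car_ade|NDS|mAP_normal)\s*[:=]\s*([0-9.]+)")

def parse_metrics(log_text: str) -> dict:
    """Return the LAST value seen for each known metric name."""
    metrics = {}
    for name, value in METRIC_RE.findall(log_text):
        metrics[name] = float(value)  # later matches overwrite earlier ones
    return metrics
```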

### Step 7 — Log results

**Determine status:**
- `keep` — primary metrics (L2, obj_box_col) improved vs best so far
- `discard` — no improvement (but keep the branch — 4hr runs are worth recording)
- `crash` — job failed or no metrics found
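
The status rules above can be sketched as a helper. The tie-breaking is a judgment call; here "improved" is taken to mean neither primary metric got worse and at least one got strictly better, which is an assumption rather than a fixed rule:

```python
def decide_status(metrics: dict, best: dict) -> str:
    """Map a finished run to keep/discard/crash per the rules above."""
    if not metrics:  # no metrics parsed => job failed
        return "crash"
    l2_ok = metrics["L2"] <= best["L2"]
    col_ok = metrics["obj_box_col"] <= best["obj_box_col"]
    strictly_better = (metrics["L2"] < best["L2"]
                       or metrics["obj_box_col"] < best["obj_box_col"])
    return "keep" if (l2_ok and col_ok and strictly_better) else "discard"
```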

**Append to `results.tsv`** (tab-separated):
```
<short-commit> <L2> <col%> <car_ade> <NDS> <status> <short description>
```
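
Appending a row is trivial but worth doing with an explicit `'\t'.join`, since the file must be tab-separated and comma-separating is the easy mistake. A sketch with an illustrative row, not real results:

```python
# Example values only; real rows come from the parsed metrics.
row = ["a1fecb2", "0.58", "0.09", "0.61", "0.49", "keep", "queue_length=6"]
with open("results.tsv", "a") as f:
    f.write("\t".join(row) + "\n")  # tabs, never commas
```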

**Append to `research_log.md`:**
```markdown
## [exp-NNN] <config_stem> — <date>
**Hypothesis:** <what and why>
**Config changes:**
\`\`\`python
<appended lines>
\`\`\`
**Job ID:** <id>
**Status:** keep / discard / crash
**Metrics:**
| Metric | Baseline | Best so far | This exp | Δ vs best |
|--------|----------|-------------|----------|-----------|
| L2 | x.xx | x.xx | x.xx | ↓/↑ |
| obj_box_col | x.xx | x.xx | x.xx | ↓/↑ |
| car_ade | x.xx | x.xx | x.xx | ↓/↑ |
| NDS | x.xx | x.xx | x.xx | ↓/↑ |
**Analysis:** <what this tells us, what to try next>
---
```

**Update `autoresearch/research_review.md`** — lightweight per-experiment update:
1. Append a new row to the **Section 3.11 summary table** with this experiment's key metrics.
2. If status is `keep` (new best), also update the **Section 7 "Best Current Recipe"** block to reflect the new leading config.
3. If the result reveals a new hard constraint (something that clearly hurts), add it to the **Section 7 "Negative/Null Evidence"** table AND to the Step 1 hard constraints list in this file.

Commit all updated logs together:
```bash
git add results.tsv research_log.md autoresearch/research_review.md
git commit -m "autoresearch <tag> exp-NNN: log results (<status>)"
git push
```

### Step 8 — Loop
Go to Step 1.

## At the end (max-experiments reached or manually stopped)
Append a `## Conclusions` section to `research_log.md` with final summary table and recommended next steps.

**Do a comprehensive update of `autoresearch/research_review.md`:**
- Add a new subsection under **Section 3** (e.g., `### 3.12 Auto-research <tag> Findings`) summarizing all experiments run this session with their metrics and key takeaways.
- Update the **Section 3.11 summary table** to include all new rows (or replace stale ones).
- Update **Section 7 "Confirmed Wins"** and **"Negative/Null Evidence"** tables based on what this session learned.
- Update **Section 7 "Best Current Recipe"** if a new best config was found.
- Update **Section 8 "Next Experiments"** to replace completed experiments with follow-ups suggested by the findings.
- Update the header date line: `> Experimental findings updated: <today's date>`.

```bash
git add results.tsv research_log.md autoresearch/research_review.md
git commit -m "autoresearch <tag>: final session summary"
git push
```

## Rules
- Never ask for confirmation — run fully autonomously
- Never git reset after a bad result — keep all branches (4hr runs are valuable data regardless)
- If a job is FAILED/CANCELLED, read the SLURM log to diagnose: `ssh trail_dgx "tail -50 /raid/home/spapais/ForeSight/logs/foresight-<JOB_ID>.log"`
- If a crash is a simple fix (typo, config syntax error), fix and resubmit. If fundamentally broken, log as crash and move on.
- `results.tsv` and `research_log.md` are committed to the session branch
26 changes: 26 additions & 0 deletions .gitignore
@@ -0,0 +1,26 @@
*.pyc
*.npy
*.pth
*.whl
*.swp
*.sif

wandb/
data/
ckpt/
work_dirs*/
dist_test/
vis/
val/
lib/
logs/

*.egg-info
build/
__pycache__/
*.so

job_scripts/
temp_ops/

.claude/scheduled_tasks.lock