feat: surface invalid trajectory files via status -v and trail doctor#26
Conversation
Reconcile silently skipped files that failed to load, leaving users with no path to identify or repair them. ReconcileSummary now carries per-file failures (path + reason + first validation error). The new `trail doctor` command lists them and `--quarantine` moves them into .trajectories/invalid/ so reconcile stops scanning. `trail status -v` prints the same diagnostic inline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR adds per-file reconcile failure tracking and caching in storage, a quarantine routine to move non-IO invalid trajectories, a ChangesDiagnostic Failure Tracking and Quarantine System
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1ba2391d73
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const dest = join(targetDir, basename(failure.path)); | ||
| try { | ||
| await rename(failure.path, dest); |
There was a problem hiding this comment.
Avoid overwriting quarantined files with matching names
When invalid files from different scanned directories share the same basename (for example active/traj.json and completed/2026-04/traj.json, or two legacy monthly folders with the same malformed name), this destination collapses them into one .trajectories/invalid/<basename> path. rename will replace an existing file at that path on typical filesystems, so trail doctor --quarantine can silently lose one of the invalid trajectory files instead of preserving both for inspection. Use a collision-safe destination or preserve the relative path under invalid/.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/storage/file.ts`:
- Around line 382-386: The quarantine loop using basename(failure.path) before
rename() can cause collisions when files from different source directories share
names (e.g., failure objects in candidates), risking silent overwrite; modify
the logic in the loop that builds dest (inside the for..of over candidates) to
ensure uniqueness by adding a disambiguating component (for example include a
sanitized directory prefix, a timestamp/UUID, or incremental suffix) or check
for existence and choose a new name before calling rename(failure.path, dest),
and update moved.push(failure) only after a successful, non-colliding rename;
ensure you reference the same symbols (candidates, failure.path, targetDir,
basename, rename, moved) when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: e57ea9a5-3c29-4601-adf3-b71cc04cb2b1
📒 Files selected for processing (4)
src/cli/commands/doctor.tssrc/cli/commands/index.tssrc/cli/commands/status.tssrc/storage/file.ts
| await mkdir(targetDir, { recursive: true }); | ||
| const moved: ReconcileFailure[] = []; | ||
| for (const failure of candidates) { | ||
| const dest = join(targetDir, basename(failure.path)); |
There was a problem hiding this comment.
🟡 Quarantine overwrites files when two failing files share the same basename
quarantineInvalid() uses basename(failure.path) to construct the destination path in the invalid/ directory. If two failing files from different source directories share the same basename (e.g., active/traj_abc.json and completed/2024-01/traj_abc.json), the second rename() at line 385 will silently overwrite the first file already moved into invalid/. This is data loss, contradicting the explicit "never silently lose a file" comment on line 388. The scenario is possible when the same trajectory file exists in both active/ and a completed/ subdirectory and both copies are malformed or schema-invalid.
Prompt for agents
In quarantineInvalid(), the destination path is computed as join(targetDir, basename(failure.path)), which can collide when two failing files from different directories share the same basename. To prevent silent data loss, either: (1) derive a unique destination name that encodes the source subdirectory (e.g., replace path separators with underscores), (2) check if the destination already exists and append a numeric suffix, or (3) use the full relative path from trajectoriesDir as the destination, recreating the subdirectory structure under invalid/. Option 3 is cleanest: replace basename(failure.path) with the relative path from this.trajectoriesDir, and mkdir the parent under targetDir before renaming.
Was this helpful? React with 👍 or 👎 to provide feedback.
Codex, CodeRabbit, and Devin all flagged the same collision risk: basename() collapses two invalid files that share a name across active/ and completed/ onto the same invalid/ path, and rename() silently overwrites the first one. quarantineInvalid now mirrors the relative path under invalid/ so both copies survive, with a numeric suffix fallback if the destination still collides with a previous quarantine run. Also formats package.json per biome to unblock the lint check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/storage/reconcile-real-data.test.ts`:
- Around line 237-240: The test currently only asserts the original file was
removed from active/ using the variable activeAfter; add a corresponding check
for completed/2026-04 by reading its directory (e.g., a new completedAfter
variable via readdir(join(trajRoot, "completed", "2026-04"))) and assert
completedAfter does not contain "traj_dup00000_0000.json" so both active and
completed removals are verified in tests/storage/reconcile-real-data.test.ts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 57489f0f-e86e-4134-bd83-007c2bd79bd3
📒 Files selected for processing (3)
package.jsonsrc/storage/file.tstests/storage/reconcile-real-data.test.ts
✅ Files skipped from review due to trivial changes (1)
- package.json
🚧 Files skipped from review as they are similar to previous changes (1)
- src/storage/file.ts
| // And the originals must be gone from active/ and completed/. | ||
| const activeAfter = await readdir(join(trajRoot, "active")); | ||
| expect(activeAfter).not.toContain("traj_dup00000_0000.json"); | ||
| }); |
There was a problem hiding this comment.
Assert source removal in completed/2026-04 as well
The intent says both originals are removed, but only active/ is checked (Line 238). Please also assert that completed/2026-04/traj_dup00000_0000.json is gone.
Patch suggestion
// And the originals must be gone from active/ and completed/.
const activeAfter = await readdir(join(trajRoot, "active"));
expect(activeAfter).not.toContain("traj_dup00000_0000.json");
+ const completedAfter = await readdir(join(trajRoot, "completed", "2026-04"));
+ expect(completedAfter).not.toContain("traj_dup00000_0000.json");
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // And the originals must be gone from active/ and completed/. | |
| const activeAfter = await readdir(join(trajRoot, "active")); | |
| expect(activeAfter).not.toContain("traj_dup00000_0000.json"); | |
| }); | |
| // And the originals must be gone from active/ and completed/. | |
| const activeAfter = await readdir(join(trajRoot, "active")); | |
| expect(activeAfter).not.toContain("traj_dup00000_0000.json"); | |
| const completedAfter = await readdir(join(trajRoot, "completed", "2026-04")); | |
| expect(completedAfter).not.toContain("traj_dup00000_0000.json"); | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/storage/reconcile-real-data.test.ts` around lines 237 - 240, The test
currently only asserts the original file was removed from active/ using the
variable activeAfter; add a corresponding check for completed/2026-04 by reading
its directory (e.g., a new completedAfter variable via readdir(join(trajRoot,
"completed", "2026-04"))) and assert completedAfter does not contain
"traj_dup00000_0000.json" so both active and completed removals are verified in
tests/storage/reconcile-real-data.test.ts.
Summary
ReconcileSummarynow carries afailures: ReconcileFailure[]array (path + reason + first validation error, pre-rendered).trail doctorcommand lists invalid trajectories and--quarantinemoves them into.trajectories/invalid/so reconcile stops scanning them. Only parse/schema failures are quarantined; transientio_errorfailures are left in place.trail status -v/--verboseprints the same diagnostic inline so the existing `reconciled X/Y, invalid: N` warning has an actionable next step.Test plan
🤖 Generated with Claude Code