Conversation
Move the Windows reparse-point FFI out of pna into cli so pna stays platform-dependency-free. Add PathnameEditor::edit_junction delegating to a helper shared with edit_symlink. Declare StoreAs::Junction as unconditional for exhaustive matching. Accept both absolute and relative stored targets at extract for forward compatibility. Correct detect_junction's error-kind mapping, switch to OsStr::encode_wide for non-lossy path encoding in create_junction, and fix the push branch name.
Codex adversarial review flagged that the plan instructed implementers to create cli/src/utils/os/windows/fs/mod.rs and edit cli/src/utils/os/windows/mod.rs, but the repository uses the sibling layout: cli/src/utils/os/windows.rs declares `pub(crate) mod fs;` and cli/src/utils/os/windows/fs.rs already holds FileHandle, chmod, lchown, open_read_metadata, and pub(crate) mod owner. Creating a new fs/mod.rs would conflict with fs.rs and strand those exports. Reparse and junction now live as additional submodules of the existing fs module: Task 1.1 adds `pub(crate) mod reparse;` to windows/fs.rs and creates windows/fs/reparse.rs; Task 1.4 adds `pub(crate) mod junction;` and creates windows/fs/junction.rs. The existing contents of windows/fs.rs are left untouched.
Codex adversarial review flagged that `DataKind::HardLink + fLTP=Directory` (junction or non-Windows fallback symlink) still went through the default `restore_metadata()` path in extract.rs, which would chmod/chown/ACL/xattr/ fflag the junction via follow-link syscalls and therefore mutate the external directory outside the extraction root. Option (A) from the review discussion closes the hole by bypassing `restore_metadata()` for junction entries and applying only no-follow timestamp restoration via a new `restore_link_timestamps_no_follow` helper. A `TODO: junction-aware no-follow metadata (deferred follow-up)` block on that helper and a new Out-of-Scope entry document the full option (C) path for a follow-up plan. Task 4.4 locks the invariant with a security regression test (T19) that pre-sets a distinctive mode on the external target, runs extract with `--keep-permission --allow-unsafe-links`, and asserts the external mode is unchanged. Self-review, risk summary, testing matrix and type-consistency sections are updated accordingly.
Codex adversarial review flagged that `detect_junction`'s `raw_os_error() == Some(4390)` check against ERROR_NOT_A_REPARSE_POINT would never fire because `read_reparse_point` and `create_junction` round-trip `windows::core::Error` through `io::Error::from_raw_os_error(e.code().0)`. The windows crate encodes Win32 failures as HRESULT via `HRESULT_FROM_WIN32` (0x80070000 | code), so the stored raw_os_error is the HRESULT form (e.g. 0x80071126), not the canonical Win32 code (4390). Introduce `io_error_from_win32` in the reparse module. It inspects the HRESULT facility bits and, when FACILITY_WIN32 is set, extracts the low 16 bits as the Win32 code; other HRESULTs pass through as raw i32. Every `CreateFileW` / `DeviceIoControl` error path in read_reparse_point and create_junction now goes through this helper, so downstream consumers such as `detect_junction` see Microsoft-documented Win32 codes. Adds two pure-Rust unit tests that verify the FACILITY_WIN32 extraction and the non-Win32 pass-through, so the helper is regression-fenced without needing a Windows CI round-trip.
Writing-plans skipped the brainstorming→spec step and produced a plan directly, with all design decisions iterated inside the plan through grill and Codex adversarial review rounds. This commits the spec retroactively so architectural intent lives separately from task-level steps. Captures locked decisions (HardLink+fLTP=Directory encoding, cli-only module placement, PathnameEditor::edit_junction with shared helper, Option A metadata safety with early return, HRESULT→Win32 error translation, non-UTF-8 invariant from UTF-16 gate) alongside the rejected alternatives (cases B'/C', pna-crate module placement, option B/C metadata approaches, direct edit_symlink reuse, HRESULT-form error matching) and the deferred follow-ups (relative-path optimization, full junction-aware no-follow metadata, public reparse-point API).
The spec `docs/superpowers/specs/2026-04-19-windows-junction-support-design.md` now owns the WHY — problem, architecture, interfaces, invariants, rejected alternatives, follow-ups. The plan keeps the HOW — Task 1.1 through Task 6.2 with concrete code snippets, commit commands, and verification steps. Removed duplicated sections from the plan (Goal, Architecture, Tech Stack narrative, full In-scope/Out-of-scope list, Risk Summary table, Testing Matrix detailed rows, File Structure tables) and replaced them with short cross-references plus plan-specific anchors: where each of T1-T19 lands, which task enforces which safety invariant, which files are touched. The Task List and its embedded implementation details are untouched.
ASCII lexicographic order places Win32_System_IO before Win32_System_Ioctl
('O' 0x4F < 'o' 0x6F). The plan originally placed them in the opposite
order, which the Task 1.1 code-quality review flagged as the only
sortable-by-ASCII anomaly in the list. Swap the two entries in both
Cargo.toml and the plan so the list is strictly sorted — the stable-diff
rationale the plan cites now actually holds.
No functional impact; feature ordering does not affect compilation.
The Task 1.1 code-quality review flagged that the IO_REPARSE_TAG_SYMLINK arm of parse_reparse_buffer shipped without direct unit coverage. Add three tests and the sample_symlink_buffer helper that exercises the symlink layout (with the extra 4-byte Flags word): one for an absolute target where \\??\\ is stripped, one for a relative target where the raw string is preserved, and one for a truncated buffer that should error with InvalidData before the flags word is read. Updates the plan's Task 1.1 Step 3 test module to mirror the final reparse.rs file so plan and implementation stay in sync.
The Task 1.2 code-quality review flagged three consistency gaps against the rest of the crate's Windows FFI: - Reuse `crate::utils::str::encode_wide`, which rejects embedded NUL bytes before null-terminating. The previous inline `encode_wide().collect() + push(0)` would let a path with an embedded NUL reach CreateFileW and open a silently-truncated different file. All nine sibling call sites in `cli/src/utils/os/windows/` already use this helper. - Use `PCWSTR::from_raw` instead of the tuple-struct constructor, again matching the nine sibling call sites. - Downgrade `read_reparse_point` from `pub` to `pub(crate)` to match the spec §5 commitment that these primitives stay CLI-private. Promotion into the `pna` crate behind a `windows-fs` feature is captured as an explicit follow-up in spec §11. Plan Task 1.2 Step 4 and spec §5 are updated to mirror the source so they don't drift again.
The Task 1.3 code-quality review flagged a silent `u16` truncation pitfall: the reparse-buffer header stores SubstituteName and PrintName byte lengths as `u16`, and `create_junction` was casting `(wide.len() * 2) as u16` plus computing `data_len = 8 + subst + print` as `u16` — a path exceeding ~32 KB of UTF-16 bytes would silently wrap and emit a malformed reparse buffer. The 16 KB reparse-buffer kernel cap acts as a backstop, but relying on that to backstop silent corruption is weaker than the helper's contract should be. Move the SubstituteName / PrintName construction and the u16 overflow check ahead of `std::fs::create_dir`, so an oversized target now returns `InvalidInput` at the helper boundary without ever leaving a half-created junction directory behind. Use `checked_add` for the 8-byte offset-header total so the check also catches the 32_760-plus-offset-header wrap case. Plan Task 1.3 Step 4 mirrors the reorder.
The Task 2.1 code-quality review flagged two stylistic gaps against the file's existing conventions: - The four new unit tests used an `edit_junction_*` prefix while every other test in `cli/src/command/core/path.rs` uses an `editor_*` prefix. Rename to `editor_junction_*` so the test list stays consistently grep-able. - The `edit_junction` doc comment captured the design rationale (why a separate method exists) but not the contract (substitutions applied; leading `/` and --strip-components NOT applied). Mirror the contract lines from `edit_symlink` so the two methods read as siblings, not as rationale + cross-reference.
Windows CI revealed that every `create_junction_round_trip` run was hitting `ERROR_INVALID_REPARSE_DATA` (4392) — the kernel rejected our IO_REPARSE_TAG_MOUNT_POINT buffer. Microsoft's REPARSE_DATA_BUFFER contract requires both SubstituteName and PrintName inside PathBuffer to be UTF-16 NUL-terminated, with the length fields excluding the NULs and PrintNameOffset set after SubstituteNameLength plus the NUL. Our writer omitted both terminators and pointed PrintNameOffset at `subst_bytes_len` (should be `subst_bytes_len + 2`). The parser never noticed because it slices by offset/length and never reads the NULs, so `read_reparse_point_on_junction` (which reads a `mklink /J`-built junction) passed all along while `create_junction_round_trip` failed. Append a u16 0x0000 after each name, bump `PrintNameOffset` by 2, and make `ReparseDataLength` account for the four extra NUL bytes. The u16-overflow guard now also checks the two extra NUL bytes so oversized targets are still caught before any filesystem mutation. Plan Task 1.3 Step 4 mirrors the fix.
The Task 3.1 code-quality review flagged that the `StoreAs::Junction` arm in `create_entry` built the `EntryReference` directly via `EntryReference::from_utf8_preserve_root(target.to_str().unwrap())`, bypassing the `PathnameEditor::edit_junction` method added in Task 2.1. The consequence was that user `-s` / `--transform` substitutions applied to symlink targets but silently NOT to junction targets, and `edit_junction` itself was dead code on every platform. Call `pathname_editor.edit_junction(target)` instead. The method applies substitutions and returns an `EntryReference` via the shared helper that `edit_symlink` also uses, so junction and symlink targets behave consistently under user transforms. Invariant I1 (spec §7) guarantees the target is valid UTF-8, so the helper's internal `to_string_lossy` is effectively lossless for real inputs. Updates spec §6 / §8 so the "load-bearing unwrap" paragraph is replaced with the `edit_junction` rationale, and plan Task 3.1 Step 7 mirrors the final code. Remove the now-unused `EntryReference` import from `cli/src/command/core.rs`.
Address all four Minor nits flagged by the Task 4.1 code-quality
review; none affect Invariant I2 correctness.
- Junction skip warning: drop the internal encoding term
"HardLink+fLTP=Directory" and match the peer symlink/hardlink voice
("If you need to extract it, use `--allow-unsafe-links`.").
- Non-Windows fallback warning: demote to `log::debug!`. The per-entry
`warn!` would spam on large Windows-origin archives extracted on
Unix; the degradation (junction → symlink) is a known platform
limitation, best surfaced at debug level.
- `create_junction_or_fallback` canonicalize: replace `unwrap_or` with
an explicit match that logs the underlying error at `debug!` level
before falling back to the raw join. Prevents permission-denied /
ELOOP failures from being hidden behind `create_junction`'s generic
"target must be absolute" message.
- Test pattern: switch `extract_junction_without_allow_unsafe_links_skips`
to the peer CLI test convention (`utils::setup()` + relative
`"{base}/{base}.pna"` paths + in-process `cli::Cli::try_parse_from`).
Drops the subprocess invocation and the macOS `canonicalize`
workaround, and aligns with existing peer tests in
`cli/tests/cli/extract/`. The stderr assertion is dropped because
the filesystem assertion ("link_dir does not exist") is the
load-bearing check.
Plan Task 4.1 Step 2 and Step 4 mirror the final code.
CI surfaced that `extract_junction_with_allow_unsafe_links_creates_link` panics under `wasm32-wasip1` with "Operation not permitted (os error 63)" because wasi-preview1 does not expose symlink creation. The fallback path `pna::fs::symlink` therefore fails when the test tries to materialize the non-Windows junction-fallback symlink. Add `#[cfg(not(target_family = "wasm"))]` to the test, matching the pattern established by `cli/tests/cli/extract/hardlink.rs:95` and peers which gate off WASM for the same reason. The skip-case test (`extract_junction_without_allow_unsafe_links_skips`) is not gated because it never reaches the symlink-creation path. Plan Task 4.2 Step 1 code block mirrors the gate.
Match the neighboring `extract_junction_with_allow_unsafe_links_creates_link` test's `let ft = meta.file_type();` hoist, avoiding three separate `file_type()` calls inside a single `assert!`. Pure style cleanup flagged by the Task 4.3 code-quality review.
The Task 4.4 spec-compliance review empirically proved the I2 security fence was decorative: a regression that swaps `restore_link_timestamps_no_follow` for the default `restore_metadata` call slipped through the `extract_junction_does_not_mutate_external_target` assertion. Root cause: `build_junction_fixture` produced a junction entry with no Permission chunk, so `restore_metadata`'s chmod/chown branch (gated on `item.metadata().permission().is_some()`) never entered — no follow-link syscall ever touched the target, no matter what the extract code did. Extend the fixture helper surface: - `build_junction_fixture_with_permission(target, mode)` stamps a Permission chunk with the given mode so extract's `restore_mode` actually fires when `--keep-permission` is passed. - `build_junction_fixture` continues to produce a Permission-less entry for the other two tests that don't exercise the mode path. - Both delegate to a shared `build_junction_fixture_with_optional_permission` to avoid duplication. Retarget the I2 test to `build_junction_fixture_with_permission(..., 0o755)` while still pre-setting the external directory to 0o700. Under a regression that removes Task 4.1's early return, extract now calls `chmod(link, 0o755)` which follows the symlink and changes the external target's mode to 0o755, firing `baseline_perms != after_perms` with the "I2 violation" message. Empirically verified by reverting Task 4.1's early return in a throwaway experiment: the fence fires with `left 0o040700, right 0o040755`. With the early return restored, the fence passes. Plan Task 4.4 Step 1 code block mirrors the new helper surface.
Windows CI surfaced that `extract_junction_does_not_mutate_external_target` panics with `MissingRequiredArgument: --unstable` on the `--no-default-features` build of the `extract` subcommand. On Windows, `--keep-permission` (and related metadata-restoration flags) is gated behind `--unstable` per `cli/src/command/extract.rs:149-150` (`#[cfg_attr(windows, arg(requires = "unstable", ...))]`). Add `#[cfg(windows)] "--unstable"` to the test's CLI args, matching the peer pattern in `cli/tests/cli/keep_all.rs` and `cli/tests/cli/diff/metadata.rs`. Plan mirrored.
The extract path gates Windows junction creation on --allow-unsafe-links (see `cli/src/command/extract.rs` junction arm), but the flag's help text still advertised only "symbolic links and hard links". A user reading `pna extract --help` or `pna compat bsdtar --help` today would not know the flag also covers NTFS junctions. Rewrite the four help strings (positive/negative x extract/bsdtar) to list symbolic links, hard links, and Windows junctions together, and regenerate the committed CLI reference so docs stay in sync.
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces support for Windows NTFS junctions, enabling them to be detected during archive creation and restored during extraction. Junctions are represented using the PNA encoding for hard links with a directory target type. The implementation includes new Windows-specific utilities for reparse point manipulation and updates to the CLI's path editing and extraction logic. Review feedback identified a bug in the junction creation logic involving incorrect NT namespace prefixing and an invalid string literal, as well as a potential extraction failure when canonicalizing targets that do not yet exist.
| let mut subst_wide: Vec<u16> = r"\??\".encode_utf16().collect(); | ||
| subst_wide.extend(target.as_os_str().encode_wide()); | ||
| let print_wide: Vec<u16> = target.as_os_str().encode_wide().collect(); |
There was a problem hiding this comment.
The target path provided to create_junction may come from std::fs::canonicalize, which on Windows prepends the verbatim prefix \\?\. If this prefix is present, prepending \??\ results in an invalid NT namespace path like \??\\\?\C:\.... Additionally, the raw string literal r"\??\" used in the original code is invalid in Rust because raw strings cannot end with an odd number of backslashes.
You should strip the \\?\ prefix (if present) from the target path and use a valid string literal for the NT prefix.
| let mut subst_wide: Vec<u16> = r"\??\".encode_utf16().collect(); | |
| subst_wide.extend(target.as_os_str().encode_wide()); | |
| let print_wide: Vec<u16> = target.as_os_str().encode_wide().collect(); | |
| let mut target_wide: Vec<u16> = target.as_os_str().encode_wide().collect(); | |
| if target_wide.starts_with(&[92, 92, 63, 92]) { | |
| target_wide.drain(0..4); | |
| } | |
| let mut subst_wide: Vec<u16> = "\\??\\".encode_utf16().collect(); | |
| subst_wide.extend(&target_wide); | |
| let print_wide = target_wide; |
| match std::fs::canonicalize(&joined) { | ||
| Ok(canon) => canon, | ||
| Err(e) => { | ||
| log::debug!( | ||
| "Failed to canonicalize junction target {}: {}; using raw join", | ||
| joined.display(), | ||
| e | ||
| ); | ||
| joined | ||
| } | ||
| } |
There was a problem hiding this comment.
Using std::fs::canonicalize here will fail if the junction target does not yet exist on disk, which is common when extracting an archive containing both a directory and a junction pointing to it. If canonicalize fails, the code falls back to joined, which might be a relative path. However, create_junction (in reparse.rs) explicitly rejects relative targets.
Consider using std::path::absolute to ensure the path is absolute without requiring it to exist on disk, while still attempting canonicalize for a cleaner path if possible.
let absolute = std::path::absolute(&joined).unwrap_or_else(|_| joined.clone());
match std::fs::canonicalize(&absolute) {
Ok(canon) => canon,
Err(e) => {
log::debug!(
"Failed to canonicalize junction target {}: {}; using absolute path",
absolute.display(),
e
);
absolute
}
}
No description provided.