Skip to content

MAINT: Standardize path handling on pathlib#1815

Merged
romanlutz merged 3 commits into
microsoft:mainfrom
romanlutz:romanlutz/standardize-pathlib-path
May 27, 2026
Merged

MAINT: Standardize path handling on pathlib#1815
romanlutz merged 3 commits into
microsoft:mainfrom
romanlutz:romanlutz/standardize-pathlib-path

Conversation

@romanlutz
Copy link
Copy Markdown
Contributor

Summary

Standardizes path handling on pathlib.Path across four files, eliminating stray os.path.* usage. Per the general rule, Path is used end-to-end and only round-tripped to str at API boundaries that genuinely require strings (markdown link hrefs, display strings, etc.).

Files changed

  • pyrit/backend/routes/media.py ⚠️ security-sensitive — see note below.
  • pyrit/output/conversation/markdown.py_format_link_path, _maybe_blur_image_on_disk, _blurred_destination. Kept import os because os.PathLike (type hints), os.fspath, os.getpid, and os.replace are still used.
  • pyrit/backend/mappers/attack_mappers.py — two sites (_resolve_media_url, _build_filename); removed import os, added from pathlib import Path.
  • pyrit/models/storage_io.py — routed path_exists, is_file, and create_directory_if_not_exists through the existing _convert_to_path helper and replaced os.path.exists / os.path.isfile / os.makedirs with Path methods; removed import os.

pyrit/datasets/jailbreak/text_jailbreak.py was intentionally left alone — the str(template_path) calls there are deliberate boundary conversions for the template_source: str field used in logging and error messages.

⚠️ Security-sensitive change in media.py — please review carefully

The path-traversal protection in _validate_media_path was rewritten to use pathlib. Symlink-resolution semantics are preserved:

  • os.path.realpath(path)Path(path).resolve(strict=False). Both resolve symlinks and normalize .. components; strict=False matches realpath's behavior for non-existent paths.
  • The previous prefix-string check (startswith(allowed_root + os.sep)) was replaced with real_path.relative_to(allowed_root). relative_to raises ValueError when real_path is not under allowed_root, which is now caught and converted to the same HTTPException 403 as before. This is equivalent to (and arguably stricter than) the old prefix check — it does not allow sibling directory names that happen to share a prefix.
  • _, ext = os.path.splitext(real_path)real_path.suffix.
  • The function now takes/returns Path instead of str (private helper; the only caller was updated).

A new symlink-escape test was added (test_rejects_symlink_pointing_outside_results) that creates a symlink under an allowed subdirectory pointing outside results_path and asserts the request is rejected with 403. The test skips on platforms where the user can't create symlinks (Windows without developer mode / admin).

The existing ..-escape test (test_rejects_path_traversal) still passes unchanged.

Behavior note: _format_link_path in markdown.py

os.path.relpath(path) was replaced with Path(path).relative_to(Path.cwd()) plus a ValueError fallback to str(Path(path).resolve()). This is a small behavior change: os.path.relpath produces ../../...-style relative paths when the target is not under cwd, whereas relative_to raises and we fall back to the absolute path. The output is still a valid path for the markdown link href — just absolute instead of relative-with-dotdots when the image lives outside the current working directory.

The _expected_link helper in tests/unit/output/test_blur_images.py was updated to mirror the new behavior, and test_markdown_format_image_content_handles_cross_drive_path now patches Path.relative_to instead of os.path.relpath. Test patches in tests/unit/models/test_storage_io.py were similarly updated to target pathlib.Path methods instead of os.path.* / os.makedirs.

Verification

  • uv run pytest tests/unit -q -n 48049 passed, 119 skipped.
  • uv run pre-commit run --files <changed files> → all hooks pass.
  • grep "os\.path\." pyrit/backend/routes/media.py pyrit/output/conversation/markdown.py pyrit/backend/mappers/attack_mappers.py pyrit/models/storage_io.py → zero hits.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread tests/unit/backend/test_media_route.py
Comment thread tests/unit/backend/test_media_route.py
Comment thread pyrit/output/conversation/markdown.py
Comment thread tests/unit/output/test_blur_images.py Outdated
Copy link
Copy Markdown
Contributor

@jsong468 jsong468 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small NITS, thanks!

romanlutz and others added 2 commits May 26, 2026 16:31
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@romanlutz romanlutz enabled auto-merge May 26, 2026 23:35
@romanlutz romanlutz added this pull request to the merge queue May 26, 2026
Merged via the queue into microsoft:main with commit ba02626 May 27, 2026
48 checks passed
@romanlutz romanlutz deleted the romanlutz/standardize-pathlib-path branch May 27, 2026 00:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants