Conversation
There was a problem hiding this comment.
Pull request overview
This PR consolidates all Flash local state under the .flash/ directory, eliminating the previous split paradigm where some state was stored in .runpod/ and other state in .flash/. This is a clean refactoring that improves consistency and simplifies the directory structure for the project.
Changes:
- Updated all source code references from
.runpod/to.flash/for resource state files and container archives - Removed
.runpod/from ignore patterns across all configuration files and templates - Renamed test fixtures and updated test paths to reflect the new directory structure
- Updated all documentation references to use
.flash/resources.pklinstead of.runpod/resources.pkl
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/runpod_flash/core/resources/resource_manager.py |
Changed RUNPOD_FLASH_DIR constant from .runpod to .flash |
src/runpod_flash/cli/commands/run.py |
Updated _RESOURCE_STATE_FILE path from .runpod/resources.pkl to .flash/resources.pkl |
src/runpod_flash/cli/commands/preview.py |
Changed CONTAINER_ARCHIVE_PATH from /root/.runpod/artifact.tar.gz to /root/.flash/artifact.tar.gz |
src/runpod_flash/cli/utils/ignore.py |
Removed .runpod/ from the always_ignore patterns list |
src/runpod_flash/cli/commands/undeploy.py |
Updated docstring reference from .runpod/resources.pkl to .flash/resources.pkl |
.gitignore |
Consolidated ignore patterns to only use .flash/ instead of separate .runpod/ and .flash/logs/ entries |
src/runpod_flash/cli/utils/skeleton_template/.gitignore |
Removed .runpod/ from template ignore patterns |
src/runpod_flash/cli/utils/skeleton_template/.flashignore |
Changed .runpod/ to .flash/ in flashignore patterns |
tests/conftest.py |
Renamed worker_runpod_dir fixture to worker_flash_dir and updated all path references |
tests/unit/resources/test_resource_manager.py |
Updated mock resource file path from .runpod/resources.pkl to .flash/resources.pkl |
tests/unit/cli/commands/build_utils/test_scanner.py |
Removed test for .runpod directory exclusion (now redundant with existing .flash test) |
src/runpod_flash/cli/docs/flash-undeploy.md |
Updated all documentation references from .runpod/resources.pkl to .flash/resources.pkl |
src/runpod_flash/cli/docs/flash-run.md |
Updated documentation references from .runpod/resources.pkl to .flash/resources.pkl |
docs/Flash_Deploy_Guide.md |
Updated all technical documentation references from .runpod/resources.pkl to .flash/resources.pkl |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
93543f4 to
3fa85d8
Compare
QA ReportStatus: PASS Targeted Test Results
Full Suite Results
All 12 failures are in files NOT touched by this PR and reproduce on the base branch. CI passes across all 5 Python versions (3.10-3.14). Migration Completeness
PR Diff AnalysisFiles changed: 14 (7 source, 3 docs, 2 templates, 2 test files) What changed:
Backward compatibility note: There is no migration logic for users who have an existing
This is acceptable for a development tool where re-deploy is cheap, but worth noting in release notes. RecommendationMERGE — Clean migration with no regressions. All directory path references are consistently updated. The lack of backward compatibility for Generated by flash-qa agent |
QA Report — PR #221: Migrate
|
| Check | Result |
|---|---|
ruff format --check |
PASS (1290 files formatted) |
ruff check |
PASS (all checks passed) |
Test Results
| Suite | Passed | Failed | Skipped | Result |
|---|---|---|---|---|
Full (tests/ -n 4) |
1239 | 14 | 1 | See below |
| Changed files only | 60 | 0 | 0 | PASS |
All 14 failures are pre-existing (not introduced by this PR):
| Test File | Failures | Status |
|---|---|---|
test_class_execution_integration.py |
2 | P4 known — _class_type / _constructor_args AttributeError |
test_remote_concurrency.py |
10 | P4 known — asyncio.gather unhashable/not-awaitable TypeError |
test_file_locking.py |
2 | Pre-existing flaky — platform-dependent file lock timing |
Zero new failures introduced by this PR.
Flaky Test Check
test_resource_manager.py — 3/3 runs passed (19/19 tests each). No flakiness detected.
Coverage
| Metric | Value |
|---|---|
| Overall coverage | 72.27% |
| Threshold | 65% |
| Status | PASS |
No coverage regression from this PR.
PR Diff Analysis
Completeness Checklist
| Item | Status | Notes |
|---|---|---|
All .runpod/ dir refs replaced in src/ |
PASS | Zero remaining .runpod/ path refs in source |
All .runpod/ dir refs replaced in tests/ |
PASS | Zero remaining .runpod/ path refs in tests |
All .runpod/ dir refs replaced in docs |
PASS | Flash_Deploy_Guide.md updated correctly |
.gitignore updated |
PASS | .runpod/ removed, .flash/ covers everything |
| Skeleton templates updated | PASS | Both .flashignore and .gitignore templates updated |
ignore.py always-ignore list updated |
PASS | .runpod/ entry removed (.flash/ already present) |
resource_manager.py constant updated |
PASS | RUNPOD_FLASH_DIR = Path(".flash") |
run.py state file path updated |
PASS | _RESOURCE_STATE_FILE = Path(".flash") / "resources.pkl" |
preview.py container archive path updated |
PASS | CONTAINER_ARCHIVE_PATH = "/root/.flash/artifact.tar.gz" |
undeploy.py docstring updated |
PASS | References .flash/resources.pkl |
conftest.py fixture names/paths updated |
PASS | worker_runpod_dir -> worker_flash_dir |
| Scanner test updated | PASS | test_exclude_runpod_directory removed (redundant with test_exclude_flash_directory) |
| Test resource manager paths updated | PASS | tmp_path / ".flash" / "resources.pkl" |
Findings
-
No migration/backward compatibility: The PR explicitly states "No legacy/migration handling." Users with existing
.runpod/resources.pklwill lose their cached resource state. This is acceptable for a development-time cache (resources can be re-provisioned), but could cause a one-time re-deployment for users upgrading. This should be noted in release notes. -
Container archive path change (
preview.py): Changed from/root/.runpod/artifact.tar.gzto/root/.flash/artifact.tar.gz. This is only used byflash deploy --preview(Docker Compose local preview). The flash-worker Docker images that handle production deployment do NOT reference this path — they useFLASH_*environment variables. No cross-repo impact. -
Remaining
.runpodstrings are all correct: All remaining matches are RunPod API URLs (api.runpod.ai,console.runpod.io,docs.runpod.io) or Python module imports (from runpod_flash.core.api.runpod). These are unrelated to the.runpod/directory. -
Test removed (
test_exclude_runpod_directory): This test verified that the scanner excluded.runpod/directories. Since.runpod/is no longer a project directory, this test is correctly removed. The equivalenttest_exclude_flash_directorytest remains and passes. -
No anti-patterns detected: No bare except, no print statements, no mutable defaults, no hardcoded secrets, no missing awaits.
Known Issue Check
| Known Issue | Affected? |
|---|---|
P4: test_class_execution_integration.py failures |
Pre-existing, not related to PR |
P4: test_remote_concurrency.py failures |
Pre-existing, not related to PR |
test_file_locking.py flaky tests |
Pre-existing, not related to PR |
Recommendation
APPROVE — Clean migration with complete coverage of all .runpod/ directory references. All changed tests pass consistently. No new failures, no coverage regression, no lint/format issues. The only consideration is documenting the lack of backward migration in release notes for users with existing .runpod/ state directories.
d6e172d to
af7dffc
Compare
runpod-Henrik
left a comment
There was a problem hiding this comment.
Code Review — PR #221: .runpod/ → .flash/ migration
Large refactor with significant overlap with PRs #220 and #235. Found three high-severity issues:
Bug 1 (HIGH): Authorization header sent to presigned S3 URLs — breaks upload AND download
app.py — upload_build() and download_tarball()
Both methods switched from hand-crafted requests calls (no Authorization) to get_authenticated_requests_session(), which injects Authorization: Bearer <key> as a default session header.
# OLD — no auth sent to S3 presigned URL
headers = {"User-Agent": get_user_agent(), "Content-Type": TARBALL_CONTENT_TYPE}
resp = requests.put(url, data=fh, headers=headers)
# NEW — auth header sent to S3 presigned URL
with get_authenticated_requests_session() as session:
session.headers["Content-Type"] = TARBALL_CONTENT_TYPE
resp = session.put(url, data=fh) # sends Authorization: Bearer ... to S3AWS S3 rejects requests with both query-string signature AND Authorization header → HTTP 400 InvalidArgument: "Only one auth mechanism allowed". Build uploads and artifact downloads will fail for all users.
Fix: Use a plain session (no auth) for presigned URL requests, or add include_auth=False parameter to get_authenticated_requests_session().
Bug 2 (HIGH): No migration for existing .runpod/resources.pkl — users lose all endpoint state
resource_manager.py
RUNPOD_FLASH_DIR = Path(".flash") # was Path(".runpod")No migration code. Existing .runpod/resources.pkl with tracked endpoints is silently orphaned. flash undeploy lists nothing. Endpoints keep running and billing on RunPod with no CLI management.
Fix: Add migration on first startup:
_LEGACY_STATE_FILE = Path(".runpod") / "resources.pkl"
if not RESOURCE_STATE_FILE.exists() and _LEGACY_STATE_FILE.exists():
RUNPOD_FLASH_DIR.mkdir(parents=True, exist_ok=True)
shutil.copy2(_LEGACY_STATE_FILE, RESOURCE_STATE_FILE)
log.info("Migrated state from .runpod/ to .flash/")Bug 3 (HIGH): Per-request API key propagation from LB to workers silently removed
Deleted: api_key_context.py, extract_api_key_middleware in lb_handler.py
The middleware extracted the caller's API key from Authorization: Bearer header and propagated it to downstream QB worker calls via ContextVar. Now removed — worker calls fall back to the server-process RUNPOD_API_KEY env var only.
In multi-tenant or API-key-scoped deployments, worker calls use the wrong key. When the server process has no RUNPOD_API_KEY, worker calls fail with 401 even if the client sent a valid key.
Fix: Either preserve the per-request key propagation, or document that LB now requires RUNPOD_API_KEY set server-side.
Bug 4 (MEDIUM): Container archive path renamed but Docker images not updated
preview.py
CONTAINER_ARCHIVE_PATH = "/root/.flash/artifact.tar.gz" # was /root/.runpod/If Flash worker Docker images still look for /root/.runpod/artifact.tar.gz, flash deploy --preview will silently fail — containers start but find no user code.
Fix: Verify and update Docker image entrypoints in sync with this PR.
7a4cc55 to
48b0653
Compare
48b0653 to
3ba9612
Compare
runpod-Henrik
left a comment
There was a problem hiding this comment.
1. State migration gap — existing deployments lose tracking
The PR description notes "No legacy/migration handling" by design, but the user impact is worth stating explicitly for release planning:
Any user with endpoints deployed under the old SDK will have their state at .runpod/resources.pkl. After upgrading, flash undeploy looks in .flash/resources.pkl and finds nothing — those endpoints become untracked and continue billing. There's no warning when the old directory is detected.
This isn't asking for migration code if it's out of scope, but it should be in the upgrade notes: users should run flash undeploy before upgrading, or manually move .runpod/resources.pkl → .flash/resources.pkl.
2. Question: CONTAINER_ARCHIVE_PATH in preview — cross-service contract
# preview.py:20
CONTAINER_ARCHIVE_PATH = "/root/.flash/artifact.tar.gz"This path is the bind-mount destination inside the container:
f"{archive_path}:{CONTAINER_ARCHIVE_PATH}:ro"If the flash-worker Docker image reads the artifact from the hardcoded path /root/.runpod/artifact.tar.gz, the preview path would silently fail (container starts but can't find its artifact). Has the Docker image been updated to match? If not, flash deploy --preview is broken.
3. .runpod/ no longer excluded from build artifacts
# ignore.py — removed:
".runpod/",During the upgrade window, users may have both .runpod/ (old) and .flash/ (new) directories. With this change, .runpod/ contents get picked up by the scanner and potentially included in build artifacts. Low blast radius (temporary state window), but .runpod/resources.pkl contains pickle-serialized endpoint state — unexpected inclusion in a build is worth noting.
4. Testing — clean
The test_exclude_runpod_directory removal is matched by a new test_exclude_flash_directory — good. isolate_resource_state_file fixture correctly redirects state to .flash/ in tests.
Verdict
PASS with one question to resolve before merge (item 2 — Docker image contract) and one release note item (item 1 — upgrade path). The rename itself is clean and consistent across all touched files.
3ba9612 to
5c54f92
Compare
Rename all references from .runpod to .flash across the codebase: - Update CLI commands, docs, and skeleton templates - Update .gitignore and .flashignore patterns - Update ResourceManager config directory path - Update test fixtures and conftest helpers - Remove obsolete scanner tests for deleted .runpod patterns
5c54f92 to
e34fac9
Compare
Summary
.flash/directory, eliminating the split.runpod/vs.flash/paradigmresources.pkl), container archive path, ignore patterns, skeleton templates, tests, and docsCloses AE-2257
Changed files
Source (4 files):
resource_manager.py,run.py,preview.py,ignore.pyConfig/templates (3 files):
.gitignore,skeleton_template/.gitignore,skeleton_template/.flashignoreTests (3 files):
conftest.py,test_resource_manager.py,test_scanner.pyDocs (4 files):
Flash_Deploy_Guide.md,flash-run.md,flash-undeploy.md,undeploy.pydocstringTest plan
make quality-checkpasses (format, lint, typecheck, tests, coverage).runpoddirectory references in codebase (verified via grep)