Fix streamlit deploy failure when pages/*.py overlaps with pages/ directory#2780
Fix streamlit deploy failure when pages/*.py overlaps with pages/ directory#2780sfc-gh-moczko wants to merge 3 commits intomainfrom
Conversation
5ee660a to
26c478a
Compare
35039f3 to
a4780e7
Compare
| if stage_path.is_user_stage(): | ||
| path = StagePath.get_user_stage() / file["name"] | ||
| else: | ||
| elif stage_path.is_git_repo(): |
|
Please, do not remove Pre-review checklist
|
| type="streamlit", | ||
| identifier="test_streamlit", | ||
| main_file="streamlit_app.py", | ||
| artifacts=["streamlit_app.py", "pages/", "pages/*.py"], |
There was a problem hiding this comment.
I've done some testing and this fails with:
│ Multiple file or directories were mapped to one output destination. │
│
│ destination = pages
if the order is swapped to ["streamlit_app.py", "pages/*.py", "pages/"]. I think we should make the behaviour consistent so that the order of artifacts does not matter
There was a problem hiding this comment.
Fixed — the directory-to-directory branch in _ArtifactPathMap.put() now handles the case where individual files from a glob were registered first. Instead of immediately raising TooManyFilesError, it checks whether all children of the new directory source are consistent with existing mappings and skips the duplicate if they match. Added test_bundle_map_deduplicates_glob_and_directory_overlap and test_bundle_deduplicates_pages_glob_and_directory to cover both orderings.
There was a problem hiding this comment.
@sfc-gh-moczko Hm, I don't see any changes compared to the last time I reviewed and these two new tests are missing - did you push the changes?
There was a problem hiding this comment.
@sfc-gh-kolszewski Good catch! I've just pushed the changes (commit adbcc79). The implementation is now complete with both tests you mentioned:
Tests added:
-
test_bundle_map_deduplicates_glob_and_directory_overlap(tests/api/artifacts/test_bundle_map.py:484)- Unit test for glob-first ordering (file added before directory)
-
test_bundle_deduplicates_pages_glob_and_directory(tests/streamlit/test_streamlit_entity.py:102)- End-to-end test with
artifacts=["streamlit_app.py", "pages/*.py", "pages/"] - Tests the exact scenario you reported
- End-to-end test with
Fix details:
The issue was in _ArtifactPathMap.put() (bundle_map.py:447-465). When individual files are added via glob before
their parent directory, _update_dest_is_dir() recursively marks the parent as a directory. The old code immediately
raised TooManyFilesError when the directory was later added.
The fix walks the directory to verify all children either:
- Are not yet mapped, OR
- Map from the same child source
If consistent, it skips the duplicate via early return. If any child maps from a different source, it raises
TooManyFilesError as before.
Result: Both orderings now work correctly:
["pages/", "pages/*.py"]— directory-first ✓ (existing code path)["pages/*.py", "pages/"]— glob-first ✓ (new code path)
Let me know if you'd like me to clarify anything!
|
Done — added the pre-review checklist to the PR description. |
a4780e7 to
27e803a
Compare
adbcc79 to
f799315
Compare
…ectory (#2741) When both `pages/` and `pages/*.py` appear in artifacts, the bundle map created duplicate destination mappings that caused `NotInDeployRootError` during symlink resolution. Fix by tracking directory-walked children in `__dest_to_src` to detect and deduplicate overlapping mappings, and by filtering redundant `additional_source_files` during V1-to-V2 definition conversion. .... Generated with [Cortex Code](https://docs.snowflake.com/en/user-guide/cortex-code/cortex-code) Co-Authored-By: Cortex Code <noreply@snowflake.com>
When individual files are added via glob (e.g., pages/*.py) before their parent directory (pages/), the directory destination is already marked in _dest_is_dir due to recursive parent marking in _update_dest_is_dir(). The previous code immediately raised TooManyFilesError, even though the mappings were from the same source and thus not a conflict. Changes: - _ArtifactPathMap.put(): Replace immediate error with consistency check when current_is_dir=True. Walk the directory to verify all children either (a) are not yet mapped, or (b) map from the same child source. If consistent, skip duplicate via early return. If any child maps from a different source, raise TooManyFilesError. - test_bundle_map_deduplicates_glob_and_directory_overlap: Unit test for glob-first ordering (adds file, then directory). - test_bundle_deduplicates_pages_glob_and_directory: End-to-end test with artifacts=["streamlit_app.py", "pages/*.py", "pages/"] to verify StreamlitEntity.bundle() handles both orderings. Addresses reviewer feedback: artifact order is now consistent. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
f799315 to
cec4b50
Compare
…appings The previous fix walked the new source directory to detect conflicts, but missed cases where existing dest files came from sources outside the new source directory. Reverse the check: iterate existing dest mappings and verify each is consistent with what the new source would produce. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sfc-gh-jwilkowski
left a comment
There was a problem hiding this comment.
Thanks for your contribution! Please address one issue with local imports and add a note to RELEASE_NOTES.md about this change
| from snowflake.cli.api.project.definition_conversion import ( | ||
| convert_streamlit_to_v2_data, | ||
| ) | ||
| from snowflake.cli.api.project.schemas.v1.streamlit.streamlit import Streamlit |
There was a problem hiding this comment.
Are those local imports needed? Please move them to the top of the file if not
Summary
Fixes #2741.
snow streamlit deployfails withNotInDeployRootErrorwhen bothpages/(directory) andpages/*.py(glob) appear in artifacts and apages/directory exists on disk.Root cause
_ArtifactPathMap.put()walks directory sources to mark children in_dest_is_dir, but does not register those children in__dest_to_src. When a glob likepages/*.pylater resolves to the same files,__dest_to_src.get(dest)returnsNoneinstead of the existing source, so the duplicate mapping is silently added. Duringsymlink_or_copy(), the second mapping callsPath.resolve()on an already-created symlink, which follows it back to the project source directory -- outside the deploy root -- and raisesNotInDeployRootError.Fix
_ArtifactPathMap.put()(bundle_map.py) -- Three changes in theput()method:__dest_to_src: Each file discovered duringos.walkis recorded as__dest_to_src[child_dest] = child_src. If a different source already maps to that destination,TooManyFilesErroris raised immediately.current_source == src, the mapping is redundant (already covered by a parent directory walk), soput()returns without adding a duplicate entry.pages/*.py) are added first and then the parent directory (pages/) is added, the directory destination is already marked in_dest_is_dir. Instead of raisingTooManyFilesError, we now verify all children of the new directory source are consistent with existing mappings and skip the duplicate if they are.convert_streamlit_to_v2_data()(definition_conversion.py) -- Defense-in-depth:Adds
_is_path_covered_by_directory()helper. During V1-to-V2 definition conversion,additional_source_filesentries likepages/*.pyare skipped if they fall under a directory already included as an artifact (e.g.pages/). This prevents the overlapping mappings from reaching_ArtifactPathMapin the first place.Pre-review checklist
Test plan
test_bundle_map_deduplicates_directory_and_glob_overlap-- directory first, then explicit file; verifies duplicate is silently deduplicatedtest_bundle_map_deduplicates_glob_and_directory_overlap-- glob/files first, then directory; verifies reverse order also deduplicatestest_bundle_map_disallows_different_source_collision_with_directory_child-- different source, same dest; verifies TooManyFilesError is raisedtest_bundle_deduplicates_pages_directory_and_glob-- end-to-end StreamlitEntity.bundle() with artifacts in dir-then-glob ordertest_bundle_deduplicates_pages_glob_and_directory-- end-to-end with reversed glob-then-dir ordertest_v1_to_v2_streamlit_conversion_deduplicates_pages-- V1-to-V2 conversion filters overlapping pages/*.pytest_v1_to_v2_streamlit_conversion_keeps_non_overlapping_additional_files-- non-overlapping files preserved