Labels: Issue: Feature Request (new feature or improvement to existing feature)
Description
Implement the ProjectSnapshot dataclass and _build_project_snapshot() orchestrator that assembles all sub-snapshots into a single complete view of the project. This is the internal composition layer consumed by get_project_snapshot() in #5432.
Context
The following builders from prior PRs feed into this orchestrator:
- _build_project_metadata_snapshot(metadata) → ProjectMetadataSnapshot (ProjectMetadataSnapshot: model, builder, tests #5428)
- _build_pipeline_snapshots() → list[PipelineSnapshot] (NodeSnapshot + PipelineSnapshot - models, builder, tests #5429)
- _build_dataset_snapshots(config_loader) → dict[str, DatasetSnapshot] (DatasetSnapshot - model, builder, tests #5430)
Two functions deferred from earlier PRs as premature now belong here:
- _make_config_loader(project_path, env): removed from "DatasetSnapshot - model, builder, tests" (#5430). It lives here because the orchestrator creates the loader once and passes it to both _build_dataset_snapshots and _get_parameter_keys.
- _get_parameter_keys(config_loader): removed from "DatasetSnapshot - model, builder, tests" (#5430). Parameters are a list[str] of keys with no values exposed, which is a separate concern from DatasetSnapshot.
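The create-once, pass-to-both flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed implementation: the collaborators are injected as plain callables so the example runs without Kedro, the public name build_project_snapshot and every fake_* helper are hypothetical, and the snapshot fields are typed as Any instead of the real snapshot models.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable


@dataclass
class ProjectSnapshot:
    # Simplified stand-in: fields are Any here, not the real *Snapshot models.
    metadata: Any
    pipelines: list
    datasets: dict
    parameters: list = field(default_factory=list)


def build_project_snapshot(
    project_path: Path,
    env: str,
    *,
    bootstrap: Callable[[Path], Any],
    make_config_loader: Callable[[Path, str], Any],
    build_metadata: Callable[[Any], Any],
    build_pipelines: Callable[[], list],
    build_datasets: Callable[[Any], dict],
    get_parameter_keys: Callable[[Any], list],
) -> ProjectSnapshot:
    # Bootstrap first, then create the config loader exactly once.
    metadata = bootstrap(project_path)
    config_loader = make_config_loader(project_path, env)
    return ProjectSnapshot(
        metadata=build_metadata(metadata),
        pipelines=build_pipelines(),
        # The same loader instance feeds both consumers.
        datasets=build_datasets(config_loader),
        parameters=get_parameter_keys(config_loader),
    )


# Hypothetical stand-ins that record how the loader is created and used.
loaders_created = []
loaders_seen = []


def fake_make_config_loader(project_path: Path, env: str) -> dict:
    loader = {"env": env}
    loaders_created.append(loader)
    return loader


def fake_build_datasets(loader: dict) -> dict:
    loaders_seen.append(loader)
    return {"companies": "dataset snapshot placeholder"}


def fake_get_parameter_keys(loader: dict) -> list:
    loaders_seen.append(loader)
    return ["parameters"]


snapshot = build_project_snapshot(
    Path("."),
    "local",
    bootstrap=lambda path: {"project_name": "demo"},
    make_config_loader=fake_make_config_loader,
    build_metadata=lambda metadata: metadata,
    build_pipelines=lambda: [],
    build_datasets=fake_build_datasets,
    get_parameter_keys=fake_get_parameter_keys,
)
```

Injecting the collaborators is purely a device for making the example self-contained; the actual orchestrator would presumably call the private builders directly.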
Design decisions:
- bootstrap_project(), not _get_project_metadata(): bootstrap_project() additionally calls configure_project(), which populates the global pipelines mapping required by _build_pipeline_snapshots().
- Config loader created once: _make_config_loader uses settings.CONFIG_LOADER_CLASS (not a hardcoded OmegaConfigLoader) and settings.CONFIG_LOADER_ARGS; the single instance is passed to both _build_dataset_snapshots and _get_parameter_keys.
- _resolve_factory_patterns uses CatalogConfigResolver.resolve_pattern() only — this resolves template variables ({namespace}, {name}) for pipeline-referenced datasets that match a factory pattern but have no explicit catalog entry. It does NOT use credential resolution, so it is safe in a credentials-free context. Input datasets dict is not mutated.
- Parameters are list[str] — keys only ("parameters", "params:model_options", "params:model_options.test_size"), never values. Nested keys are recursively flattened and sorted.
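The keys-only flattening of parameters can be illustrated with a small sketch. It assumes the parameters have already been loaded into a plain nested dict; the function name flatten_parameter_keys is hypothetical, and always including the top-level "parameters" entry (even for an empty dict) is an assumption made here.

```python
from typing import Any


def flatten_parameter_keys(params: dict[str, Any]) -> list[str]:
    """Return sorted parameter keys only; values are never included."""
    keys = {"parameters"}  # assumption: the top-level entry is always listed

    def _walk(node: dict[str, Any], prefix: str) -> None:
        for name, value in node.items():
            path = f"{prefix}.{name}" if prefix else str(name)
            keys.add(f"params:{path}")
            # Recurse into nested mappings; any other value is a leaf.
            if isinstance(value, dict):
                _walk(value, path)

    _walk(params, "")
    return sorted(keys)


example_keys = flatten_parameter_keys(
    {"model_options": {"test_size": 0.2, "features": ["a", "b"]}}
)
# example_keys == [
#     "parameters",
#     "params:model_options",
#     "params:model_options.features",
#     "params:model_options.test_size",
# ]
```

Lists are treated as leaves in this sketch, so only dict nesting produces dotted keys.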
Scope & Deliverables:
kedro/inspection/models.py - Add ProjectSnapshot:

    from dataclasses import dataclass, field

    @dataclass
    class ProjectSnapshot:
        metadata: ProjectMetadataSnapshot
        pipelines: list[PipelineSnapshot]
        datasets: dict[str, DatasetSnapshot]
        parameters: list[str] = field(default_factory=list)

kedro/inspection/snapshot.py - Add:

    def _build_project_snapshot(project_path: Path, env: str) -> ProjectSnapshot:
        # orchestrator: bootstrap_project → config_loader → all builders → ProjectSnapshot
        ...

kedro/inspection/helper.py - Add (move any helpers that are not snapshot builders here):

    def _make_config_loader(project_path: Path, env: str) -> AbstractConfigLoader:
        # uses settings.CONFIG_LOADER_CLASS and settings.CONFIG_LOADER_ARGS
        ...

    def _get_parameter_keys(config_loader: AbstractConfigLoader) -> list[str]:
        # sorted list of "parameters" + all "params:..." keys, no values
        ...

    def _resolve_factory_patterns(
        datasets: dict[str, DatasetSnapshot],
        conf_catalog: dict[str, Any],
        pipelines: list[PipelineSnapshot],
    ) -> dict[str, DatasetSnapshot]:
        # enriches datasets with pipeline-referenced names that match factory patterns
        # uses CatalogConfigResolver.resolve_pattern() only, no credential resolution
        # does not mutate input datasets dict
        ...

tests/inspection/test_project_snapshot.py
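To make the factory-pattern enrichment more concrete, here is a rough standard-library sketch of the idea: match a pipeline-referenced dataset name against a pattern such as "{namespace}.{name}" and fill the captured values into the config template. It is not how CatalogConfigResolver.resolve_pattern() is implemented. The names resolve_factory_patterns, _pattern_to_regex and _fill are hypothetical, plain dicts stand in for DatasetSnapshot, dataset names are passed in directly instead of being read from PipelineSnapshot objects, and the first matching pattern wins, which is a simplification.

```python
import re
from typing import Any


def _pattern_to_regex(pattern: str) -> re.Pattern:
    # re.split with a capture group alternates literal text and placeholder names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = "".join(
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+?)"
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{regex}$")


def _fill(template: Any, values: dict[str, str]) -> Any:
    # Recursively substitute {placeholders} in strings, dicts and lists.
    if isinstance(template, str):
        for key, value in values.items():
            template = template.replace("{" + key + "}", value)
        return template
    if isinstance(template, dict):
        return {k: _fill(v, values) for k, v in template.items()}
    if isinstance(template, list):
        return [_fill(v, values) for v in template]
    return template


def resolve_factory_patterns(
    datasets: dict[str, Any],
    conf_catalog: dict[str, Any],
    dataset_names: list[str],
) -> dict[str, Any]:
    resolved = dict(datasets)  # copy, so the input dict is not mutated
    patterns = {k: v for k, v in conf_catalog.items() if "{" in k}
    for name in dataset_names:
        # Skip explicit entries and parameter inputs.
        if name in resolved or name == "parameters" or name.startswith("params:"):
            continue
        for pattern, config in patterns.items():
            match = _pattern_to_regex(pattern).match(name)
            if match:
                resolved[name] = _fill(config, match.groupdict())
                break  # simplification: first matching pattern wins
    return resolved


catalog = {
    "{namespace}.{name}": {
        "type": "pandas.CSVDataset",
        "filepath": "data/{namespace}/{name}.csv",
    }
}
explicit = {"companies": {"type": "pandas.CSVDataset"}}
enriched = resolve_factory_patterns(
    explicit, catalog, ["companies", "reporting.sales", "params:test_size"]
)
# enriched["reporting.sales"]["filepath"] == "data/reporting/sales.csv"
```

No credentials are touched at any point in this sketch; only the name template and the config template are involved, which mirrors the credentials-free constraint stated above.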
Status: In Progress