ProjectSnapshot - model, orchestrator, tests #5431

@ravi-kumar-pilla

Description

Implement the ProjectSnapshot dataclass and _build_project_snapshot() orchestrator that assembles all sub-snapshots into a single complete view of the project. This is the internal composition layer consumed by get_project_snapshot() in #5432.

Context

The following builders from prior PRs feed into this orchestrator:

Two functions deferred from earlier PRs as premature now belong here:

Design decisions:

  • bootstrap_project() rather than _get_project_metadata(): bootstrap_project() additionally calls configure_project(), which populates the global pipelines mapping required by _build_pipeline_snapshots().
  • The config loader is created once: _make_config_loader uses settings.CONFIG_LOADER_CLASS (not a hardcoded OmegaConfigLoader) and settings.CONFIG_LOADER_ARGS; the single instance is passed to both _build_dataset_snapshots and _build_parameter_keys.
  • _resolve_factory_patterns uses CatalogConfigResolver.resolve_pattern() only. This resolves template variables ({namespace}, {name}) for pipeline-referenced datasets that match a factory pattern but have no explicit catalog entry. It does not perform credential resolution, so it is safe in a credentials-free context. The input datasets dict is not mutated.
  • Parameters are list[str]: keys only ("parameters", "params:model_options", "params:model_options.test_size"), never values. Nested keys are recursively flattened and sorted.
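
To make the keys-only parameter format concrete, here is a minimal sketch of the flattening step. It is not the implementation from this issue: the function name flatten_parameter_keys and the choice to always emit the top-level "parameters" key (even for an empty dict) are assumptions made for illustration, and the sketch takes an already-loaded dict instead of a config loader.

```python
from typing import Any


def flatten_parameter_keys(params: dict[str, Any]) -> list[str]:
    """Return sorted parameter keys only; values are never included.

    Hypothetical helper: takes an already-loaded parameters dict rather
    than a config loader, to keep the sketch free of Kedro imports.
    """
    # Assumption: the top-level "parameters" key is always present.
    keys = {"parameters"}

    def _walk(node: dict[str, Any], prefix: str) -> None:
        for name, value in node.items():
            dotted = f"{prefix}.{name}" if prefix else name
            keys.add(f"params:{dotted}")
            # Recurse into nested dicts so every level gets its own key.
            if isinstance(value, dict):
                _walk(value, dotted)

    _walk(params, "")
    return sorted(keys)


example = {"model_options": {"test_size": 0.2, "random_state": 3}}
for key in flatten_parameter_keys(example):
    print(key)
```

Collecting into a set before sorting keeps the output deterministic regardless of the order in which configuration files were merged.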

Scope & Deliverables:

  • kedro/inspection/models.py - Add ProjectSnapshot:
@dataclass
class ProjectSnapshot:
    metadata: ProjectMetadataSnapshot
    pipelines: list[PipelineSnapshot]
    datasets: dict[str, DatasetSnapshot]
    parameters: list[str] = field(default_factory=list)
  • kedro/inspection/snapshot.py - Add:
def _build_project_snapshot(project_path: Path, env: str) -> ProjectSnapshot:
    # orchestrator: bootstrap_project → config_loader → all builders → ProjectSnapshot
    ...
  • kedro/inspection/helper.py - Add:

Move any helpers that are not snapshot builders here

def _make_config_loader(project_path: Path, env: str) -> AbstractConfigLoader:
    # uses settings.CONFIG_LOADER_CLASS and settings.CONFIG_LOADER_ARGS
    ...

def _get_parameter_keys(config_loader: AbstractConfigLoader) -> list[str]:
    # sorted list of "parameters" + all "params:..." keys, no values
    ...

def _resolve_factory_patterns(
    datasets: dict[str, DatasetSnapshot],
    conf_catalog: dict[str, Any],
    pipelines: list[PipelineSnapshot],
) -> dict[str, DatasetSnapshot]:
    # enriches datasets with pipeline-referenced names that match factory patterns
    # uses CatalogConfigResolver.resolve_pattern() only, no credential resolution
    # does not mutate input datasets dict
    ...
  • tests/inspection/test_project_snapshot.py

Metadata

Labels

Issue: Feature Request (New feature or improvement to existing feature)

Projects

Status

In Progress
