Hierarchical Configuration Environments #3592

@sbrugman

Description

Imagine using Kedro to run pipelines both locally and remotely, with mock and warehouse data.
Development takes place on mock data (shared between the local and remote environments), but the computational configuration differs (e.g. Spark local vs. cluster mode). On the warehouse data, the data definitions are again shared, but some properties vary, e.g. database names.

Because of these two shared layers, the configuration could be far more concise if environments could inherit from one another across multiple levels:

graph TD;
    base["Base (base)"];
    dev["Dev"];
    local["Local (default)"];
    remote["Remote"];
    cluster["Data Warehouse"];
    acc["Acceptance"];
    prd["Production"];
    
    base --> dev --> local;
    dev --> remote;
    base --> cluster --> acc;
    cluster --> prd;

Only the four leaf environments (Local, Remote, Acceptance, Production) are used directly by the Kedro user.
Alternatively, we could reduce the number of environments:

graph TD;
    base["Base (base)"];
    dev["Dev + Remote"];
    local["Local (default)"];
    cluster["Data Warehouse + Acceptance"];
    prd["Production"];
    
    base --> dev --> local;
    base --> cluster --> prd;

This relies on the local environment overriding the remote-specific settings in the combined dev + remote environment.
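To make the override behaviour in the reduced layout concrete, here is a minimal sketch using plain dictionaries. It is not Kedro's loader, and the keys and values (`spark.master`, `catalog.source`, and so on) are made up for illustration. It only shows that the combined dev + remote layer carries the remote settings, and that local replaces the ones that differ.

```python
# Hypothetical contents of the combined "dev + remote" environment.
# Key names and values are illustrative only.
dev_remote = {
    "spark.master": "yarn",
    "spark.executor.instances": 4,
    "catalog.source": "mock",
}

# Hypothetical contents of the "local" environment: only what differs.
local_overrides = {
    "spark.master": "local[*]",
}


def merge_layers(*layers):
    """Merge config layers left to right; later layers win on key collisions."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Running remotely uses the combined layer as-is.
remote_config = merge_layers(dev_remote)

# Running locally inherits everything and overrides the Spark master.
local_config = merge_layers(dev_remote, local_overrides)
```

Note that keys which only make sense remotely (here `spark.executor.instances`) still leak into the local configuration unless local overrides them as well, which is the trade-off of merging the two environments.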

Is this something that could be supported?

Context

As mentioned above, this would simplify the config and avoid duplication of entries.

Possible Implementation

Configuring the base environment takes place in settings.py. The base_env argument could simply accept a dict as well as a string, so that the base can be specified per environment.
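As a rough sketch of what a dict-valued base_env could mean (this is not existing Kedro behaviour; the `BASE_ENV` mapping, the environment names and `resolve_env_chain` are all hypothetical), each environment would name its parent, and the loader would walk up to the root to determine the order in which configuration folders are read:

```python
from typing import Dict, List, Optional

# Hypothetical dict form of base_env: environment -> parent environment.
# None marks the root. The names follow the first diagram in this issue.
BASE_ENV: Dict[str, Optional[str]] = {
    "base": None,
    "dev": "base",
    "local": "dev",
    "remote": "dev",
    "warehouse": "base",
    "acceptance": "warehouse",
    "production": "warehouse",
}


def resolve_env_chain(env: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the environments from the root down to `env`, in load order."""
    chain: List[str] = []
    current: Optional[str] = env
    while current is not None:
        if current in chain:
            raise ValueError(f"Cycle detected in environment hierarchy at {current!r}")
        if current not in parents:
            raise KeyError(f"Unknown environment: {current!r}")
        chain.append(current)
        current = parents[current]
    chain.reverse()  # root first, requested environment last
    return chain
```

With this mapping, `resolve_env_chain("local", BASE_ENV)` gives `["base", "dev", "local"]`. Loading the folders in that order, with later ones taking precedence, would give the inheritance described above. A plain string for base_env would remain the special case where every environment has the same single parent.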

Possible Alternatives

Something might be achieved through the advanced features of the OmegaConfigLoader that I am not aware of.

Metadata

Labels: Issue: Feature Request (new feature or improvement to existing feature)