
Feature: SimulationRegistry, generic executors, and parameter_schema() #106

@profsergiocosta

Summary

The DisSModel Streamlit explorers (ca_all.py, run_all_sysdyn.py) demonstrate a powerful pattern: discover all concrete subclasses of CellularAutomaton or Model in a package, auto-generate a parameter form from annotated attributes via display_inputs, and run any model from a single interface with zero per-model boilerplate.

This issue ports that pattern into the dissmodel core so it is available both locally (Streamlit, Jupyter, CLI) and on the platform worker. Three things need to land together:

  1. SimulationRegistry: an in-memory index of concrete model classes, populated automatically via __init_subclass__ when a package is imported.
  2. Generic executors: GenericCAExecutor and GenericSysDynExecutor, which adapt any registered model class to the ModelExecutor contract without requiring a dedicated executor subclass per model.
  3. parameter_schema(): a utility function that extracts annotated attributes from any registered class as a plain dict, serving as the programmatic equivalent of display_inputs.

Note: SimulationRegistry is an in-memory, process-lifetime construct used by the worker during job execution. It is distinct from the TOML-based model catalogue in dissmodel-configs (the editorial source of truth) and from ExecutorRegistry (which indexes ModelExecutor subclasses). Each registry has orthogonal responsibilities:

| Registry | What it indexes | Where it lives | Lifetime |
| --- | --- | --- | --- |
| TOML model catalogue | Executable configurations + schemas | dissmodel-configs (git) | Persistent |
| ExecutorRegistry | ModelExecutor subclasses | dissmodel.executor | Process lifetime |
| SimulationRegistry | CellularAutomaton / Model subclasses | dissmodel.executor | Process lifetime |

1. SimulationRegistry

A new class in dissmodel/executor/simulation_registry.py. It is populated at runtime in the worker subprocess when _import_executor_package imports a package and __init_subclass__ fires for each concrete model class found.

# dissmodel/executor/simulation_registry.py

class SimulationRegistry:
    """
    In-memory index of concrete CellularAutomaton and system dynamics Model
    subclasses. Populated automatically via __init_subclass__ when a package
    is imported; no manual registration required.

    Used exclusively by the platform worker (job_runner) and the CLI to
    resolve model classes by name at execution time. The API and Streamlit
    service never interact with this registry directly; they read model
    metadata from the TOML catalogue instead.
    """

    # Class-level dicts: shared by the whole process, keyed by class name.
    _ca:     dict[str, type] = {}
    _sysdyn: dict[str, type] = {}

    @classmethod
    def register_ca(cls, model_cls: type) -> None:
        cls._ca[model_cls.__name__] = model_cls

    @classmethod
    def register_sysdyn(cls, model_cls: type) -> None:
        cls._sysdyn[model_cls.__name__] = model_cls

    @classmethod
    def get_ca(cls, name: str) -> type:
        if name not in cls._ca:
            raise KeyError(
                f"CA model '{name}' not registered. "
                f"Available: {sorted(cls._ca)}"
            )
        return cls._ca[name]

    @classmethod
    def get_sysdyn(cls, name: str) -> type:
        if name not in cls._sysdyn:
            raise KeyError(
                f"SysDyn model '{name}' not registered. "
                f"Available: {sorted(cls._sysdyn)}"
            )
        return cls._sysdyn[name]

    @classmethod
    def list_ca(cls) -> list[str]:
        return sorted(cls._ca)

    @classmethod
    def list_sysdyn(cls) -> list[str]:
        return sorted(cls._sysdyn)

Registration is triggered automatically in the base classes via lazy local imports to avoid circular dependencies at module load time:

# dissmodel/geo/_ca.py

class CellularAutomaton(ABC):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        import inspect
        if not inspect.isabstract(cls):
            # Lazy import: avoids a circular dependency at module load time.
            from dissmodel.executor.simulation_registry import SimulationRegistry
            SimulationRegistry.register_ca(cls)

# dissmodel/core/_model.py

class Model(ABC):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        import inspect
        if not inspect.isabstract(cls):
            # Lazy import: avoids a circular dependency at module load time.
            from dissmodel.executor.simulation_registry import SimulationRegistry
            SimulationRegistry.register_sysdyn(cls)
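
The hook can be exercised in isolation. The following is a minimal, self-contained sketch rather than the dissmodel implementation: DemoRegistry, DemoBase, PartialModel and ForestFire are hypothetical stand-ins, and the registry lives in the same module, so no lazy import is needed. It illustrates that inspect.isabstract can be called from inside __init_subclass__ (Python 3.7+), so an abstract intermediate class is skipped and only the concrete subclass is indexed.

```python
import inspect
from abc import ABC, abstractmethod


class DemoRegistry:
    """Hypothetical stand-in for SimulationRegistry (CA side only)."""

    _ca: dict[str, type] = {}

    @classmethod
    def register_ca(cls, model_cls: type) -> None:
        cls._ca[model_cls.__name__] = model_cls

    @classmethod
    def list_ca(cls) -> list[str]:
        return sorted(cls._ca)


class DemoBase(ABC):
    """Hypothetical stand-in for CellularAutomaton."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Index only classes with no unimplemented abstract methods.
        if not inspect.isabstract(cls):
            DemoRegistry.register_ca(cls)

    @abstractmethod
    def rule(self, cell, neighbours): ...


class PartialModel(DemoBase):
    """Still abstract (rule() is missing), so it is not registered."""


class ForestFire(PartialModel):
    def rule(self, cell, neighbours):
        return cell


print(DemoRegistry.list_ca())  # ['ForestFire']
```

Because the registry is keyed by the bare class name, two packages that define a class with the same name would overwrite each other in a sketch like this one; whether that matters for dissmodel depends on how model names are assigned.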

Convention for external packages. A package whose models should be discoverable via SimulationRegistry must import its model classes in __init__.py so that __init_subclass__ fires when the package is imported by _import_executor_package:

# my_ca_models/__init__.py
from my_ca_models.models import ForestFireModel, ConwayVariant

This convention must be documented in the contributor guide.


2. Generic executors

Two new concrete ModelExecutor subclasses in dissmodel/executor/generic.py. They receive the target model class name via record.parameters["model_class"] and resolve it from SimulationRegistry at validate() time — before any data is loaded — so misconfigured jobs fail fast with a clear error.

GenericCAExecutor

class GenericCAExecutor(ModelExecutor):
    """
    Platform adapter for any CellularAutomaton subclass registered in
    SimulationRegistry. No per-model executor class required.

    The model class is resolved from SimulationRegistry at validate() time,
    which means the package containing the model must have been imported
    (via _import_executor_package in job_runner) before validate() is called.

    Required parameters (in ExperimentRecord.parameters):
        model_class  (str):  name of the CellularAutomaton subclass

    Optional parameters:
        steps              (int):   simulation length, default 50
        grid_size          (int):   NxN grid side length, default 20
        model_params       (dict):  annotated attributes to inject before initialize()
        resolution         (float): cell size in map units, default 1.0
        initial_state_attr (str):   grid attribute name, default "state"
    """

    name = "generic_ca"

    def validate(self, record: ExperimentRecord) -> None:
        if "model_class" not in record.parameters:
            raise ValueError(
                "parameters.model_class is required for generic_ca. "
                "Pass the name of a registered CellularAutomaton subclass."
            )
        from dissmodel.executor.simulation_registry import SimulationRegistry
        from dissmodel.executor.schema_utils        import parameter_schema
        # get_ca raises KeyError if the class was never registered.
        cls     = SimulationRegistry.get_ca(record.parameters["model_class"])
        schema  = parameter_schema(cls)
        unknown = set(record.parameters.get("model_params", {})) - set(schema)
        if unknown:
            raise ValueError(
                f"model_params contains unknown attributes for "
                f"{record.parameters['model_class']}: {sorted(unknown)}. "
                f"Known attributes: {sorted(schema)}"
            )

    def load(self, record: ExperimentRecord):
        from dissmodel.geo import vector_grid
        params    = record.parameters
        grid_size = params.get("grid_size", 20)
        gdf = vector_grid(
            dimension=(grid_size, grid_size),
            resolution=params.get("resolution", 1.0),
            attrs={params.get("initial_state_attr", "state"): 0},
        )
        record.add_log(f"Grid created: {grid_size}×{grid_size}")
        return gdf

    def run(self, data, record: ExperimentRecord):
        from dissmodel.core import Environment
        from dissmodel.executor.simulation_registry import SimulationRegistry

        params     = record.parameters
        gdf        = data
        grid_size  = params.get("grid_size", 20)
        steps      = params.get("steps", 50)
        ModelClass = SimulationRegistry.get_ca(params["model_class"])

        env   = Environment(start_time=0, end_time=steps)
        model = ModelClass(gdf=gdf, dim=grid_size, start_time=0, end_time=steps)

        # Override class-level defaults on this instance only.
        for attr, value in params.get("model_params", {}).items():
            setattr(model, attr, value)

        model.initialize()
        record.add_log(f"Running {params['model_class']} for {steps} steps...")
        env.run()
        record.add_log("Simulation complete")
        return gdf

    def save(self, result, record: ExperimentRecord) -> ExperimentRecord:
        # Serialize GeoDataFrame via dissmodel.io (GeoJSON or GeoParquet)
        ...

GenericSysDynExecutor follows the same structure with load returning None.


3. parameter_schema() — programmatic display_inputs

A utility function in dissmodel/executor/schema_utils.py that extracts annotated attributes from a class as a plain dict. This is the data layer behind both the local Streamlit explorers and the TOML [schema.model_params] block that the platform API serves.

# dissmodel/executor/schema_utils.py

import inspect
from typing import Any

def parameter_schema(cls: type) -> dict[str, dict[str, Any]]:
    """
    Extract annotated attributes from a class as a parameter schema dict.

    Returns a mapping of attribute name → {type, default}. Private
    attributes (leading underscore) are excluded.

    This is the programmatic equivalent of display_inputs() for local use
    (Streamlit, Jupyter, CLI). On the platform, the same information is
    declared statically in the TOML [schema.model_params] block so the API
    can serve it without importing the model package.

    Example
    -------
    >>> class MyModel(Model):
    ...     birth_rate: float = 0.03
    ...     death_rate: float = 0.01
    >>> parameter_schema(MyModel)
    {'birth_rate': {'type': 'float', 'default': 0.03},
     'death_rate': {'type': 'float', 'default': 0.01}}
    """
    # inspect.get_annotations requires Python 3.10+ and returns only the
    # annotations declared on cls itself, not those of its base classes.
    hints = {
        k: v
        for k, v in inspect.get_annotations(cls, eval_str=True).items()
        if not k.startswith("_")
    }
    return {
        name: {
            "type":    typ.__name__ if hasattr(typ, "__name__") else str(typ),
            "default": getattr(cls, name, None),
        }
        for name, typ in hints.items()
    }

The local Streamlit explorers can replace inspect.getmembers + display_inputs with SimulationRegistry + parameter_schema:

# Before
model_classes = {
    name: cls
    for name, cls in inspect.getmembers(ca_models, inspect.isclass)
    if issubclass(cls, CellularAutomaton) and not inspect.isabstract(cls)
}

# After: works with any imported package, not just a fixed local module
from dissmodel.executor.simulation_registry import SimulationRegistry
from dissmodel.executor.schema_utils import parameter_schema

model_name = st.sidebar.selectbox("Model", SimulationRegistry.list_ca())
schema = parameter_schema(SimulationRegistry.get_ca(model_name))

# render widgets from schema...
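
The same schema dict could drive a CLI front end as well as Streamlit widgets. The sketch below is one way that might look, using only argparse from the standard library; build_parser and the _CASTS table are hypothetical, and the schema literal is copied from the ForestFireModel example rather than produced by a real parameter_schema call.

```python
import argparse

# Schema in the shape returned by parameter_schema (hard-coded for the sketch).
schema = {
    "ignition_prob": {"type": "float", "default": 0.001},
    "spread_prob": {"type": "float", "default": 0.3},
}

# Type names to converters. bool is left out on purpose: argparse's
# type=bool treats any non-empty string as True and needs separate handling.
_CASTS = {"int": int, "float": float, "str": str}


def build_parser(schema: dict) -> argparse.ArgumentParser:
    """Create one --option per schema entry, typed and defaulted from it."""
    parser = argparse.ArgumentParser(description="Model parameters")
    for name, spec in schema.items():
        parser.add_argument(
            f"--{name.replace('_', '-')}",
            dest=name,
            type=_CASTS.get(spec["type"], str),
            default=spec["default"],
        )
    return parser


args = build_parser(schema).parse_args(["--spread-prob", "0.5"])
model_params = vars(args)
print(model_params)  # {'ignition_prob': 0.001, 'spread_prob': 0.5}
```

The resulting model_params dict has the same shape as the model_params entry that the generic executors read from record.parameters.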


How a contributor ships a new model

A contributor only writes the science — no executor class required:

# my_ca_models/models.py

from dissmodel.geo import CellularAutomaton

class ForestFireModel(CellularAutomaton):
    ignition_prob: float = 0.001
    spread_prob: float = 0.3

    def initialize(self): ...
    def rule(self, cell, neighbours): ...

# my_ca_models/__init__.py  ← required convention
from my_ca_models.models import ForestFireModel

The contributor then opens a PR to dissmodel-configs with the TOML entry, including the schema block declared manually (see platform issue for TOML format):

[model]
class   = "GenericCAExecutor"
package = "my-ca-models>=1.0.0"

[parameters]
model_class = "ForestFireModel"
grid_size = 40
steps = 100

[schema.model_params]
ignition_prob = { type = "float", default = 0.001 }
spread_prob = { type = "float", default = 0.3 }

The [schema.model_params] block is the static declaration that the platform API serves to clients. It must be kept in sync with the annotated attributes in the Python class — this is the contributor's responsibility and is enforced via PR review in dissmodel-configs.


Checklist

  • Create dissmodel/executor/simulation_registry.py with SimulationRegistry
  • Add __init_subclass__ hook in CellularAutomaton to auto-register
  • Add __init_subclass__ hook in system dynamics Model base to auto-register
  • Create dissmodel/executor/schema_utils.py with parameter_schema()
  • Create dissmodel/executor/generic.py with GenericCAExecutor and GenericSysDynExecutor
  • Implement save() in both generic executors using dissmodel.io conventions
  • Export SimulationRegistry, parameter_schema, GenericCAExecutor, GenericSysDynExecutor from dissmodel.executor
  • Add unit tests:
    • SimulationRegistry is populated when a concrete subclass is defined
    • get_ca raises KeyError for unknown names
    • parameter_schema returns correct types and defaults
    • validate() raises ValueError for unknown model_params keys
    • GenericCAExecutor runs end-to-end with a fixture model via ExecutorTestHarness
  • Update Streamlit explorer examples to use SimulationRegistry + parameter_schema
  • Document the __init__.py export convention for external packages in the contributor guide
  • Document both authoring paths in the contributor guide:
    • Full ModelExecutor subclass — for models with complex I/O (e.g. CoastalRasterExecutor)
    • Plain CellularAutomaton/Model subclass + generic executor — for simulation-only models

Labels

feature executor simulation-registry
