
Migrates bead to didactic from pydantic v2 #3

Merged
aaronstevenwhite merged 46 commits into main from refactor/didactic-models
May 6, 2026
Conversation

@aaronstevenwhite
Collaborator

Description

Migrates the entire bead package off Pydantic v2 onto didactic (PyO3-backed typed-data layer over panproto):

- Every BaseModel becomes a frozen dx.Model; every @dataclass either becomes a dx.Model or stays a plain class for callable-bearing internals.
- Mutating add/remove methods (ExperimentList.add_item, Lexicon.add, etc.) are rewritten as pure .with_(...) methods returning a new instance, with every call site rebound.
- Discriminated unions (ListConstraint, BatchConstraint, ASTNode) become dx.TaggedUnion-rooted hierarchies discriminated by an explicit kind/constraint_type literal.
- Cross-field invariants move into __axioms__ = (dx.axiom("..."),) panproto expressions; per-field normalization stays in @dx.validates.
- Collection fields drop list[T] for tuple[T, ...] and use dx.Embed[T] for nested-Model arms (covering all of items/spans, lists/constraints, resources/template, config/*, deployment/jspsych, simulation, behavioral, transforms).
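The core shape of the migration — frozen models with pure `.with_(...)`-style updates and tuple-typed collections — can be sketched with stdlib frozen dataclasses. This is an illustrative analogy, not didactic's actual API; the class and field names are hypothetical stand-ins:

```python
from dataclasses import dataclass, replace

# Stand-in for a frozen dx.Model: a frozen dataclass with tuple fields.
@dataclass(frozen=True)
class LexicalItem:
    lemma: str
    pos: str

@dataclass(frozen=True)
class Lexicon:
    items: tuple[LexicalItem, ...] = ()

    def with_item(self, item: LexicalItem) -> "Lexicon":
        # Pure update: return a new instance; never mutate self.
        return replace(self, items=self.items + (item,))

lex = Lexicon()
lex2 = lex.with_item(LexicalItem(lemma="run", pos="VERB"))
assert lex.items == ()  # the original instance is untouched
assert len(lex2.items) == 1
```

Every call site that previously did `lexicon.add(item)` in place now rebinds: `lexicon = lexicon.with_item(item)`.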

The transforms subsystem committed alongside (bead/transforms/) is included on this branch as well, restructured to use the new dx.Model-based TransformContext and a registry-based TransformRegistry.

The CHANGELOG is updated under [Unreleased] with the breaking-change summary.

Motivation

Pydantic v2 was no longer pulling its weight: cross-field validation forced post-__init__ mutation, the Pydantic Field(discriminator=...) interaction with generics was fragile, and bead's own metadata model leaked the dict-of-pydantic-models problem (Lexicon.items as dict[UUID, LexicalItem] made round-trips, copy semantics, and lens-based migrations awkward). didactic's frozen-Model + axiom expression language resolves each of these.

This is a pre-1.0 cleanup — no backward-compat shims, no didactic-pydantic adapter. The CHANGELOG records the breaking changes.

Fixes #

Type of Change

  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring (no functional changes)
  • Tests (adding or updating tests)
  • Documentation update

Checklist

  • I have read the CONTRIBUTING guidelines
  • My code follows the project's style guidelines
  • I have run uv run ruff check . and uv run ruff format .
  • I have run uv run pyright with no errors
  • I have added tests that prove my fix/feature works
  • All tests pass (uv run pytest tests/)
  • I have updated documentation as needed

Testing

  • uv run pytest tests/ — 2946 passed, 1 skipped (the skip is tests/items/test_span_labeling.py::test_default_config, gated on pytest.importorskip("spacy") for the optional spaCy backend; the rest of the span_labeling suite runs against the whitespace tokenizer).
  • uv run ruff check bead tests — All checks passed!
  • uv run pyright bead — 0 errors, 0 warnings, 0 informations.
  • uv run mkdocs build — succeeds (5 pre-existing griffe docstring/signature mismatches in bead/resources/constraint_builders.py, unrelated to this PR).
  • No # type: ignore, # noqa, or per-file pragmas introduced by the migration. The one project-level pyright config tweak is for Range[T] (didactic generic models cannot use PEP 695 syntax with Embed[T] field targets, so typing.Generic[T] is retained and bead/data/range.py carries UP046 in [tool.ruff.lint.per-file-ignores] with an inline rationale).

Upstream issues filed and fixed during this work

The migration surfaced several didactic gaps; each was filed, fixed upstream, and verified locally before this branch finished. This PR pins didactic>=0.6.2, which carries all of those fixes.

Screenshots (if applicable)

n/a — backend / library refactor.

Introduces a value-level transform system with TransformContext,
TransformRegistry, morphology, and text utilities. Wires it into
jsPsych prompt rendering via the [[label|transform]] reference
syntax, covered by 69 tests.
Replaces pydantic with didactic>=0.3.2 and panproto>=0.43 in
project dependencies, raises requires-python to >=3.14, and
updates ruff target-version, pyright pythonVersion, and the
.python-version pin. Subsequent commits convert the model
layer file-by-file.
Converts BeadBaseModel and the bead/data/* module to didactic 0.4.0:
frozen by default with .with_(...) updates, dx.field(default_factory=...)
for defaults, dx.axiom for cross-field invariants on Range[T], dx.Embed
for nested-Model fields in MetadataTracker, and dx.ValidationError
replacing pydantic's. ValidationReport's mutating add_error / add_warning
become pure methods returning new instances. UUID-reference walking now
introspects __field_specs__ rather than typing.get_type_hints (which
trips on TYPE_CHECKING-only didactic internals). All 108 tests under
tests/data/ pass; pyproject pins didactic>=0.4.0.
Converts TokenizerConfig, DisplayToken, and TokenizedText to dx.Model.
TokenizedText.tokens becomes a tuple[dx.Embed[DisplayToken], ...] and
the helper properties (token_texts, space_after_flags) return tuples.
Tokenizer __call__ implementations build tuples directly. Tests updated
to use dx.ValidationError and tuple equality.
Converts Span, SpanSegment, SpanLabel, SpanRelation, SpanSpec,
UnfilledSlot, ModelOutput, Item, ItemCollection, and the item-template
hierarchy (ChunkingSpec, TimingParams, TaskSpec, PresentationSpec,
ItemElement, ItemTemplate, ItemTemplateCollection) to didactic Models.

Notes:
- list[T] fields become tuple[T, ...]; nested-Model fields wrap with
  dx.Embed[T]; mutating add_item/add_template/add_model_output methods
  become pure with_item/with_template/with_model_output returning new
  instances.
- TaskSpec.scale_bounds is now a ScaleBounds(min: int, max: int) Model;
  didactic does not support heterogeneous tuples (panproto/didactic#15
  background).
- TaskSpec.scale_labels keys typed as str; didactic dict keys must be
  str (so int-keyed scale point labels stringify).
- Item.constraint_satisfaction keys typed as str (UUID stringified) for
  the same reason.
- @field_validator/@model_validator hooks port to @dx.validates;
  Item's cross-field span_relations integrity check is dropped here
  pending a panproto-axiom-friendly formulation.
- Utility modules under bead/items/ that don't define Pydantic models
  (binary, cache, categorical, cloze, constructor, forced_choice,
  free_text, generation, magnitude, multi_select, ordinal_scale,
  scoring, validation, span_labeling) work against the new Models
  unchanged at the source level. Their tests need follow-up edits to
  switch list[] literals -> tuples in fixtures (blocked on
  panproto/didactic#15 for ergonomic cleanup).
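The ScaleBounds change above can be sketched with a frozen stdlib dataclass standing in for the dx.Model (the min/max field names come from the note; the __post_init__ check is a stand-in for a didactic axiom, not the real implementation):

```python
from dataclasses import dataclass

# Sketch of the ScaleBounds Model that replaces the old (min, max)
# heterogeneous tuple; didactic does not support heterogeneous tuples
# as field types (panproto/didactic#15).
@dataclass(frozen=True)
class ScaleBounds:
    min: int
    max: int

    def __post_init__(self) -> None:
        # Stand-in for a construction-time axiom: bounds must be ordered.
        if self.min >= self.max:
            raise ValueError(f"min ({self.min}) must be < max ({self.max})")

bounds = ScaleBounds(min=1, max=7)
assert (bounds.min, bounds.max) == (1, 7)
```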
Converts Constraint, LexicalItem, MWEComponent, MultiWordExpression,
Lexicon, Slot, Template, TemplateSequence, TemplateTree,
TemplateCollection, LexicalItemClass, and TemplateClass to dx.Model.

Notes:
- Constraint.context value type is now JsonValue; the DSL evaluator
  coerces lists into sets when used as membership tests, so call sites
  pass [...] in place of {...}.
- Lexicon, TemplateCollection, LexicalItemClass, TemplateClass now
  store members as tuple[Embed[T], ...]; lookup by id is exposed via
  by_id() and __contains__.
- Mutating add/remove methods become pure with_item / without_item /
  with_template / without_template returning new instances (frozen
  Models).
- Template's @model_validator(mode="after") integrity check that
  template_string and slots agree is extracted to a free function
  slots_match_template() which callers invoke explicitly.
- Pin bumps to didactic>=0.4.3 (issues #11, #15, #17, #18 fixed).
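The extracted slots_match_template() check can be illustrated as follows. This is a hypothetical sketch: it assumes {slot_name}-style placeholders, which may not match bead's actual template syntax:

```python
import re

# Hypothetical sketch of the free-function integrity check that replaced
# Template's @model_validator(mode="after"): the placeholders appearing
# in template_string must exactly match the declared slot names.
def slots_match_template(template_string: str, slot_names: set[str]) -> bool:
    placeholders = set(re.findall(r"\{(\w+)\}", template_string))
    return placeholders == slot_names

assert slots_match_template("{subj} sees {obj}", {"subj", "obj"})
assert not slots_match_template("{subj} sees {obj}", {"subj"})
```

Because the check is no longer run at construction time, callers that want it must invoke it explicitly after building a Template.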
Converts the list-level and batch-level constraint hierarchies and the
ExperimentList / ListCollection containers to didactic 0.4.3.

Notes:
- ListConstraint and BatchConstraint are dx.TaggedUnion roots with
  constraint_type as the discriminator. Variants subclass them and
  pin constraint_type to a typing.Literal value. Multiple inheritance
  (BeadBaseModel + dx.TaggedUnion) carries over the BeadBaseModel
  identity and timestamp fields.
- OrderingConstraint.precedence_pairs becomes tuple[OrderingPair, ...]
  where OrderingPair is a small (before, after) Model; didactic does
  not support heterogeneous tuples as field types.
- ExperimentList.constraint_satisfaction becomes a tuple[ConstraintSatisfaction, ...]
  of records keyed on the embedded UUID.
- @model_validator(mode="after") cross-field checks (size_params,
  distance_constraints, presentation_order matches item_refs) extract
  to free functions: validate_size_constraint,
  validate_ordering_constraint, validate_presentation_order.
- Mutating add_item / remove_item / shuffle_order on ExperimentList
  and add_list on ListCollection become pure with_item / without_item
  / with_shuffled_order / with_list returning new instances.
Converts Participant, ParticipantIDMapping, FieldSpec,
ParticipantMetadataSpec, ParticipantCollection, and
IDMappingCollection to dx.Model.

Notes:
- FieldSpec.range collapses to Range[float] | None; didactic does not
  yet support unions of two Embed[T] alternatives. Bound checking
  works identically because int is a numeric subtype of float for
  inclusion tests.
- @model_validator(mode="after") on FieldSpec extracts to free function
  validate_field_spec for callers that want construction-time
  cross-field checks.
- Mutating add_participant/add_session/add_mapping/deactivate methods
  become pure with_participant / with_session / with_mapping /
  deactivated returning new instances or (new_instance, value) tuples.
Converts the configuration layer (BeadConfig, PathsConfig, ModelConfig,
ItemConfig, ResourceConfig, ListConfig, BatchConstraintConfig,
TemplateConfig, SlotStrategyConfig, SimulationRunnerConfig and friends,
DeploymentConfig + Slopit configs, LoggingConfig) and the
active-learning configs (BaseEncoderModelConfig and its
ForcedChoice/Categorical/Binary/MultiSelect leaf classes,
OrdinalScaleModelConfig, MagnitudeModelConfig, FreeTextModelConfig,
ClozeModelConfig, ActiveLearningLoopConfig, TrainerConfig,
UncertaintySamplerConfig, JatosDataCollectionConfig,
ProlificDataCollectionConfig, ActiveLearningConfig,
MixedEffectsConfig + RandomEffectsSpec + VarianceComponents) to
dx.Model.

Notes:
- @model_validator(mode="after") cross-field checks on
  MagnitudeModelConfig (bounded vs distribution),
  TemplateConfig (MLM / mixed-strategy consistency),
  BatchConstraintConfig (per-type required fields), and
  SlopitIntegrationConfig (bundle existence) extract to free
  validate_* functions.
- ModelMetadata.training_config switches from
  dict[str, str | int | float | bool | Path | None] to
  dict[str, JsonValue]; Path values must be stringified before storage.
- ActiveLearningConfig pulls in MixedEffectsConfig from
  bead.active_learning.config; that file now uses dx.Model directly.

Note: tests plus a few remaining fields that depend on Path as a first-
class field type (PathsConfig.data_dir, LoggingConfig.file,
TrainerConfig.logging_dir, TemplateConfig.mlm_cache_dir,
ModelMetadata.training_data_path, etc.) still reference Path; their
conversion resumes cleanly once panproto/didactic#21 lands.
…ioral, and finishes bead.config

Converts the remaining model layers blocked or partly converted in the
previous batch:

- bead.dsl.ast: ASTNode is now a dx.TaggedUnion with kind discriminator;
  Literal, Variable, BinaryOp, UnaryOp, FunctionCall, ListLiteral,
  AttributeAccess, Subscript variants. Construction round-trips work;
  JSON round-trip blocked on panproto/didactic#22 (nested TaggedUnion
  field decoding).
- bead.deployment.distribution: QuotaConfig, WeightedRandomConfig,
  LatinSquareConfig, MetadataBasedConfig, StratifiedConfig,
  ListDistributionStrategy. DistributionStrategyType (StrEnum) stays;
  the field type uses a parallel DistributionStrategyName Literal alias
  pending panproto/didactic#23 enum support.
- bead.deployment.jspsych.config: SpanDisplayConfig,
  DemographicsFieldConfig, DemographicsConfig, InstructionPage,
  InstructionsConfig, ExperimentConfig, RatingScaleConfig, ChoiceConfig.
  ExperimentConfig.instructions now requires InstructionsConfig
  directly; callers wanting a string pass through
  InstructionsConfig.from_text(...).
- bead.templates.filler.FilledTemplate: slot_fillers becomes
  dict[str, Embed[LexicalItem]]; unfilled_slots / unfilled_required_slots
  return frozensets.
- bead.behavioral.analytics: JudgmentAnalytics,
  ParticipantBehavioralSummary, AnalyticsCollection; slopit metric
  classes appear as dict[str, JsonValue] payloads since they are
  upstream Pydantic Models.
- Path-typed configuration fields (PathsConfig, ResourceConfig,
  LoggingConfig, TrainerConfig, TemplateConfig.mlm_cache_dir,
  ModelMetadata.training_data_path / eval_data_path / best_checkpoint)
  are stored as str pending panproto/didactic#21 (Path as a first-class
  field type). Callers wrap with pathlib.Path on access; profiles.py
  stringifies Path arguments at construction.
- bead.config.profiles.get_profile drops .model_copy(deep=True);
  didactic Models are frozen so the registry instance can be shared
  directly.
…mports

Converts the inline RunLoopConfig / RunModelConfig / RunSelectionConfig /
RunDataConfig / ActiveLearningRunConfig models inside
bead/cli/active_learning_commands.py to dx.Model. Path-typed data fields
become str pending panproto/didactic#21 (callers pass strings to the
loader and wrap with pathlib.Path on access).

Bulk-replaces the from-pydantic ValidationError import with
from-didactic in bead/cli/{config,deployment,lists,items,resources,
resource_loaders,templates}.py so CLI try/except blocks catch the
correct exception class.

bead/deployment/jatos/exporter.JATOSExporter requires no field changes
(only str fields); the Path-typed parameters in its export method are
not Model fields.
Converts bead.transforms.base.TransformContext to a frozen dx.Model
(language_code, lemma, pos, head_index, tokens as tuple[str, ...],
metadata as dict[str, JsonValue]).

InflectionSpec and MorphologicalTransform stay as plain dataclasses /
classes; they hold non-serialisable state (a feature-matching callable
predicate, a lazy UniMorph adapter, an instance cache) that does not
fit didactic's serialisable-Model semantics.

TransformPipeline and TransformRegistry stay as plain Python classes
holding callables.

Tests updated for tuple-shaped TransformContext.tokens.
- bead/data/repository.py: tighten generic bound to BeadBaseModel so the
  id field is visible to the type checker; remove attr-defined ignores.
- bead/data/validation.py: replace bare Exception catch with
  (ValueError, TypeError); switch reference_pool / repository params to
  Mapping for covariance; coerce iterated container to tuple[object,
  ...] so iter has a known element type.
- bead/participants/collection.py, bead/resources/lexicon.py: stop
  declaring a typed rows variable that pandas/polars can't satisfy;
  build the dict[str, JsonValue] explicitly via comprehension instead.
- bead/resources/{lexicon,classification,template_collection}.py: drop
  the override ignores on __iter__; dx.Model has no __iter__ so the
  override is no longer overriding anything.
- bead/participants/metadata_spec.py: cast int -> float at the
  Range[float].contains call boundary.
- bead/transforms/base.py: make the SpanTextTransform protocol's
  __call__ positional-only so callable literals don't need parameter
  names matching "text"/"context".
- tests: replace `obj.value = 99 # type: ignore[misc]` with
  setattr(obj, "value", 99); replace `TokenizerConfig(backend="unknown")
  # type: ignore[arg-type]` with `model_validate({"backend": ...})`;
  same for ctx.lemma assignment and the SampleModel extra-field test.
- bead/config/defaults.py: drop the dead runtime isinstance/issubclass
  guard on `get_default_for_model[T]` (the T: dx.Model bound makes the
  check unreachable).
Updates bead/dsl/parser.py so each Lark-rule transformer passes the
appropriate ``kind`` discriminator when constructing AST node variants
(``Literal``, ``Variable``, ``BinaryOp``, ``UnaryOp``, ``FunctionCall``,
``ListLiteral``, ``AttributeAccess``, ``Subscript``). Adds runtime type
narrowing on the ``items: list[Token | ast.ASTNode]`` parameters where
the rule guarantees an AST node, replacing the implicit assumption with
explicit ``isinstance`` guards that pyright sees through.

Recursive AST construction is currently blocked downstream on
panproto/didactic#24 (TaggedUnion field encoder snapshots variants at
class-creation time, missing variants registered later in the same
module).
…0 ships native support

Bumps didactic floor to 0.5.0 and reverts the temporary str shims:
PathsConfig, ResourceConfig, LoggingConfig.file, TrainerConfig.logging_dir,
TemplateConfig.mlm_cache_dir, SimulationRunnerConfig.save_path, and
ModelMetadata.{training_data_path,eval_data_path,best_checkpoint} go back
to pathlib.Path. profiles.py and config.validate_paths drop their
str/Path bridging.

ListDistributionStrategy.strategy_type goes back to the
DistributionStrategyType StrEnum directly (no parallel Literal alias).

Closes panproto/didactic#21, #22, #23, #24.
Bulk-injects constraint_type="..." arguments into every constraint
constructor call across tests/lists/. Fixes Range to use the documented
__axioms__ tuple syntax (the previous _ordered = dx.axiom(...) attribute
form was silently ignored).
…eConstraint, OrderingConstraint, and the rest

Now that didactic 0.5.1 ships null-aware comparison in axiom expressions
(panproto/didactic#26), all the cross-field rules previously extracted
to validate_size_constraint / validate_ordering_constraint move back to
__axioms__ tuples on the model, restoring construction-time validation.

Adds bound axioms for tolerance, n_quantiles, items_per_quantile,
priority, min_unique_values, min_distance, max_distance, min_size,
max_size, exact_size across UniquenessConstraint, BalanceConstraint,
QuantileConstraint, GroupedQuantileConstraint, DiversityConstraint,
SizeConstraint, OrderingConstraint, and ExperimentList.

Updates assertions in tests/lists/test_constraints to match the new
human-readable axiom messages.
Updates tests for TemplateCollection, Lexicon, TemplateClass, and
LexicalItemClass to use the frozen-Model conventions: with_template /
with_item rebinds, without_* tuple unpacking, by_id lookup, and tuple
field access. Switches from_jsonl loaders to model_validate_json so
UUID-typed fields parse correctly from raw JSON lines.
…validators

BatchCoverageConstraint, BatchBalanceConstraint, BatchDiversityConstraint,
and BatchMinOccurrenceConstraint now declare numeric-range axioms that
previously lived in the pydantic Field constraints. The LanguageCode
type alias is now plain str | None, with each model that uses it
attaching a @dx.validates("language_code") validator that delegates
to validate_iso639_code so values normalize to ISO 639-3 at construction.
Replaces model_dump(mode="json") (no longer accepted) with json.loads
of model_dump_json() to coerce UUIDs to strings inside MetadataValue
dicts. Rewrites add_spans_to_item and tokenize_item to construct new
items via .with_(...) instead of round-tripping through model_dump,
which preserves Span and other Embed-typed values without UUID
serialization issues. Updates validate_constraint_satisfaction and
item_passes_all_constraints to iterate the new tuple[ConstraintSatisfaction, ...]
shape. Loosens categorical task-type validation to accept tuples.
Updates the corresponding tests to use .with_(...) rebinds and
ConstraintSatisfaction record construction.
Replaces dict-API and pydantic conventions in tests with the new
didactic equivalents: tuple field equality, tuple sets in constraint
context (sets are not JSON-serializable), with_(...) for mutation,
ScaleBounds for TaskSpec scale, model_validate_json round-trips, and
slots_match_template called explicitly. Source updates:
SetMembershipConstraintBuilder and AgreementConstraintBuilder now emit
sorted tuples (not sets) for context values; ItemConstructor builds
ConstraintSatisfaction records; loaders.py uses with_item rebinds; the
glazing FrameNet adapter str()-coerces AnnotatedText so features stay
JSON-serializable.
…d fields

bead/templates: replaces lexicon.items.values() with tuple iteration
in strategies.py, streaming.py, and filler.py; removes unused
constraint.compiled fast path (Constraint is no longer a pydantic
model with computed fields). Tests rewrite lexicon.add(...) to
with_item(...) rebinds, set-shaped context to sorted tuples, and
items={dict} construction to items=tuple(...). validate_template_config
is now invoked explicitly rather than at __init__ time.
Source updates: ModelConfig, NoiseModelConfig, SimulationRunnerConfig,
TemplateConfig, and BatchCoverageConstraint declare numeric-range
axioms in place of pydantic Field constraints; load_config and
config_to_dict route through model_dump_json so UUID/Path values
become JSON-shape; get_default_config and get_profile return fresh
.with_() copies so callers never alias DEFAULT_CONFIG; LanguageCode
and DeploymentConfig.jspsych_version permit None;
ListDistributionStrategy.strategy_type defaults to BALANCED so YAML
round-trips work without re-supplying the field; get_default_for_model
raises TypeError for non-Model arguments; constructor builds
ConstraintSatisfaction records via tuple comprehension.

Test updates: every config.X.Y[.Z] = V mutation rewritten as nested
.with_(...) calls; BeadConfig(**model_dump()) round-trips replaced
with model_validate_json; tag/list field equalities updated to tuple
form; expected pydantic-era error fragments ("Input should be ...")
updated to didactic's ("is not in Literal ...") wording.
CLI updates: list-constraints commands now pass constraint_type tags
required by the discriminated union; resources and resource_loaders
use lexicon.with_item / collection.with_template rebinds; constraint
file IO uses model_validate_json for line-by-line round-trips that
preserve UUIDs as strings; SetMembershipConstraintBuilder coerces
sets to sorted tuples for JSON-shaped contexts; deployment generate
now constructs InstructionsConfig.from_text and ScaleBounds wrappers
where the args were previously str/tuple shorthands.

Deployment trials: serialization expands tuple[Embed[ScalePointLabel]]
into the dict shape jsPsych expects, unpacks ScaleBounds into a list,
walks tuple[ConstraintSatisfaction, ...] for trial metadata, and
defaults task_spec.options to () so the Likert renderer does not
crash on None.

Test fixtures: cli/conftest.py drops fields that no longer exist on
ResourceConfig and ModelConfig; deployment fixtures use ScaleBounds
and ScalePointLabel instances; tests/items/ and tests/cli/ tuple
assertions for item_metadata['categories'] and similar tuple-typed
fields are corrected from list to tuple shape.
bead/cli: items.py, lists.py, resources.py, templates.py, simulate.py
all replace dict-based round-trips through json.loads(...) + Model(**d)
with Model.model_validate_json(line) so UUIDs and Paths are decoded
correctly at every JSONL boundary. Templates fill command rebuilds
its merged Lexicon and constraint-augmented TemplateCollection via
.with_(...) instead of mutating tuple fields. simulate.configure now
JSON-coerces the dump before yaml/json serialization. FilledTemplate.
strategy_name gains a default so partial fixtures construct cleanly.

Tests: every json.loads + Model(**data) pair in tests/cli/ converted
to Model.model_validate_json (or Model.model_validate(item) for
dict-typed inputs); FilledTemplate fixture mutation rewritten as
.with_(id=...). conftest.py drops fields that no longer exist on
ResourceConfig/ModelConfig. test_template_generation slot-variants
test now expects success because the path is implemented.
bead/simulation: ordinal_scale, oracle, random, and systematic
unpack ScaleBounds.min/.max instead of treating it as a 2-tuple;
runner builds new SimulatedAnnotatorConfig instances via .with_(...)
rather than mutation; oracle accepts list-or-tuple ground truth.

bead/deployment/jspsych: trials.py serializes scale_bounds and
scale_labels into list/dict shapes for jsPsych and unpacks the
tuple[ConstraintSatisfaction, ...] field; randomizer.py walks
OrderingPair records (before/after) instead of unpacking 2-tuples.

bead/lists/constraints: UniquenessConstraint, BalanceConstraint,
QuantileConstraint, GroupedQuantileConstraint and their batch peers
declare priority axioms.

bead/items/ordinal_scale: scale_labels emitted as tuple[ScalePointLabel,...].
bead/items/validation imports ScaleBounds.

Test coverage: scale_bounds=(min, max) tuples replaced with
ScaleBounds(min=..., max=...); scale_labels={int: str} dicts replaced
with tuple[ScalePointLabel, ...]; OrderingConstraint(precedence_pairs=...)
calls now pass OrderingPair instances and the constraint_type
discriminator; instructions=str args wrapped in
InstructionsConfig.from_text. Field equality assertions updated for
tuple-shaped collections.
…ped scale_labels

bead/items/ordinal_scale: function signatures take ScaleBounds (with
default ScaleBounds(min=1, max=7)) and accept either dict[int, str]
or tuple[ScalePointLabel, ...] for scale_labels, normalizing tuples
to dict for downstream metadata. bead/cli/items_factories now
constructs ScaleBounds when bridging int CLI options into the
factory APIs.

Tests: tests/items/test_ordinal_scale and tests/items/test_cloze
import ScaleBounds and ScalePointLabel directly; cloze tests revert
the bulk InstructionsConfig conversion (cloze still takes plain str
for instructions). All 2905 tests pass across migrated dirs (1
optional-dep skip).
ruff fixed unused imports, import ordering, and a handful of style
issues. The Range[T] generic was reverted from PEP 695 form back to
typing.Generic[T] because didactic does not yet support PEP 695-style
parameterised Models.
- Avoids B008 by hoisting ScaleBounds default into module-level
  _DEFAULT_SCALE_BOUNDS singleton.
- Quotes the forward TransformContext annotation in jspsych/trials.py
  to keep ruff F821 happy with the lazy import.
- Adds OrderingPair to deployment conftest imports.
Each per-task-type integration test previously mutated frozen Items
inside a for loop (item.item_template_id = dummy_template.id). The
mutation never made it back into items_dict, leaving the generator
to look up unbound templates. Tests now build a fresh items_list
and items_dict via list comprehension. categorical-pipeline asserts
list | tuple for the categories metadata.
slopit-derived metric classes (KeystrokeMetrics, FocusMetrics,
TimingMetrics) are now passed to JudgmentAnalytics as their
.model_dump() dicts (the bead Model expects dict[str, JsonValue]).
AnalysisFlag is the bead-side flag (not slopit's): its analyzer/
confidence kwargs are folded into metadata. ITEM_IDS use UUID for
analytics constructors but the dataframe columns store the str
form because polars cannot infer Arrow types for UUID values.
Tuple-vs-list assertions on get_flag_types() are corrected.
api/index.md, api/deployment.md, api/workflows.md, api/templates.md,
api/items.md, api/lists.md, api/resources.md, and api/training.md
all received bulk migrations: scale_bounds tuples replaced with
ScaleBounds(min=..., max=...); scale_labels dicts replaced with
tuple[ScalePointLabel, ...]; constraint constructors pass the
constraint_type discriminator; lexicon.items.values()-style accesses
collapsed to plain tuple iteration; instructions=str wrapped in
InstructionsConfig.from_text. Examples that mutated frozen models
(item.item_template_id = ..., config.X.Y = ...) are rewritten as
.with_(...) rebinds. Each code block's missing import is added at
the top of the first code block.
didactic 0.6.0 adds @dx.model_validator and union-of-TaggedUnion-roots
support. Tests stay green on 0.6.0 with no further code changes.

Lint cleanup: pyproject's per-file-ignores grow tests/** = D + E501 to
match the project policy that test docstrings/line length are not
enforced (test names already document intent), drop the spurious lazy
import of register_morphological_transforms in bead/transforms/__init__,
move TransformContext to a top-level import in jspsych/trials.py, give
LowerTransform/UpperTransform/CapitalizeTransform/TitleTransform and
MorphologicalTransform.__repr__ short docstrings, hoist UniMorphAdapter
to a top-level import (no more PLC0415 in morphology.py), break a
handful of long lines (cli/active_learning_commands, cli/templates,
config/active_learning, config/deployment, deployment/jspsych/config
and trials, items/ordinal_scale, items/validation, simulation/
annotators/oracle), and rephrase three filter docstrings to fit 88
columns. ScalePointLabel tuples are now multi-line.

Range[T] keeps typing.Generic[T] form: the PEP 695 form does not work
with didactic Models (TypeVar fields cannot encode without explicit
parameterisation, and the Embed[Range[T]] resolver does not yet thread
through PEP 695 type-parameter machinery). The Range file gets a
per-file UP046 ignore with a comment explaining the constraint.
…tic 0.6.1

Tests stay green at 2946 (1 spaCy skip). Fixes:

- ItemConstructor._extract_call_args accepts list | tuple of ASTNodes.
- Item factories (binary, categorical, forced_choice, multi_select) write
  metadata list values as tuples so dict[str, MetadataValue] matches.
- cloze._extract_constraint_ids returns tuple[UUID, ...]; UnfilledSlot
  fixture passes constraint_ids=().
- generation.py builds the temporary lexicon directly via Lexicon(items=...)
  instead of mutating with .add(). MetadataValue alias drops list[T]
  (didactic FieldValue does not allow list, only tuple) — list-typed
  metadata callers are migrated to tuples.
- span_labeling tokenized_elements / token_space_after typed
  dict[str, tuple[...]] consistently end-to-end so _validate_span_indices,
  with_(), and the public Item field share one shape.
- bead/templates/resolver evaluate_slot_constraints accepts list | tuple
  Constraints (template constraints arrive as tuple[Embed[Constraint], ...]).
- bead/items/forced_choice and multi_select option args constructed as
  tuple, matching Item.options: tuple[str, ...].
- bead/cli/items show-stats walks tuple[ConstraintSatisfaction, ...] via
  cs.satisfied (not the old dict.values()).
- bead/lists/partitioner builds ExperimentList with tuple(constraints).
- bead/items/ordinal_scale scale_labels accepts dict[int, str] |
  tuple[ScalePointLabel, ...] | None and normalizes to dict.

Filed didactic#36 (with_() **changes: FieldValue is too tight; dict
invariance bites every nested-dict-typed field). The remaining 20
pyright errors all chain from that signature.
…iance)

- partitioner: walks BatchConstraint via runtime isinstance narrowing for
  the BatchMinOccurrenceConstraint branch, asserts constraint.priority is
  int before += in violation accumulation.
- morphology: types ._adapter as UniMorphAdapter | None, normalizes
  features to dict[str, str] before calling InflectionSpec.predicate,
  tokenizes context.tokens explicitly to list[str] for the head-finder.
- generation: introduces _coerce_to_metadata that converts list[JsonValue]
  to tuple[MetadataValue, ...] and recursively walks dicts so lexical
  feature copies obey the MetadataValue shape.

Tests stay green at 2946 (1 spaCy skip).
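The recursive coercion in _coerce_to_metadata can be sketched as below (an assumption about its shape; the real function targets bead's MetadataValue alias):

```python
# Sketch of the list -> tuple coercion: lists and tuples become tuples
# recursively, dict values are walked, scalars pass through unchanged,
# so copied lexical features obey the tuple-only MetadataValue shape.
def coerce_to_metadata(value: object) -> object:
    if isinstance(value, (list, tuple)):
        return tuple(coerce_to_metadata(v) for v in value)
    if isinstance(value, dict):
        return {k: coerce_to_metadata(v) for k, v in value.items()}
    return value

assert coerce_to_metadata([1, [2, 3], {"a": [4]}]) == (1, (2, 3), {"a": (4,)})
```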
didactic 0.6.2 lands the FieldValue Mapping[str, FieldValue] arm from
issue #36, clearing the 9 remaining with_() invariance errors. The
last source-side fixes:

- bead/data/base.py: drops list[JsonValue] from JsonValue. didactic
  FieldValue does not include a list arm (lists are mutable; the model
  layer is tuple-only), so a JsonValue with list[X] cannot live in a
  FieldValue field. The change forces tuples on every JSON-ish field
  surface and aligns with the rest of the migration.
- bead/deployment/jatos/exporter.py: build_study_json returns
  componentList and batchList as tuples (matching the new JsonValue).
  The on-disk JATOS format treats both shapes as JSON arrays.
- bead/items/item.py / spans.py / item_template.py / lists/
  experiment_list.py / lists/list_collection.py: move the
  MetadataValue type alias below the import block so ruff E402 is
  satisfied.

Final state: 2946 tests pass (1 spaCy skip), ruff clean, pyright
clean, zero `# type: ignore` / `# noqa` / per-file pragmas
introduced by the migration.
Mirrors the quivers docs setup: cinder theme with docs/overrides for
later customization, custom pygments and stylesheet pulled from
docs/css/, and mkdocstrings tuned for numpy-style docstrings with
symbol-type headings/TOC. Drops material-only navigation and palette
features.
Format job ran ruff against the post-migration tree and 82 files
needed reformatting; applied. CI workflows pinned to 3.13 — bead now
requires-python = '>=3.14', so pip install in the typecheck/test/
docs/publish jobs was failing the version constraint. Bumped every
setup-python step to 3.14.
CI's pip install ruff resolves to 0.15.2; the local venv had 0.14.14
which had slightly different formatting heuristics, so four files
fell out of sync. Re-runs ruff format under 0.15.2 against
extraction.py, ui/components.py, participants/collection.py, and
templates/adapters/cache.py.
…earning configs

Source:
- bead/active_learning/config.py: VarianceComponents and MixedEffectsConfig
  declare numeric-range axioms (variance >= 0, n_groups >= 1,
  prior_variance/regularization_strength >= 0,
  min_samples_for_random_effects >= 1) so the active_learning model tests
  validate at construction time.
- bead/active_learning/trainers/huggingface.py: build a JSON-shaped
  training-config snapshot for ModelMetadata so PosixPath values
  (config.output_dir, etc.) round-trip as strings, matching the
  dict[str, JsonValue] field type.

Tests:
- tests/active_learning/models/test_base.py: updated regex matches for
  the new axiom messages.

Fixtures (tests/fixtures/api_docs/):
- *.jsonl: constraint_satisfaction: {} -> []  (the field is now a
  tuple[ConstraintSatisfaction, ...], so the empty literal must be a
  list/tuple, not a dict).
- constraints/verb_constraint.jsonl, templates/{generic,verbnet}_frames.jsonl:
  strip the legacy 'compiled' field that the new Constraint Model rejects
  with extra='forbid'.

Docs:
- Run black against every python code block in docs/**/*.md so each
  block satisfies pytest-codeblocks' lint step.
- Add ScaleBounds / ScalePointLabel / InstructionsConfig imports to every
  block that uses them.
- Replace mutating loops (for item in items_dict.values(): item =
  item.with_(...)) with comprehension rebinds so items_dict actually
  reflects the new template_id.
- Fix training.md ForcedChoiceModelConfig examples: max_epochs ->
  num_epochs and the long-form mixed_effects_config arg replaced with
  the actual mixed_effects=MixedEffectsConfig(...) field.
- lists.md basic example: replace [...] placeholder with [uuid4() for _
  in range(100)] so the example actually runs.
- items.md and deployment.md add_spans_to_item examples now pass
  TokenizerConfig(backend='whitespace') so the test environment doesn't
  need spaCy installed.
@aaronstevenwhite aaronstevenwhite merged commit 79ada90 into main May 6, 2026
8 checks passed