Migrates bead to didactic from pydantic v2 #3
Merged
aaronstevenwhite merged 46 commits into main on May 6, 2026
Introduces a value-level transform system with TransformContext, TransformRegistry, morphology, and text utilities. Wires it into jsPsych prompt rendering via the [[label|transform]] reference syntax, with 69 covering tests.
Replaces pydantic with didactic>=0.3.2 and panproto>=0.43 in project dependencies, raises requires-python to >=3.14, and updates ruff target-version, pyright pythonVersion, and the .python-version pin. Subsequent commits convert the model layer file-by-file.
Converts BeadBaseModel and the bead/data/* module to didactic 0.4.0: frozen by default with .with_(...) updates, dx.field(default_factory=...) for defaults, dx.axiom for cross-field invariants on Range[T], dx.Embed for nested-Model fields in MetadataTracker, and dx.ValidationError replacing pydantic's. ValidationReport's mutating add_error / add_warning become pure methods returning new instances. UUID-reference walking now introspects __field_specs__ rather than typing.get_type_hints (which trips on TYPE_CHECKING-only didactic internals). All 108 tests under tests/data/ pass; pyproject pins didactic>=0.4.0.
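The shift from mutating `add_error` / `add_warning` to pure methods can be sketched with the standard library alone. This is a stand-in built on frozen dataclasses, not didactic: `dataclasses.replace` plays the role the commit assigns to `.with_(...)`, and the field names `errors` and `warnings` are assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ValidationReport:
    """Stdlib stand-in for bead's ValidationReport (field names assumed)."""

    errors: tuple[str, ...] = ()
    warnings: tuple[str, ...] = ()

    def add_error(self, message: str) -> "ValidationReport":
        # Pure update: return a new report rather than appending in place.
        return replace(self, errors=(*self.errors, message))

    def add_warning(self, message: str) -> "ValidationReport":
        return replace(self, warnings=(*self.warnings, message))


report = ValidationReport()
updated = report.add_error("missing id").add_warning("empty metadata")
```

Callers have to rebind the result (`report = report.add_error(...)`); the original `report` stays empty.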
Converts TokenizerConfig, DisplayToken, and TokenizedText to dx.Model. TokenizedText.tokens becomes a tuple[dx.Embed[DisplayToken], ...] and the helper properties (token_texts, space_after_flags) return tuples. Tokenizer __call__ implementations build tuples directly. Tests updated to use dx.ValidationError and tuple equality.
Converts Span, SpanSegment, SpanLabel, SpanRelation, SpanSpec, UnfilledSlot, ModelOutput, Item, ItemCollection, and the item-template hierarchy (ChunkingSpec, TimingParams, TaskSpec, PresentationSpec, ItemElement, ItemTemplate, ItemTemplateCollection) to didactic Models.
Notes:
- list[T] fields become tuple[T, ...]; nested-Model fields wrap with dx.Embed[T]; mutating add_item/add_template/add_model_output methods become pure with_item/with_template/with_model_output returning new instances.
- TaskSpec.scale_bounds is now a ScaleBounds(min: int, max: int) Model; didactic does not support heterogeneous tuples (panproto/didactic#15 background).
- TaskSpec.scale_labels keys typed as str; didactic dict keys must be str (so int-keyed scale point labels stringify).
- Item.constraint_satisfaction keys typed as str (UUID stringified) for the same reason.
- @field_validator/@model_validator hooks port to @dx.validates; Item's cross-field span_relations integrity check is dropped here pending a panproto-axiom-friendly formulation.
- Utility modules under bead/items/ that don't define Pydantic models (binary, cache, categorical, cloze, constructor, forced_choice, free_text, generation, magnitude, multi_select, ordinal_scale, scoring, validation, span_labeling) work against the new Models unchanged at the source level. Their tests need follow-up edits to switch list[] literals -> tuples in fixtures (blocked on panproto/didactic#15 for ergonomic cleanup).
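As a rough illustration of two of the notes above (a named bounds record in place of a 2-tuple, and int-keyed labels stringified), here is a stdlib sketch. `ScaleBounds` is modelled as a frozen dataclass rather than a didactic Model, and `stringify_keys` is a hypothetical helper, not a function from bead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleBounds:
    """Stdlib stand-in for the ScaleBounds(min, max) Model."""

    min: int
    max: int


def stringify_keys(labels: dict[int, str]) -> dict[str, str]:
    # Hypothetical helper: int scale points become str keys.
    return {str(point): text for point, text in labels.items()}


bounds = ScaleBounds(min=1, max=7)
labels = stringify_keys({1: "bad", 7: "good"})
```

Named fields make call sites read `bounds.min` / `bounds.max` rather than unpacking positions.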
Converts Constraint, LexicalItem, MWEComponent, MultiWordExpression,
Lexicon, Slot, Template, TemplateSequence, TemplateTree,
TemplateCollection, LexicalItemClass, and TemplateClass to dx.Model.
Notes:
- Constraint.context value type is now JsonValue; the DSL evaluator
coerces lists into sets when used as membership tests, so call sites
pass [...] in place of {...}.
- Lexicon, TemplateCollection, LexicalItemClass, TemplateClass now
store members as tuple[Embed[T], ...]; lookup by id is exposed via
by_id() and __contains__.
- Mutating add/remove methods become pure with_item / without_item /
with_template / without_template returning new instances (frozen
Models).
- Template's @model_validator(mode="after") integrity check that
template_string and slots agree is extracted to a free function
slots_match_template() which callers invoke explicitly.
- Pin bumps to didactic>=0.4.3 (issues #11, #15, #17, #18 fixed).
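A minimal sketch of the tuple-backed container shape described in these notes, using frozen dataclasses as a stand-in for dx.Model. The method names follow the commit message; their exact signatures and return shapes in bead are assumptions here (for instance, this `by_id` returns `None` on a miss and `without_item` returns only the new container).

```python
from dataclasses import dataclass, field, replace
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class LexicalItem:
    lemma: str
    id: UUID = field(default_factory=uuid4)


@dataclass(frozen=True)
class Lexicon:
    items: tuple[LexicalItem, ...] = ()

    def by_id(self, item_id: UUID) -> Optional[LexicalItem]:
        # Linear scan over the tuple; None when the id is absent.
        return next((it for it in self.items if it.id == item_id), None)

    def __contains__(self, item_id: object) -> bool:
        return any(it.id == item_id for it in self.items)

    def with_item(self, item: LexicalItem) -> "Lexicon":
        # Pure add: the receiver is left untouched.
        return replace(self, items=(*self.items, item))

    def without_item(self, item_id: UUID) -> "Lexicon":
        kept = tuple(it for it in self.items if it.id != item_id)
        return replace(self, items=kept)


walk = LexicalItem(lemma="walk")
empty = Lexicon()
lexicon = empty.with_item(walk)
```

Lookup is linear in this sketch; whether bead keeps an index behind `by_id()` is not stated in the commit.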
Converts the list-level and batch-level constraint hierarchies and the ExperimentList / ListCollection containers to didactic 0.4.3.
Notes:
- ListConstraint and BatchConstraint are dx.TaggedUnion roots with constraint_type as the discriminator. Variants subclass them and pin constraint_type to a typing.Literal value. Multiple inheritance (BeadBaseModel + dx.TaggedUnion) carries over the BeadBaseModel identity and timestamp fields.
- OrderingConstraint.precedence_pairs becomes tuple[OrderingPair, ...] where OrderingPair is a small (before, after) Model; didactic does not support heterogeneous tuples as field types.
- ExperimentList.constraint_satisfaction becomes tuple[ConstraintSatisfaction, ...] records keyed on the embedded UUID.
- @model_validator(mode="after") cross-field checks (size_params, distance_constraints, presentation_order matches item_refs) extract to free functions: validate_size_constraint, validate_ordering_constraint, validate_presentation_order.
- Mutating add_item / remove_item / shuffle_order on ExperimentList and add_list on ListCollection become pure with_item / without_item / with_shuffled_order / with_list returning new instances.
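The discriminator idea can be sketched without didactic: each variant carries a `constraint_type` tag, and decoding dispatches on that tag. The tag strings (`"size"`, `"ordering"`), the `VARIANTS` table, and `decode` are illustrative assumptions, and plain dataclasses do not enforce the `Literal` at runtime the way a tagged-union library would.

```python
from dataclasses import dataclass
from typing import Any, Literal


@dataclass(frozen=True)
class OrderingPair:
    before: str
    after: str


@dataclass(frozen=True)
class SizeConstraint:
    constraint_type: Literal["size"]
    exact_size: int


@dataclass(frozen=True)
class OrderingConstraint:
    constraint_type: Literal["ordering"]
    precedence_pairs: tuple[OrderingPair, ...]


# Hypothetical registry mapping tag value -> variant class.
VARIANTS: dict[str, type] = {"size": SizeConstraint, "ordering": OrderingConstraint}


def decode(payload: dict[str, Any]) -> Any:
    # Pick the variant class from the discriminator field.
    cls = VARIANTS[payload["constraint_type"]]
    fields = dict(payload)
    if cls is OrderingConstraint:
        # Homogeneous tuple of small records instead of raw 2-tuples.
        fields["precedence_pairs"] = tuple(
            OrderingPair(**pair) for pair in payload["precedence_pairs"]
        )
    return cls(**fields)
```

An unknown tag raises `KeyError` in this sketch; a real tagged union would report a validation error instead.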
Converts Participant, ParticipantIDMapping, FieldSpec, ParticipantMetadataSpec, ParticipantCollection, and IDMappingCollection to dx.Model.
Notes:
- FieldSpec.range collapses to Range[float] | None; didactic does not yet support unions of two Embed[T] alternatives. Bound checking works identically because int is a numeric subtype of float for inclusion tests.
- @model_validator(mode="after") on FieldSpec extracts to free function validate_field_spec for callers that want construction-time cross-field checks.
- Mutating add_participant/add_session/add_mapping/deactivate methods become pure with_participant / with_session / with_mapping / deactivated returning new instances or (new_instance, value) tuples.
Converts the configuration layer (BeadConfig, PathsConfig, ModelConfig, ItemConfig, ResourceConfig, ListConfig, BatchConstraintConfig, TemplateConfig, SlotStrategyConfig, SimulationRunnerConfig and friends, DeploymentConfig + Slopit configs, LoggingConfig) and the active-learning configs (BaseEncoderModelConfig and its ForcedChoice/Categorical/Binary/MultiSelect leaf classes, OrdinalScaleModelConfig, MagnitudeModelConfig, FreeTextModelConfig, ClozeModelConfig, ActiveLearningLoopConfig, TrainerConfig, UncertaintySamplerConfig, JatosDataCollectionConfig, ProlificDataCollectionConfig, ActiveLearningConfig, MixedEffectsConfig + RandomEffectsSpec + VarianceComponents) to dx.Model.
Notes:
- @model_validator(mode="after") cross-field checks on MagnitudeModelConfig (bounded vs distribution), TemplateConfig (MLM / mixed-strategy consistency), BatchConstraintConfig (per-type required fields), and SlopitIntegrationConfig (bundle existence) extract to free validate_* functions.
- ModelMetadata.training_config switches from dict[str, str | int | float | bool | Path | None] to dict[str, JsonValue]; Path values must be stringified before storage.
- ActiveLearningConfig pulls in MixedEffectsConfig from bead.active_learning.config; that file now uses dx.Model directly.
Note: tests + a few remaining fields that depend on Path as a first-class field type (PathsConfig.data_dir, LoggingConfig.file, TrainerConfig.logging_dir, TemplateConfig.mlm_cache_dir, ModelMetadata.training_data_path, etc.) still reference Path; they will resume cleanly once panproto/didactic#21 lands.
…ioral, and finishes bead.config
Converts the remaining model layers blocked or partly converted in the previous batch:
- bead.dsl.ast: ASTNode is now a dx.TaggedUnion with kind discriminator; Literal, Variable, BinaryOp, UnaryOp, FunctionCall, ListLiteral, AttributeAccess, Subscript variants. Construction round-trips work; JSON round-trip blocked on panproto/didactic#22 (nested TaggedUnion field decoding).
- bead.deployment.distribution: QuotaConfig, WeightedRandomConfig, LatinSquareConfig, MetadataBasedConfig, StratifiedConfig, ListDistributionStrategy. DistributionStrategyType (StrEnum) stays; the field type uses a parallel DistributionStrategyName Literal alias pending panproto/didactic#23 enum support.
- bead.deployment.jspsych.config: SpanDisplayConfig, DemographicsFieldConfig, DemographicsConfig, InstructionPage, InstructionsConfig, ExperimentConfig, RatingScaleConfig, ChoiceConfig. ExperimentConfig.instructions now requires InstructionsConfig directly; callers wanting a string pass through InstructionsConfig.from_text(...).
- bead.templates.filler.FilledTemplate: slot_fillers becomes dict[str, Embed[LexicalItem]]; unfilled_slots / unfilled_required_slots return frozensets.
- bead.behavioral.analytics: JudgmentAnalytics, ParticipantBehavioralSummary, AnalyticsCollection; slopit metric classes appear as dict[str, JsonValue] payloads since they are upstream Pydantic Models.
- Path-typed configuration fields (PathsConfig, ResourceConfig, LoggingConfig, TrainerConfig, TemplateConfig.mlm_cache_dir, ModelMetadata.training_data_path / eval_data_path / best_checkpoint) are stored as str pending panproto/didactic#21 (Path as a first-class field type). Callers wrap with pathlib.Path on access; profiles.py stringifies Path arguments at construction.
- bead.config.profiles.get_profile drops .model_copy(deep=True); didactic Models are frozen so the registry instance can be shared directly.
…mports
Converts the inline RunLoopConfig / RunModelConfig / RunSelectionConfig / RunDataConfig / ActiveLearningRunConfig models inside bead/cli/active_learning_commands.py to dx.Model. Path-typed data fields become str pending panproto/didactic#21 (callers pass strings to the loader and wrap with pathlib.Path on access).
Bulk-replaces the from-pydantic ValidationError import with from-didactic in bead/cli/{config,deployment,lists,items,resources,resource_loaders,templates}.py so CLI try/except blocks catch the correct exception class.
bead/deployment/jatos/exporter.JATOSExporter requires no field changes (only str fields); the Path-typed parameters in its export method are not Model fields.
Converts bead.transforms.base.TransformContext to a frozen dx.Model (language_code, lemma, pos, head_index, tokens as tuple[str, ...], metadata as dict[str, JsonValue]). InflectionSpec and MorphologicalTransform stay as plain dataclasses / classes; they hold non-serialisable state (a feature-matching callable predicate, a lazy UniMorph adapter, an instance cache) that does not fit didactic's serialisable-Model semantics. TransformPipeline and TransformRegistry stay as plain Python classes holding callables. Tests updated for tuple-shaped TransformContext.tokens.
- bead/data/repository.py: tighten generic bound to BeadBaseModel so the
id field is visible to the type checker; remove attr-defined ignores.
- bead/data/validation.py: replace bare Exception catch with
(ValueError, TypeError); switch reference_pool / repository params to
Mapping for covariance; coerce iterated container to tuple[object,
...] so iter has a known element type.
- bead/participants/collection.py, bead/resources/lexicon.py: stop
declaring a typed rows variable that pandas/polars can't satisfy;
build the dict[str, JsonValue] explicitly via comprehension instead.
- bead/resources/{lexicon,classification,template_collection}.py: drop
the override ignores on __iter__; dx.Model has no __iter__ so the
override is no longer overriding anything.
- bead/participants/metadata_spec.py: cast int -> float at the
Range[float].contains call boundary.
- bead/transforms/base.py: make the SpanTextTransform protocol's
__call__ positional-only so callable literals don't need parameter
names matching "text"/"context".
- tests: replace `obj.value = 99 # type: ignore[misc]` with
setattr(obj, "value", 99); replace `TokenizerConfig(backend="unknown")
# type: ignore[arg-type]` with `model_validate({"backend": ...})`;
same for ctx.lemma assignment and the SampleModel extra-field test.
- bead/config/defaults.py: drop the dead runtime isinstance/issubclass
guard on `get_default_for_model[T]` (the T: dx.Model bound makes the
check unreachable).
Updates bead/dsl/parser.py so each Lark-rule transformer passes the appropriate ``kind`` discriminator when constructing AST node variants (``Literal``, ``Variable``, ``BinaryOp``, ``UnaryOp``, ``FunctionCall``, ``ListLiteral``, ``AttributeAccess``, ``Subscript``). Adds runtime type narrowing on the ``items: list[Token | ast.ASTNode]`` parameters where the rule guarantees an AST node, replacing the implicit assumption with explicit ``isinstance`` guards that pyright sees through. Recursive AST construction is currently blocked downstream on panproto/didactic#24 (TaggedUnion field encoder snapshots variants at class-creation time, missing variants registered later in the same module).
…0 ships native support
Bumps didactic floor to 0.5.0 and reverts the temporary str shims:
PathsConfig, ResourceConfig, LoggingConfig.file, TrainerConfig.logging_dir,
TemplateConfig.mlm_cache_dir, SimulationRunnerConfig.save_path, and
ModelMetadata.{training_data_path,eval_data_path,best_checkpoint} go back
to pathlib.Path. profiles.py and config.validate_paths drop their
str/Path bridging.
ListDistributionStrategy.strategy_type goes back to the
DistributionStrategyType StrEnum directly (no parallel Literal alias).
Closes panproto/didactic#21, #22, #23, #24.
Bulk-injects constraint_type="..." arguments into every constraint constructor call across tests/lists/. Fixes Range to use the documented __axioms__ tuple syntax (the previous _ordered = dx.axiom(...) attribute form was silently ignored).
…eConstraint, OrderingConstraint, and the rest Now that didactic 0.5.1 ships null-aware comparison in axiom expressions (panproto/didactic#26), all the cross-field rules previously extracted to validate_size_constraint / validate_ordering_constraint move back to __axioms__ tuples on the model, restoring construction-time validation. Adds bound axioms for tolerance, n_quantiles, items_per_quantile, priority, min_unique_values, min_distance, max_distance, min_size, max_size, exact_size across UniquenessConstraint, BalanceConstraint, QuantileConstraint, GroupedQuantileConstraint, DiversityConstraint, SizeConstraint, OrderingConstraint, and ExperimentList. Updates assertions in tests/lists/test_constraints to match the new human-readable axiom messages.
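To make the idea of a construction-time, null-aware cross-field rule concrete, here is a stdlib sketch. It is not didactic's axiom syntax: the check lives in `__post_init__`, and it assumes "null-aware" means the comparison is skipped when either bound is unset. The error message text is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SizeConstraint:
    """Stdlib stand-in; the real model declares this rule in __axioms__."""

    min_size: Optional[int] = None
    max_size: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumed null-aware reading: only compare when both bounds are set.
        if self.min_size is not None and self.max_size is not None:
            if self.min_size > self.max_size:
                raise ValueError("min_size must be <= max_size")


open_ended = SizeConstraint(min_size=1)  # max_size unset: rule is skipped
bounded = SizeConstraint(min_size=1, max_size=10)
```

Because the rule runs at construction, an inconsistent pair such as `SizeConstraint(min_size=5, max_size=2)` fails immediately rather than at a later explicit `validate_*` call.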
Updates tests for TemplateCollection, Lexicon, TemplateClass, and LexicalItemClass to use the frozen-Model conventions: with_template / with_item rebinds, without_* tuple unpacking, by_id lookup, and tuple field access. Switches from_jsonl loaders to model_validate_json so UUID-typed fields parse correctly from raw JSON lines.
…validators
BatchCoverageConstraint, BatchBalanceConstraint, BatchDiversityConstraint,
and BatchMinOccurrenceConstraint now declare numeric-range axioms that
previously lived in the pydantic Field constraints. The LanguageCode
type alias is now plain str | None, with each model that uses it
attaching a @dx.validates("language_code") validator that delegates
to validate_iso639_code so values normalize to ISO 639-3 at construction.
Replaces model_dump(mode="json") (no longer accepted) with json.loads of model_dump_json() to coerce UUIDs to strings inside MetadataValue dicts. Rewrites add_spans_to_item and tokenize_item to construct new items via .with_(...) instead of round-tripping through model_dump, which preserves Span and other Embed-typed values without UUID serialization issues. Updates validate_constraint_satisfaction and item_passes_all_constraints to iterate the new tuple[ConstraintSatisfaction, ...] shape. Loosens categorical task-type validation to accept tuples. Updates the corresponding tests to use .with_(...) rebinds and ConstraintSatisfaction record construction.
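The dump-to-JSON-text-then-parse trick for coercing UUIDs to strings can be shown with the `json` module alone. `to_json_shape` is a hypothetical helper name; in the commit the serialization step is `model_dump_json()`, which this sketch replaces with `json.dumps(..., default=str)`.

```python
import json
from uuid import UUID


def to_json_shape(value: object) -> object:
    # Serialize to JSON text (non-JSON types go through str), then parse back.
    return json.loads(json.dumps(value, default=str))


item_id = UUID("12345678-1234-5678-1234-567812345678")
metadata = {"source_item": item_id, "tags": ("a", "b")}
coerced = to_json_shape(metadata)
```

Note the side effect: tuples come back as lists, so this shape is suited to serialization boundaries rather than to values that are fed back into tuple-typed fields.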
Replaces dict-API and pydantic conventions in tests with the new didactic equivalents: tuple field equality, tuple sets in constraint context (sets are not JSON-serializable), with_(...) for mutation, ScaleBounds for TaskSpec scale, model_validate_json round-trips, and slots_match_template called explicitly. Source updates: SetMembershipConstraintBuilder and AgreementConstraintBuilder now emit sorted tuples (not sets) for context values; ItemConstructor builds ConstraintSatisfaction records; loaders.py uses with_item rebinds; the glazing FrameNet adapter str()-coerces AnnotatedText so features stay JSON-serializable.
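A small sketch of why builders emit sorted tuples for context values: a set cannot be JSON-encoded, and sorting gives a deterministic order. `context_value` and the `allowed_pos` key are hypothetical names used only for illustration.

```python
import json


def context_value(members: set[str]) -> tuple[str, ...]:
    # Sorting makes the emitted order deterministic across runs.
    return tuple(sorted(members))


ctx = {"allowed_pos": context_value({"VERB", "NOUN"})}
encoded = json.dumps(ctx)

try:
    json.dumps({"allowed_pos": {"VERB", "NOUN"}})
    set_is_serializable = True
except TypeError:
    # The json module has no encoding for set.
    set_is_serializable = False
```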
…d fields
bead/templates: replaces lexicon.items.values() with tuple iteration
in strategies.py, streaming.py, and filler.py; removes unused
constraint.compiled fast path (Constraint is no longer a pydantic
model with computed fields). Tests rewrite lexicon.add(...) to
with_item(...) rebinds, set-shaped context to sorted tuples, and
items={dict} construction to items=tuple(...). validate_template_config
is now invoked explicitly rather than at __init__ time.
Source updates: ModelConfig, NoiseModelConfig, SimulationRunnerConfig,
TemplateConfig, and BatchCoverageConstraint declare numeric-range
axioms in place of pydantic Field constraints; load_config and
config_to_dict route through model_dump_json so UUID/Path values
become JSON-shape; get_default_config and get_profile return fresh
.with_() copies so callers never alias DEFAULT_CONFIG; LanguageCode
and DeploymentConfig.jspsych_version permit None;
ListDistributionStrategy.strategy_type defaults to BALANCED so YAML
round-trips work without re-supplying the field; get_default_for_model
raises TypeError for non-Model arguments; constructor builds
ConstraintSatisfaction records via tuple comprehension.
Test updates: every config.X.Y[.Z] = V mutation rewritten as nested
.with_(...) calls; BeadConfig(**model_dump()) round-trips replaced
with model_validate_json; tag/list field equalities updated to tuple
form; expected pydantic-era error fragments ("Input should be ...")
updated to didactic's ("is not in Literal ...") wording.
CLI updates: list-constraints commands now pass constraint_type tags required by the discriminated union; resources and resource_loaders use lexicon.with_item / collection.with_template rebinds; constraint file IO uses model_validate_json for line-by-line round-trips that preserve UUIDs as strings; SetMembershipConstraintBuilder coerces sets to sorted tuples for JSON-shaped contexts; deployment generate now constructs InstructionsConfig.from_text and ScaleBounds wrappers where the args were previously str/tuple shorthands.
Deployment trials: serialization expands tuple[Embed[ScalePointLabel]] into the dict shape jsPsych expects, unpacks ScaleBounds into a list, walks tuple[ConstraintSatisfaction, ...] for trial metadata, and defaults task_spec.options to () so the Likert renderer does not crash on None.
Test fixtures: cli/conftest.py drops fields that no longer exist on ResourceConfig and ModelConfig; deployment fixtures use ScaleBounds and ScalePointLabel instances; tests/items/ and tests/cli/ tuple assertions for item_metadata['categories'] and similar tuple-typed fields are corrected from list to tuple shape.
bead/cli: items.py, lists.py, resources.py, templates.py, simulate.py all replace dict-based round-trips through json.loads(...) + Model(**d) with Model.model_validate_json(line) so UUIDs and Paths are decoded correctly at every JSONL boundary. Templates fill command rebuilds its merged Lexicon and constraint-augmented TemplateCollection via .with_(...) instead of mutating tuple fields. simulate.configure now JSON-coerces the dump before yaml/json serialization. FilledTemplate.strategy_name gains a default so partial fixtures construct cleanly.
Tests: every json.loads + Model(**data) pair in tests/cli/ converted to Model.model_validate_json (or Model.model_validate(item) for dict-typed inputs); FilledTemplate fixture mutation rewritten as .with_(id=...). conftest.py drops fields that no longer exist on ResourceConfig/ModelConfig. test_template_generation slot-variants test now expects success because the path is implemented.
bead/simulation: ordinal_scale, oracle, random, and systematic
unpack ScaleBounds.min/.max instead of treating it as a 2-tuple;
runner builds new SimulatedAnnotatorConfig instances via .with_(...)
rather than mutation; oracle accepts list-or-tuple ground truth.
bead/deployment/jspsych: trials.py serializes scale_bounds and
scale_labels into list/dict shapes for jsPsych and unpacks the
tuple[ConstraintSatisfaction, ...] field; randomizer.py walks
OrderingPair records (before/after) instead of unpacking 2-tuples.
bead/lists/constraints: UniquenessConstraint, BalanceConstraint,
QuantileConstraint, GroupedQuantileConstraint and their batch peers
declare priority axioms.
bead/items/ordinal_scale: scale_labels emitted as tuple[ScalePointLabel,...].
bead/items/validation imports ScaleBounds.
Test coverage: scale_bounds=(min, max) tuples replaced with
ScaleBounds(min=..., max=...); scale_labels={int: str} dicts replaced
with tuple[ScalePointLabel, ...]; OrderingConstraint(precedence_pairs=...)
calls now pass OrderingPair instances and the constraint_type
discriminator; instructions=str args wrapped in
InstructionsConfig.from_text. Field equality assertions updated for
tuple-shaped collections.
…ped scale_labels
bead/items/ordinal_scale: function signatures take ScaleBounds (with default ScaleBounds(min=1, max=7)) and accept either dict[int, str] or tuple[ScalePointLabel, ...] for scale_labels, normalizing tuples to dict for downstream metadata.
bead/cli/items_factories now constructs ScaleBounds when bridging int CLI options into the factory APIs.
Tests: tests/items/test_ordinal_scale and tests/items/test_cloze import ScaleBounds and ScalePointLabel directly; cloze tests revert the bulk InstructionsConfig conversion (cloze still takes plain str for instructions). All 2905 tests pass across migrated dirs (1 optional-dep skip).
ruff fixed unused imports, import ordering, and a handful of style issues. The Range[T] generic was reverted from PEP 695 form back to typing.Generic[T] because didactic does not yet support PEP 695-style parameterised Models.
- Avoids B008 by hoisting ScaleBounds default into module-level _DEFAULT_SCALE_BOUNDS singleton.
- Quotes the forward TransformContext annotation in jspsych/trials.py to keep ruff F821 happy with the lazy import.
- Adds OrderingPair to deployment conftest imports.
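The B008 fix in the first bullet follows a common pattern: instead of calling a constructor in the default-argument position, build the default once at module level and reference it. A sketch with a frozen dataclass standing in for the ScaleBounds Model; `scale_points` is a hypothetical function, not one from bead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleBounds:
    min: int
    max: int


# Built once at import time; sharing is safe because the instance is frozen.
_DEFAULT_SCALE_BOUNDS = ScaleBounds(min=1, max=7)


def scale_points(bounds: ScaleBounds = _DEFAULT_SCALE_BOUNDS) -> tuple[int, ...]:
    # Hypothetical consumer: enumerate every point on the scale, inclusive.
    return tuple(range(bounds.min, bounds.max + 1))
```

Writing `bounds: ScaleBounds = ScaleBounds(min=1, max=7)` directly in the signature behaves the same at runtime but is what ruff's B008 rule flags.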
Each per-task-type integration test previously mutated frozen Items inside a for loop (item.item_template_id = dummy_template.id). The mutation never made it back into items_dict, leaving the generator to look up unbound templates. Tests now build a fresh items_list and items_dict via list comprehension. categorical-pipeline asserts list | tuple for the categories metadata.
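The pitfall described here is easy to reproduce with any immutable record type. In this stdlib sketch, `Item` is a frozen dataclass stand-in and `dataclasses.replace` substitutes for `.with_(...)`; the `"tmpl-1"` id is made up.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Item:
    name: str
    item_template_id: Optional[str] = None


items_dict = {"a": Item("a"), "b": Item("b")}

# Rebinding the loop variable creates updated copies but never stores them.
for item in items_dict.values():
    item = replace(item, item_template_id="tmpl-1")
unchanged = all(it.item_template_id is None for it in items_dict.values())

# A comprehension rebuilds the container from the updated copies.
items_dict = {
    key: replace(it, item_template_id="tmpl-1") for key, it in items_dict.items()
}
```

After the loop the dict still holds the original items; only the comprehension changes what `items_dict` contains.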
slopit-derived metric classes (KeystrokeMetrics, FocusMetrics, TimingMetrics) are now passed to JudgmentAnalytics as their .model_dump() dicts (the bead Model expects dict[str, JsonValue]). AnalysisFlag is the bead-side flag (not slopit's): its analyzer/confidence kwargs are folded into metadata. ITEM_IDS use UUID for analytics constructors but the dataframe columns store the str form because polars cannot infer Arrow types for UUID values. Tuple-vs-list assertions on get_flag_types() are corrected.
api/index.md, api/deployment.md, api/workflows.md, api/templates.md, api/items.md, api/lists.md, api/resources.md, and api/training.md all received bulk migrations: scale_bounds tuples replaced with ScaleBounds(min=..., max=...); scale_labels dicts replaced with tuple[ScalePointLabel, ...]; constraint constructors pass the constraint_type discriminator; lexicon.items.values()-style accesses collapsed to plain tuple iteration; instructions=str wrapped in InstructionsConfig.from_text. Examples that mutated frozen models (item.item_template_id = ..., config.X.Y = ...) are rewritten as .with_(...) rebinds. Each code block's missing import is added at the top of the first code block.
didactic 0.6.0 adds @dx.model_validator and union-of-TaggedUnion-roots support. Tests stay green on 0.6.0 with no further code changes.
Lint cleanup:
- pyproject's per-file-ignores grow tests/** = D + E501 to match the project policy that test docstrings/line length are not enforced (test names already document intent).
- Drop the spurious lazy import of register_morphological_transforms in bead/transforms/__init__.
- Move TransformContext to a top-level import in jspsych/trials.py.
- Give LowerTransform/UpperTransform/CapitalizeTransform/TitleTransform and MorphologicalTransform.__repr__ short docstrings.
- Hoist UniMorphAdapter to a top-level import (no more PLC0415 in morphology.py).
- Break a handful of long lines (cli/active_learning_commands, cli/templates, config/active_learning, config/deployment, deployment/jspsych/config and trials, items/ordinal_scale, items/validation, simulation/annotators/oracle).
- Rephrase three filter docstrings to fit 88 columns.
- ScalePointLabel tuples are now multi-line.
Range[T] keeps typing.Generic[T] form: the PEP 695 form does not work with didactic Models (TypeVar fields cannot encode without explicit parameterisation, and the Embed[Range[T]] resolver does not yet thread through PEP 695 type-parameter machinery). The Range file gets a per-file UP046 ignore with a comment explaining the constraint.
…tic 0.6.1
Tests stay green at 2946 (1 spaCy skip). Fixes:
- ItemConstructor._extract_call_args accepts list | tuple of ASTNodes.
- Item factories (binary, categorical, forced_choice, multi_select) write metadata list values as tuples so dict[str, MetadataValue] matches.
- cloze._extract_constraint_ids returns tuple[UUID, ...]; UnfilledSlot fixture passes constraint_ids=().
- generation.py builds the temporary lexicon directly via Lexicon(items=...) instead of mutating with .add(). MetadataValue alias drops list[T] (didactic FieldValue does not allow list, only tuple); list-typed metadata callers are migrated to tuples.
- span_labeling tokenized_elements / token_space_after typed dict[str, tuple[...]] consistently end-to-end so _validate_span_indices, with_(), and the public Item field share one shape.
- bead/templates/resolver evaluate_slot_constraints accepts list | tuple Constraints (template constraints arrive as tuple[Embed[Constraint], ...]).
- bead/items/forced_choice and multi_select option args constructed as tuple, matching Item.options: tuple[str, ...].
- bead/cli/items show-stats walks tuple[ConstraintSatisfaction, ...] via cs.satisfied (not the old dict.values()).
- bead/lists/partitioner builds ExperimentList with tuple(constraints).
- bead/items/ordinal_scale scale_labels accepts dict[int, str] | tuple[ScalePointLabel, ...] | None and normalizes to dict.
Filed didactic#36 (with_() **changes: FieldValue is too tight; dict invariance bites every nested-dict-typed field). The remaining 20 pyright errors all chain from that signature.
…iance)
- partitioner: walks BatchConstraint via runtime isinstance narrowing for the BatchMinOccurrenceConstraint branch, asserts constraint.priority is int before += in violation accumulation.
- morphology: types ._adapter as UniMorphAdapter | None, normalizes features to dict[str, str] before calling InflectionSpec.predicate, converts context.tokens explicitly to list[str] for the head-finder.
- generation: introduces _coerce_to_metadata that converts list[JsonValue] to tuple[MetadataValue, ...] and recursively walks dicts so lexical feature copies obey the MetadataValue shape.
Tests stay green at 2946 (1 spaCy skip).
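The recursive list-to-tuple coercion described for `_coerce_to_metadata` might look roughly like the following. This is a guess at the shape from the commit message, not the function's actual body; the name `coerce_to_metadata` and the untyped `object` signature are simplifications.

```python
def coerce_to_metadata(value: object) -> object:
    # Sequences become tuples, with each element coerced in turn.
    if isinstance(value, (list, tuple)):
        return tuple(coerce_to_metadata(v) for v in value)
    # Dicts are walked recursively; keys are kept as they are.
    if isinstance(value, dict):
        return {key: coerce_to_metadata(v) for key, v in value.items()}
    # Scalars (str, int, float, bool, None) pass through unchanged.
    return value


features = {"pos": "VERB", "forms": ["walk", ["walks", "walked"]], "n": 1}
coerced_features = coerce_to_metadata(features)
```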
didactic 0.6.2 lands the FieldValue Mapping[str, FieldValue] arm from issue #36, clearing the 9 remaining with_() invariance errors. The last source-side fixes:
- bead/data/base.py: drops list[JsonValue] from JsonValue. didactic FieldValue does not include a list arm (lists are mutable; the model layer is tuple-only), so a JsonValue with list[X] cannot live in a FieldValue field. The change forces tuples on every JSON-ish field surface and aligns with the rest of the migration.
- bead/deployment/jatos/exporter.py: build_study_json returns componentList and batchList as tuples (matching the new JsonValue). The on-disk JATOS format treats both shapes as JSON arrays.
- bead/items/item.py / spans.py / item_template.py / lists/experiment_list.py / lists/list_collection.py: move the MetadataValue type alias below the import block so ruff E402 is satisfied.
Final state: 2946 tests pass (1 spaCy skip), ruff clean, pyright clean, zero `# type: ignore` / `# noqa` / per-file pragmas introduced by the migration.
Mirrors the quivers docs setup: cinder theme with docs/overrides for later customization, custom pygments and stylesheet pulled from docs/css/, and mkdocstrings tuned for numpy-style docstrings with symbol-type headings/TOC. Drops material-only navigation and palette features.
Format job ran ruff against the post-migration tree and 82 files needed reformatting; applied. CI workflows were pinned to 3.13, but bead now has requires-python = '>=3.14', so pip install in the typecheck/test/docs/publish jobs was failing the version constraint. Bumped every setup-python step to 3.14.
CI's pip install ruff resolves to 0.15.2; the local venv had 0.14.14 which had slightly different formatting heuristics, so four files fell out of sync. Re-runs ruff format under 0.15.2 against extraction.py, ui/components.py, participants/collection.py, and templates/adapters/cache.py.
…earning configs
Source:
- bead/active_learning/config.py: VarianceComponents and MixedEffectsConfig
declare numeric-range axioms (variance >= 0, n_groups >= 1,
prior_variance/regularization_strength >= 0,
min_samples_for_random_effects >= 1) so the active_learning model tests
validate at construction time.
- bead/active_learning/trainers/huggingface.py: build a JSON-shaped
training-config snapshot for ModelMetadata so PosixPath values
(config.output_dir, etc.) round-trip as strings, matching the
dict[str, JsonValue] field type.
Tests:
- tests/active_learning/models/test_base.py: updated regex matches for
the new axiom messages.
Fixtures (tests/fixtures/api_docs/):
- *.jsonl: constraint_satisfaction: {} -> [] (the field is now a
tuple[ConstraintSatisfaction, ...], so the empty literal must be a
list/tuple, not a dict).
- constraints/verb_constraint.jsonl, templates/{generic,verbnet}_frames.jsonl:
strip the legacy 'compiled' field that the new Constraint Model rejects
with extra='forbid'.
Docs:
- Run black against every python code block in docs/**/*.md so each
block satisfies pytest-codeblocks' lint step.
- Add ScaleBounds / ScalePointLabel / InstructionsConfig imports to every
block that uses them.
- Replace mutating loops (for item in items_dict.values(): item =
item.with_(...)) with comprehension rebinds so items_dict actually
reflects the new template_id.
- Fix training.md ForcedChoiceModelConfig examples: max_epochs ->
num_epochs and the long-form mixed_effects_config arg replaced with
the actual mixed_effects=MixedEffectsConfig(...) field.
- lists.md basic example: replace [...] placeholder with [uuid4() for _
in range(100)] so the example actually runs.
- items.md and deployment.md add_spans_to_item examples now pass
TokenizerConfig(backend='whitespace') so the test environment doesn't
need spaCy installed.
Description
Migrates the entire bead package off Pydantic v2 onto didactic (PyO3-backed typed-data layer over panproto). Every BaseModel becomes a frozen dx.Model; every @dataclass either becomes a dx.Model or stays a plain class for callable-bearing internals. Mutating add/remove methods (ExperimentList.add_item, Lexicon.add, etc.) are rewritten as pure .with_(...) returning a new instance, with every call site rebound. Discriminated unions (ListConstraint, BatchConstraint, ASTNode) become dx.TaggedUnion-rooted hierarchies discriminated by an explicit kind / constraint_type literal. Cross-field invariants move into __axioms__ = (dx.axiom("..."),) panproto expressions; per-field normalization stays in @dx.validates. Collection fields drop list[T] for tuple[T, ...] and use dx.Embed[T] for nested-Model arms (covering all of items/spans, lists/constraints, resources/template, config/*, deployment/jspsych, simulation, behavioral, transforms).
The transforms subsystem committed alongside (bead/transforms/) is included on this branch as well, restructured to use the new dx.Model-based TransformContext and a registry-based TransformRegistry.
The CHANGELOG is updated under [Unreleased] with the breaking-change summary.
Motivation
Pydantic v2 was no longer pulling its weight: cross-field validation forced post-__init__ mutation, the Pydantic Field(discriminator=...) interaction with generics was fragile, and bead's own metadata model leaked the dict-of-pydantic-models problem (Lexicon.items as dict[UUID, LexicalItem] made round-trips, copy semantics, and lens-based migrations awkward). didactic's frozen-Model + axiom expression language gives us:
- .with_(...) semantics (no accidental mutation, no frozen=False escape hatches)
- cross-field invariants expressed as axioms (__axioms__)
- dx.Embed[T] for typed nested fields with reliable storage and round-trip behavior
This is a pre-1.0 cleanup: no backward-compat shims, no didactic-pydantic adapter. The CHANGELOG records the breaking changes.
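The frozen-plus-invariant pattern described above can be approximated with the standard library alone. This is a minimal sketch under stated assumptions: `Bounds` and its `with_` method are hypothetical names, and the invariant lives in `__post_init__` rather than in a didactic axiom expression, whose syntax is not shown in this PR.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Bounds:
    """Illustrative frozen value object (not didactic's or bead's API)."""

    min: int
    max: int

    def __post_init__(self) -> None:
        # Cross-field invariant checked at construction time, so no
        # instance ever needs to be patched up after __init__.
        if self.min > self.max:
            raise ValueError(f"min ({self.min}) must not exceed max ({self.max})")

    def with_(self, **changes: int) -> Bounds:
        # replace() goes back through __init__, so the invariant is
        # re-checked on every update and the original stays untouched.
        return replace(self, **changes)


bounds = Bounds(min=1, max=7)
wider = bounds.with_(max=9)

try:
    bounds.with_(min=10)
    rejected = False
except ValueError:
    rejected = True
```

Here `wider` is a new object and `bounds` keeps its original values; the invalid update raises instead of producing a half-valid instance.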
Fixes #
Type of Change
Checklist
- uv run ruff check . and uv run ruff format .
- uv run pyright with no errors
- uv run pytest tests/
Testing
- uv run pytest tests/: 2946 passed, 1 skipped (the skip is tests/items/test_span_labeling.py::test_default_config, gated on pytest.importorskip("spacy") for the optional spaCy backend; the rest of the span_labeling suite runs against the whitespace tokenizer).
- uv run ruff check bead tests: All checks passed!
- uv run pyright bead: 0 errors, 0 warnings, 0 informations.
- uv run mkdocs build: succeeds (5 pre-existing griffe docstring/signature mismatches in bead/resources/constraint_builders.py, unrelated to this PR).
- No # type: ignore, # noqa, or per-file pragmas introduced by the migration. The one project-level pyright config tweak is for Range[T] (didactic generic models cannot use PEP 695 syntax with Embed[T] field targets, so typing.Generic[T] is retained and bead/data/range.py carries UP046 in [tool.ruff.lint.per-file-ignores] with an inline rationale).
Upstream issues filed and fixed during this work
The migration surfaced several didactic gaps. Each was filed, fixed upstream, and verified locally before this branch finished:
- Child(Base) reports inherited fields as required (panproto/didactic#13): inherited field defaults dropped
- @dx.validates was a no-op
- Path field type unsupported
- StrEnum field type unsupported
- is None / != None
- Embed[Root] where Root is a TaggedUnion downcast variant instances
- A | B where both are TaggedUnion roots
- @dx.model_validator(mode="after") for cross-field invariants Python-only
- kw_only_default=False in dataclass_transform produced 103 spurious "Fields without default values" pyright errors
- with_(**changes: FieldValue) typed dict[str, FieldValue] too tightly; Mapping[str, FieldValue] covariant arm clears the invariance noise
This PR pins didactic>=0.6.2 so all of those fixes are required.
Screenshots (if applicable)
n/a — backend / library refactor.