Adds bead.protocol annotation-protocol layer (v0.4.0) #5
Merged
aaronstevenwhite merged 6 commits into main on May 7, 2026
The new bead.protocol package provides a type-theoretic stack for defining annotation protocols: SemanticAnchor as the question type, ProtocolContext as the dependent index, RealizationStrategy (Template / Contextual / LM) as the computational content, and DriftGuard with Structural / Embedding / Perplexity validators as the type-checker. QuestionFamily packages these together; AnnotationProtocol sequences families into the iterated dependent product, threading responses through the context so later families can condition on earlier answers.

bead.evaluation gains AnnotationRecord, AnnotatorReliability, and annotator_reliability / low_entropy_annotators for per-annotator Shannon-entropy diagnostics that complement the existing InterAnnotatorMetrics.

Documentation: docs/api/protocol.md, docs/api/evaluation.md update, docs/user-guide/protocols.md, mkdocs nav, and CHANGELOG entry.

95 new tests, 0 pyright / ruff errors on the new code, 0 new mkdocs strict warnings.
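The per-annotator Shannon-entropy diagnostic mentioned above can be illustrated with a minimal, standalone sketch. It does not use bead's API: the (annotator_id, response) tuple shape, the function names, and the threshold defaults are all assumptions chosen for illustration, not the signatures of annotator_reliability or low_entropy_annotators.

```python
import math
from collections import Counter, defaultdict


def response_entropy(responses):
    """Shannon entropy, in bits, of the empirical response distribution."""
    counts = Counter(responses)
    total = sum(counts.values())
    probs = [n / total for n in counts.values()]
    return sum(-p * math.log2(p) for p in probs)


def entropy_by_annotator(records):
    """Map each annotator id to the entropy of that annotator's responses.

    `records` is an iterable of (annotator_id, response) pairs; this shape is
    a hypothetical stand-in, not bead's AnnotationRecord.
    """
    grouped = defaultdict(list)
    for annotator_id, response in records:
        grouped[annotator_id].append(response)
    return {a: response_entropy(r) for a, r in grouped.items()}


def flag_low_entropy(records, threshold_bits=0.5, min_responses=4):
    """Return annotators with enough responses whose entropy is below the threshold."""
    records = list(records)
    n_responses = Counter(a for a, _ in records)
    scores = entropy_by_annotator(records)
    return sorted(
        a for a, h in scores.items()
        if n_responses[a] >= min_responses and h < threshold_bits
    )


# ann1 always answers "yes" (0 bits); ann2 splits evenly (1 bit).
records = [("ann1", "yes")] * 4 + [("ann2", r) for r in ("yes", "no", "yes", "no")]
print(entropy_by_annotator(records))
print(flag_low_entropy(records))
```

An annotator who gives the same answer to every item has zero entropy, which is why a low score is a useful flag for inattentive or straight-lining responses; it says nothing about agreement with other annotators, which is the job of the inter-annotator metrics.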
Round 1: Adds n_levels/labels/uniqueness invariant to ResponseEncoding via @dx.model_validator (BINARY must have 2 levels, no duplicate labels). Adds forward-only depends_on graph validation to AnnotationProtocol (rejects self-dependencies and forward / unknown references at __post_init__ and append).

Round 2: Consolidates the duplicate always predicate (was defined in both context.py and realization.py); the canonical definition lives in context.py and is registered as the "always" entry, with realization.py importing it. Imports ContextPredicate from its canonical module in bead.protocol.__init__.

Round 3: LMRealization now raises RuntimeError on an empty LM response instead of silently caching an empty string.

Round 4: Tightens EmbeddingDriftValidator and PerplexityDriftValidator docstrings to reference the EmbeddingAdapter / PerplexityAdapter Protocols rather than naming a single adapter implementation. Cleans up DriftGuard docstring inconsistency between the (removed) Parameters section and the actual default (empty list, not None). Removes the "downstream packages" framing in the context-predicate registry doc.

Test additions (12 new tests, 105 total):
- ResponseEncoding validator boundary cases (mismatched n_levels, duplicate labels, BINARY with non-2 levels)
- AnnotationProtocol dependency-graph rejection (self-dependency, forward dependency, unknown dependency, append paths)
- LMRealization empty-response and quoted-empty-response cases
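The forward-only depends_on rule from Round 1 can be sketched as follows. This is a simplified illustration using plain dataclasses, not the AnnotationProtocol implementation: the Family and Protocol classes, their fields, and the error messages are hypothetical. The point it shows is that routing construction through the same append check enforces the rule on both paths.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Family:
    """Hypothetical stand-in for a question family: a name plus dependencies."""
    name: str
    depends_on: tuple[str, ...] = ()


@dataclass
class Protocol:
    """Hypothetical ordered sequence of families with forward-only dependencies."""
    families: list[Family] = field(default_factory=list)

    def __post_init__(self):
        # Re-add every family through append so both paths share one check.
        pending, self.families = self.families, []
        for family in pending:
            self.append(family)

    def append(self, family):
        earlier = {f.name for f in self.families}
        for dep in family.depends_on:
            if dep == family.name:
                raise ValueError(f"{family.name!r} depends on itself")
            if dep not in earlier:
                # Covers both forward references and unknown names.
                raise ValueError(
                    f"{family.name!r} depends on {dep!r}, "
                    "which is not an earlier family"
                )
        self.families.append(family)


ok = Protocol([Family("causation"), Family("intent", depends_on=("causation",))])
print([f.name for f in ok.families])
```

Because a dependency must already be in the sequence when a family is added, the dependency graph is acyclic by construction and no separate cycle detection is needed.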
…e gaps

Surfaces bead.protocol from the cross-cutting overview docs that previously didn't reference it: the README feature list, docs/index.md key-features list, docs/user-guide/index.md core-concepts section, a new "Annotation Protocols" section in docs/user-guide/concepts.md, and a new "bead/protocol/" subsection in docs/developer-guide/architecture.md plus a paragraph framing it as a cross-cutting layer that feeds Stage 3 item construction.

Extends docs/user-guide/protocols.md with the pieces that were only in the auto-rendered API reference:
- ContextItem.attribute()
- the context-predicate registry (register / get / list_context_predicates)
- LMClient details (caching, FIFO eviction, RuntimeError on empty / backend-failure responses)
- the named EmbeddingAdapter / PerplexityAdapter Protocols
- the construction-time invariants on ResponseEncoding and AnnotationProtocol (forward-only depends_on, BINARY-must-have-2-levels, n_levels matches labels)
- the encode_response_space bridge to the modeling layer
- the RecordLike Protocol consumed by ConditionalObservationValidator
- the question_name / require_min_responses refinements on low_entropy_annotators

CHANGELOG: documents the construction-time invariants, the empty-LM-response RuntimeError, the RecordLike Protocol, and the cross-link edits to overview docs.
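The construction-time invariants on ResponseEncoding named above (n_levels matches labels, no duplicate labels, BINARY must have 2 levels) can be sketched like this. The commit applies them through @dx.model_validator; the sketch below swaps in a plain dataclass with __post_init__ so it runs without dependencies. The Encoding class, its field layout, and the ORDINAL member are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ScaleType(Enum):
    BINARY = "binary"
    ORDINAL = "ordinal"  # hypothetical member, included only for contrast


@dataclass(frozen=True)
class Encoding:
    """Hypothetical stand-in for a response encoding checked at construction."""
    scale_type: ScaleType
    n_levels: int
    labels: tuple[str, ...]

    def __post_init__(self):
        if self.n_levels != len(self.labels):
            raise ValueError(
                f"n_levels={self.n_levels} but {len(self.labels)} labels given"
            )
        if len(set(self.labels)) != len(self.labels):
            raise ValueError("labels must be unique")
        if self.scale_type is ScaleType.BINARY and self.n_levels != 2:
            raise ValueError("a BINARY encoding must have exactly 2 levels")


yes_no = Encoding(ScaleType.BINARY, 2, ("no", "yes"))
print(yes_no)
```

Checking at construction means an invalid encoding can never reach the modeling layer, so downstream code does not need to re-validate.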
Eliminates the parallel implementations identified in the prior review and makes bead.protocol a fully integrated part of the bead pipeline. Every integration replaces the duplicate it touches; no shims, no deprecation aliases.

Single canonical sites
----------------------
- bead.labels: parse_label_refs / find_label_names / replace_label_refs with one compiled regex. The regex copies in bead.protocol.drift, bead.deployment.jspsych.trials, and bead.items.span_labeling are deleted; their three callers now use the shared parser.
- bead.active_learning.models.registry: MODEL_CLASSES / CONFIG_CLASSES dicts plus model_class_for_task_type / config_class_for_task_type / model_class_for_encoding / config_class_for_encoding. The string-keyed TASK_TYPE_MODELS and TASK_TYPE_CONFIGS dicts and the dynamic _import_class helper in bead.cli.models and bead.cli.training are deleted; both CLIs now call the registry directly.
- ModelOutputCache is the single canonical caching surface. LMRealization gains required model_name and ModelOutputCache parameters; the internal FIFO dict, max_cache_size, cache, clear_cache, and cache_size are deleted.

New integration modules
-----------------------
- bead.config.protocol: AnchorSpec / TemplateVariantSpec / FamilySpec / DriftConfig / ProtocolConfig give a declarative TOML/YAML form for the entire protocol. ProtocolConfig.build(lm_client=..., cache=...) materializes a live AnnotationProtocol. Plugged into BeadConfig.protocol so the same config drives Python and CLI.
- bead.protocol.items: scale_type_to_task_type (the single ScaleType -> TaskType mapping), family_to_item_template, realization_to_item, protocol_to_item_templates, realize_protocol_to_items.
- bead.deployment.protocol_trials: protocol_to_jspsych_trials packs AnnotationProtocol + contexts -> jsPsych trial dicts end-to-end.
- bead.data_collection.records: jatos_results_to_annotation_records bridges JATOS results to AnnotationRecord, the canonical input to reliability and inter-annotator-agreement metrics.
- bead.cli.protocol: bead protocol validate / realize / items drive ProtocolConfig from the shell. Registered in cli/main.py alongside the other stage subcommands.

Tests
-----
- 178 new and migrated tests across protocol, labels, config, data_collection, CLI; full suite passes (3201/3201, excluding the pre-existing pydantic-v1 spaCy and slopit and Lightning failures that exist on main).
- Pyright and ruff clean on every new file.

Dev deps
--------
- spacy / stanza added to the dev extra so tokenizer-dependent tests can run.
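The shared-parser idea behind bead.labels (one compiled regex, three helpers built on it) can be sketched as below. The function names are borrowed from the commit message for readability only; the signatures, the LabelRef fields, and the assumption that a reference is simply a name wrapped in double square brackets are guesses, and the real parser may accept richer syntax.

```python
import re
from dataclasses import dataclass

# Assumed syntax: [[name]]. Compiled once and reused by every helper.
_LABEL_REF = re.compile(r"\[\[([^\[\]]+)\]\]")


@dataclass(frozen=True)
class LabelRef:
    """Hypothetical parsed reference: the label name and its span in the text."""
    name: str
    start: int
    end: int


def parse_label_refs(text):
    """Return every [[name]] reference in order of appearance."""
    return [
        LabelRef(m.group(1).strip(), m.start(), m.end())
        for m in _LABEL_REF.finditer(text)
    ]


def find_label_names(text):
    """Return just the referenced names, in order of appearance."""
    return [ref.name for ref in parse_label_refs(text)]


def replace_label_refs(text, render):
    """Replace each reference with render(name), a caller-supplied function."""
    return _LABEL_REF.sub(lambda m: render(m.group(1).strip()), text)


prompt = "Did [[agent]] cause [[event]]?"
fillers = {"agent": "the chef", "event": "the fire"}
print(find_label_names(prompt))
print(replace_label_refs(prompt, fillers.__getitem__))
```

Keeping the pattern in one module means a change to the reference syntax is made once, rather than in each of the drift, deployment, and item code paths.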
Pipeline-wide integration of the bead.protocol layer: shared label parser, ModelOutputCache-backed LMRealization, single task-type → model-class registry, declarative ProtocolConfig wired into BeadConfig, item / deployment / JATOS-record / CLI bridges, and the bead protocol subcommand. Every duplicate replaced; no shims.
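The single task-type → model-class registry can be pictured as a dict that maps straight to class objects, replacing string paths resolved by a dynamic import helper. The sketch below is an assumption-laden illustration: the model classes and the task-type keys are hypothetical, and only the names MODEL_CLASSES and model_class_for_task_type come from the commit message.

```python
class BinaryModel:
    """Hypothetical model class for illustration."""


class OrdinalModel:
    """Hypothetical model class for illustration."""


# One dict of real class objects; the keys here are made-up task types.
MODEL_CLASSES = {
    "binary": BinaryModel,
    "ordinal_scale": OrdinalModel,
}


def model_class_for_task_type(task_type):
    """Look up the model class, failing loudly with the known task types."""
    try:
        return MODEL_CLASSES[task_type]
    except KeyError:
        known = ", ".join(sorted(MODEL_CLASSES))
        raise ValueError(
            f"unknown task type {task_type!r}; known task types: {known}"
        ) from None


print(model_class_for_task_type("binary").__name__)
```

Mapping to classes rather than dotted-path strings lets a type checker and a linter see every entry, and a typo fails at import time instead of at the first CLI call.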
Description
Adds the bead.protocol package, a type-theoretic stack for defining annotation protocols, and wires it into every stage of the bead pipeline. Anchors define question types, contexts are dependent indices, realization strategies (template / contextual / LM) are computational content, and drift guards type-check realized prompts. QuestionFamily and AnnotationProtocol compose these into a sequenced, conditional pipeline.

The release also eliminates several pre-existing redundancies the integration touched: three independent [[label]] regex parsers, two parallel LM caches, and string-keyed task-type → model-class dicts replicated across the CLI.

Motivation
Annotation-protocol design previously had no first-class home in bead. Anchors / drift / realization / response-encoding existed as concepts but were not expressible declaratively or composable into experimental items, deployable trials, training data, or analyses. This release makes them a first-class part of the pipeline.
Fixes #
Type of Change
What's in the release
New top-level package: bead.protocol
- SemanticAnchor / ResponseSpace / SemanticPoles (type-level question spec)
- ProtocolContext / ContextItem + named-predicate registry
- RealizationStrategy Protocol + TemplateRealization, ContextualTemplateRealization, LMRealization
- DriftScore / DriftGuard + StructuralDriftValidator, EmbeddingDriftValidator, PerplexityDriftValidator
- QuestionFamily, AnnotationProtocol (with depends_on graph validation)
- ScaleType / ResponseEncoding (likelihood-agnostic) + encode_response_space
- DiagnosticLevel / DiagnosticRecord / DatasetReport / ConditionalObservationValidator

Companion module: bead.evaluation.reliability
- AnnotationRecord, AnnotatorReliability, annotator_reliability, low_entropy_annotators

Pipeline integration
- bead.labels: single canonical [[label]] parser used by drift, deployment, items
- bead.config.protocol.ProtocolConfig: declarative TOML/YAML form, plugged into BeadConfig.protocol
- bead.protocol.items: family_to_item_template / realization_to_item / realize_protocol_to_items + canonical scale_type_to_task_type
- bead.active_learning.models.registry: single MODEL_CLASSES / CONFIG_CLASSES dicts and model_class_for_encoding
- bead.deployment.protocol_trials.protocol_to_jspsych_trials: protocol → jsPsych trials end-to-end
- bead.data_collection.jatos_results_to_annotation_records: JATOS results → AnnotationRecord
- bead.cli.protocol: bead protocol validate / realize / items

Breaking changes
- LMRealization now requires model_name: str and accepts cache: ModelOutputCache | None. The internal FIFO _cache, max_cache_size, cache: bool, clear_cache(), and cache_size are removed; the bead-wide ModelOutputCache is the single canonical caching surface.
- bead.cli.models no longer exposes TASK_TYPE_MODELS / TASK_TYPE_CONFIGS / _import_class. Callers use bead.active_learning.models.registry directly. bead.cli.training follows the same pattern.
- bead.deployment.jspsych.trials._parse_prompt_references, _SpanReference, _SPAN_REF_PATTERN, and the duplicate _SPAN_REF_PATTERN in bead.items.span_labeling are deleted; callers use bead.labels.parse_label_refs / LabelRef.

No backward-compat shims. Every call site is migrated in this PR.
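The LMRealization breaking change (a required model name, an optional injected cache instead of a private FIFO dict, and a RuntimeError on empty output) follows a pattern that can be sketched as below. None of this is bead's code: DictCache stands in for ModelOutputCache, whose real interface is not shown in this PR text, and the class name, the callable lm_client, and the get / put methods are assumptions.

```python
class DictCache:
    """Hypothetical stand-in for a shared model-output cache."""

    def __init__(self):
        self._store = {}

    def get(self, model_name, prompt):
        return self._store.get((model_name, prompt))

    def put(self, model_name, prompt, text):
        self._store[(model_name, prompt)] = text


class CachedLMRealization:
    """Sketch of a realization that delegates caching to an injected cache."""

    def __init__(self, lm_client, model_name, cache=None):
        self.lm_client = lm_client    # assumed: a callable prompt -> str
        self.model_name = model_name  # part of the cache key
        self.cache = cache            # None disables caching

    def realize(self, prompt):
        if self.cache is not None:
            cached = self.cache.get(self.model_name, prompt)
            if cached is not None:
                return cached
        # Strip whitespace and surrounding quotes so '""' counts as empty.
        text = self.lm_client(prompt).strip().strip("\"'").strip()
        if not text:
            raise RuntimeError(f"{self.model_name} returned an empty response")
        if self.cache is not None:
            self.cache.put(self.model_name, prompt, text)
        return text


shared = DictCache()
first = CachedLMRealization(lambda p: '"How likely is it?"', "demo-model", shared)
second = CachedLMRealization(lambda p: "never called", "demo-model", shared)
print(first.realize("Rephrase the question."))
print(second.realize("Rephrase the question."))  # served from the shared cache
```

Keying on the model name as well as the prompt is what lets one cache serve several models safely, which is a plausible reason for making model_name required once the private cache is gone.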
Checklist
- uv run ruff check . and uv run ruff format .
- uv run pyright with no errors
- uv run pytest tests/

Testing
- tests/protocol/, tests/test_labels.py, tests/config/test_protocol_config.py, tests/data_collection/test_records.py, tests/cli/test_protocol.py, plus migrated mocks in tests/cli/test_models.py and tests/cli/test_training.py.
- … main: tests/active_learning/trainers/test_lightning.py, tests/active_learning/models/* save-load tests, tests/items/test_span_labeling.py::test_default_config (spaCy/pydantic-v1 conflict), tests/behavioral/* (slopit not installed)).
- tests/protocol/test_end_to_end.py exercises anchor → context → realization → drift → reliability → diagnostics.
- … (test_items_bridge.py), deployment (test_deployment_bridge.py), JATOS records (test_records.py), CLI (test_protocol.py), config (test_protocol_config.py).

Documentation
- docs/api/protocol.md, docs/api/labels.md, docs/api/evaluation.md, docs/api/config.md, docs/api/active_learning.md, docs/api/deployment.md, docs/api/data_collection.md: auto-rendered API reference for every new module.
- docs/user-guide/protocols.md: narrative walkthrough including the configuration-driven workflow, the CLI, the item / deployment / JATOS bridges, and active-learning model selection.
- docs/user-guide/concepts.md, docs/user-guide/index.md, docs/index.md, docs/developer-guide/architecture.md, README.md: cross-linked.
- CHANGELOG.md: [0.4.0] - 2026-05-07 entry.