✨ feat(workflow): reuse accepted goals as versioned, schedulable workflows (#594) #636
…ribution, dispatch_kind (#594) Phase 1 of workflow-as-entity: the definition entity (agent_workflow, immutable version rows, owner XOR check, UNIQUE NULLS NOT DISTINCT on owner/name/version), the instantiation ledger (agent_workflow_run with (workflow_id, idempotency_key) uniqueness and claimed/materializing/done/failed/skipped states), run attribution columns on agent_goal, sched_job.dispatch_kind, and sched_job_run.root_goal_id, plus sqlc queries (ClaimWorkflowRun uses xmax=0 to distinguish fresh claims in one round trip).
…resume instantiation (#594) Phase 2 of workflow-as-entity: FrozenPlan DTO (strict decode, per-layer ValidateDecomposition reuse, FullyFrozen, content hash), input signature with allowlisted {{inputs.*}} substitution (title/intent/judgment prompt/rubric only — deterministic command excluded; unresolved or unknown placeholders are hard errors), SaveGoalAsWorkflow (accepted composite → recursive structural snapshot), and Instantiate with the claim-resume protocol: ClaimWorkflowRun dedupes by idempotency key, SetWorkflowRunRoot is a CAS (root_goal_id IS NULL) so a lost race deletes the loser's draft root and continues the idempotent walk against the winner's tree; nil-plan composites stay draft+unplanned for the dispatcher's planner path. Goal-side surface is two narrow hooks (MaterializeFrozenLayer, ActivateFrozenComposite) plus workflow attribution on CreateInput.
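The allowlisted substitution described in this commit can be sketched as follows. This is a minimal illustration, not the PR's code: `Substitute`, `ApplyInputs`, `Node`, and `ErrInvalidInput` are hypothetical names, and the strict placeholder syntax (no inner whitespace) is an assumption. It shows the two properties the message states: substitution touches only title, intent, judgment prompt, and rubric, and any unknown or unresolved placeholder is a hard error.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// ErrInvalidInput is a hypothetical sentinel for caller-fixable input errors.
var ErrInvalidInput = errors.New("invalid workflow input")

// placeholderRe matches a well-formed {{inputs.<name>}} placeholder.
var placeholderRe = regexp.MustCompile(`\{\{inputs\.([A-Za-z0-9_-]+)\}\}`)

// Substitute replaces every placeholder in text with its value from inputs.
func Substitute(text string, inputs map[string]string) (string, error) {
	// Anything that opens like a placeholder but is not well formed would
	// survive substitution, so it is rejected up front.
	if strings.Count(text, "{{inputs.") != len(placeholderRe.FindAllString(text, -1)) {
		return "", fmt.Errorf("%w: unresolved placeholder in %q", ErrInvalidInput, text)
	}
	var firstErr error
	out := placeholderRe.ReplaceAllStringFunc(text, func(m string) string {
		name := placeholderRe.FindStringSubmatch(m)[1]
		value, ok := inputs[name]
		if !ok && firstErr == nil {
			firstErr = fmt.Errorf("%w: unknown input %q", ErrInvalidInput, name)
		}
		return value
	})
	if firstErr != nil {
		return "", firstErr
	}
	return out, nil
}

// Node is a hypothetical frozen-plan node.
type Node struct {
	Title, Intent, JudgmentPrompt, Rubric string
	Command                               string // deterministic command: never substituted
}

// ApplyInputs substitutes only the allowlisted text fields of a node copy.
func ApplyInputs(n Node, inputs map[string]string) (Node, error) {
	for _, field := range []*string{&n.Title, &n.Intent, &n.JudgmentPrompt, &n.Rubric} {
		out, err := Substitute(*field, inputs)
		if err != nil {
			return Node{}, err
		}
		*field = out
	}
	return n, nil
}

func main() {
	n, err := ApplyInputs(Node{
		Title:   "Report for {{inputs.city}}",
		Command: "echo {{inputs.city}}",
	}, map[string]string{"city": "Oslo"})
	fmt.Println(n.Title, "|", n.Command, "|", err)
}
```

Leaving `Command` out of the field list, rather than filtering it at call sites, keeps the allowlist closed by construction.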
Phase 3 of workflow-as-entity: OpenAPI-first endpoints (save-as-workflow, list/get/delete workflows, list runs, instantiate with idempotency_key — replays return 200, ErrRunAlreadyFailed and delete-with-runs map to 409), handlers wired like goals (scoped boundary, cursor pagination, UTC timestamps), a workflows credential scope, and the stella workflow save/list/show/run CLI.
…-healing overlap policy (#594) Phase 4 of workflow-as-entity: scheduled jobs learn a second dispatch kind. Workflow jobs carry {workflow_id, inputs} in payload, reject a chat message, and enforce the fully_frozen gate at creation (allow_replan opts a partially frozen workflow in). Dispatch resolves the sched_job_run id from a typed context, uses it as the instantiation idempotency key, and records root_goal_id back onto the run row. The cross-run overlap policy is status-aware and self-healing: a previous run that is done with a non-terminal tree skips this fire; a stalled claimed/materializing run is resumed via its original idempotency key instead of minting a second tree; failed/skipped runs never block the schedule. The scheduler stays decoupled from internal/workflow behind a WorkflowRunner interface wired in the stellad composition root. Workflow deletion now also refuses while enabled workflow jobs reference the id. CLI: stella scheduler add --workflow/--input/--allow-replan.
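The status-aware overlap policy above reduces to a small decision function. The sketch below is an assumption-laden illustration: `Decide`, `PrevRun`, and the `Action` values are hypothetical names, and only the run states named in this PR are taken from the source.

```go
package main

import "fmt"

// RunStatus mirrors the ledger states named in the PR.
type RunStatus string

const (
	StatusClaimed       RunStatus = "claimed"
	StatusMaterializing RunStatus = "materializing"
	StatusDone          RunStatus = "done"
	StatusFailed        RunStatus = "failed"
	StatusSkipped       RunStatus = "skipped"
)

// Action is what the scheduler does for the current fire (hypothetical type).
type Action string

const (
	Proceed        Action = "proceed"         // instantiate a fresh run
	SkipFire       Action = "skip"            // previous tree is still active
	ResumePrevious Action = "resume-previous" // re-drive the stalled run with its own key
)

// PrevRun describes the most recent run of the same job, if any.
type PrevRun struct {
	Status         RunStatus
	TreeTerminal   bool   // only meaningful when Status is done
	IdempotencyKey string // original key, reused on resume
}

// Decide applies the overlap policy to the previous run.
func Decide(prev *PrevRun) Action {
	if prev == nil {
		return Proceed
	}
	switch prev.Status {
	case StatusClaimed, StatusMaterializing:
		// Stalled instantiation: resume it instead of minting a second tree.
		return ResumePrevious
	case StatusDone:
		if !prev.TreeTerminal {
			return SkipFire
		}
		return Proceed
	default:
		// failed and skipped runs never block the schedule.
		return Proceed
	}
}

func main() {
	fmt.Println(Decide(&PrevRun{Status: StatusMaterializing, IdempotencyKey: "run-41"}))
	fmt.Println(Decide(&PrevRun{Status: StatusDone, TreeTerminal: false}))
	fmt.Println(Decide(&PrevRun{Status: StatusFailed}))
}
```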
…ge, run history (#594) Also fixes two instantiation bugs found in browser e2e:
- run root now inherits the workflow's own agent instead of the caller's scope (PAT callers have no agent scope → FK violation)
- run root intent now goes through input substitution like child nodes ({{inputs.*}} placeholders were landing raw on the root goal)
📊 Coverage Report
Total coverage: 49.8% (generated files excluded)
… web UI (#594) Save as workflow from an accepted composite goal's page (name slug prefill, optional inputs editor), plus a read-only workflow detail page: metadata, inputs table, frozen-plan tree, run history, and run-with-inputs / delete actions. The run dialog holds one idempotency key per open so a retry after a lost response resumes the same run instead of minting a second tree. Workflow routes move to a directory (goals-style siblings) — the flat workflows.$workflowId file nested under the list route, which has no Outlet, so the detail page never rendered.
…#594) Address adversarial review findings on the workflow-as-entity PR:
- Derive the run root goal id deterministically from the run id (sha256-based) and create it via idempotent insert, closing the crash window between root creation and run binding that could mint a second root. Remove the CAS-loser draft-root delete: with deterministic ids the loser holds the same row as the winner.
- Exclude workflow roots (workflow_id IS NOT NULL) from the autonomous decomposition dispatcher; they are materialized only by workflow replay.
- Instantiate consults the existing run by idempotency key before resolving caller inputs, and returns a created flag; the handler drops its racy pre-check and uses the flag for 201/200.
- POST /api/goals/{id}/save-as-workflow now requires workflows:write instead of goals:write.
- Scheduler workflow-job validation and not-found errors map to 400/404 instead of 500.
- Retry version allocation on unique violation (3 attempts) instead of failing on concurrent saves.
- Validate frozen-plan depth against the stored convergence policy, not the package default.
- Web: keyed remount for the workflow detail route; run numbering uses the new WorkflowRunList.total instead of page length; goal lineage badge falls back to name-only when the run is outside the fetched page; run-dialog idempotency key survives dialog reopen and regenerates on input change or success.
- Docs/skill: overlap wording now states failed instantiation does not block the next tick and stalled instantiation resumes.
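The deterministic-root idea in the first bullet can be illustrated with a short sketch. Everything here is an assumption except the technique itself (a sha256-derived id plus an idempotent insert): the `workflow-run-root:` namespace prefix, the UUID-shaped formatting, and the in-memory `GoalStore` are hypothetical stand-ins for the real schema.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// RootGoalID derives a stable, UUID-shaped id from a run id. The namespace
// prefix and the version/variant bit layout are assumptions of this sketch.
func RootGoalID(runID string) string {
	sum := sha256.Sum256([]byte("workflow-run-root:" + runID))
	b := sum[:16]
	b[6] = (b[6] & 0x0f) | 0x80 // version nibble 8 (custom UUID)
	b[8] = (b[8] & 0x3f) | 0x80 // RFC variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

// GoalStore is an in-memory stand-in for the goal table.
type GoalStore struct {
	mu    sync.Mutex
	goals map[string]bool
}

func NewGoalStore() *GoalStore { return &GoalStore{goals: map[string]bool{}} }

// InsertIfAbsent mimics an idempotent insert: it reports whether this call
// created the row, and never creates a second row for the same id.
func (s *GoalStore) InsertIfAbsent(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.goals[id] {
		return false
	}
	s.goals[id] = true
	return true
}

func main() {
	store := NewGoalStore()
	// Two attempts for the same run (a crash-resume, or two racing callers)
	// compute the same id, so the second insert is a no-op.
	first := store.InsertIfAbsent(RootGoalID("run-1"))
	second := store.InsertIfAbsent(RootGoalID("run-1"))
	fmt.Println(RootGoalID("run-1"), first, second)
}
```

Because both racers address the same row, there is no orphan for a loser to clean up, which is why the commit can drop the draft-root delete.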
Adversarial review round (3 parallel reviews, cross-verified)
All findings fixed in 812859b. Highlights:
Blockers
Should-fixes
New tests: crash-resume binds the pre-created root, root-race convergence, dispatcher exclusion, done-replay idempotency, save-as-workflow scope.
Gates:
…idate inputs at save (#594) Holistic self-review findings:
- Nested frozen composites were dispatcher bait: between walk transactions (or across a crash-resume gap) a composite child carrying a frozen sub-plan sat draft/unplanned without workflow_id, so scanAndDecompose would claim it and LLM-replan a node the workflow promised to replay deterministically. MaterializeFrozenLayer now takes a FrozenStamp and marks those children with the workflow identity in the same tx that creates them; ListDecomposableComposites already filters stamped goals. Children without a frozen sub-plan stay unstamped and planner-eligible (partial freeze).
- SubmitDecomposition re-reads the composite under the row lock and fails closed (ErrInvalidTransition) when planned_at is already set, so a late decomposition submit can never overwrite an installed frozen plan and create a second content-keyed children set.
- Input specs are validated at save time: names must match [A-Za-z0-9_-]+ and be unique, and every {{inputs.name}} referenced by the frozen plan or root intent must be declared. Previously a bad name saved fine and failed on every instantiation.
- Caller-fixable input errors (bad spec, unknown/missing input, unresolved placeholder) now wrap ErrInvalidWorkflowInput and map to 400; save on a non-accepted goal maps to 409 instead of 500; version conflict maps to 409. UpdateSchedulerJob maps workflow validation/not-found errors to 400/404 (UpdateUserJob revalidates dispatch on every update, including enable toggles).
- Web: the lineage badge stays a run-root affordance now that nested frozen children also carry workflow_id.
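A minimal sketch of the save-time input-spec check described above, assuming a hypothetical `ValidateInputSpecs` helper that receives the declared names and every text that may carry placeholders. Only the name pattern and the `ErrInvalidWorkflowInput` sentinel come from the commit message; the function shape is illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// ErrInvalidWorkflowInput is the sentinel named in the commit message; the
// handler is said to map it to HTTP 400.
var ErrInvalidWorkflowInput = errors.New("invalid workflow input")

var (
	nameRe = regexp.MustCompile(`^[A-Za-z0-9_-]+$`)
	// refRe captures whatever sits inside {{inputs.…}}, valid or not, so that
	// badly named references are also reported as undeclared.
	refRe = regexp.MustCompile(`\{\{inputs\.([^{}]*)\}\}`)
)

// ValidateInputSpecs checks declared names, then every placeholder reference.
func ValidateInputSpecs(names []string, texts ...string) error {
	declared := make(map[string]bool, len(names))
	for _, name := range names {
		if !nameRe.MatchString(name) {
			return fmt.Errorf("%w: bad input name %q", ErrInvalidWorkflowInput, name)
		}
		if declared[name] {
			return fmt.Errorf("%w: duplicate input name %q", ErrInvalidWorkflowInput, name)
		}
		declared[name] = true
	}
	for _, text := range texts {
		for _, m := range refRe.FindAllStringSubmatch(text, -1) {
			if !declared[m[1]] {
				return fmt.Errorf("%w: undeclared input %q", ErrInvalidWorkflowInput, m[1])
			}
		}
	}
	return nil
}

func main() {
	fmt.Println(ValidateInputSpecs([]string{"city"}, "Report for {{inputs.city}}"))
	fmt.Println(ValidateInputSpecs([]string{"city"}, "Report for {{inputs.region}}"))
}
```

Rejecting these at save time turns a failure on every instantiation into a single 400 for the author of the workflow.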
Holistic self-review round → fixed in e44838a
A full multi-lens pass over the PR found one design gap the previous review rounds missed, plus two smaller issues:
F1: nested frozen composites could be hijacked by the decomposition dispatcher. The walk materializes layer by layer in separate transactions. After a parent layer committed, a composite child carrying a frozen sub-plan sat
F2: input specs were unvalidated at save. A spec name outside
F3: error taxonomy. Caller-fixable input errors wrap
New tests: mid-walk dispatcher-exclusion (stamped frozen child excluded, dynamic sibling still eligible), late-decomposition-submit fails closed without clobbering frozen children, stamp assertions on the instantiate path, save-time spec/placeholder validation.
Gates:
Live-fire scheduler e2e: PASSED
Ran the never-exercised path end-to-end on the dev stack (real binary, real PostgreSQL, real River tick):
One finding (pre-existing interaction, not a regression): one-shot
All seeded/instantiated rows cleaned up after the test (trees cancelled then deleted, workflow + job + forged session removed).
Goals and workflows read as two disconnected systems: clicking a run
threw the user into the Goals tab with no way back, the runs table
reported instantiation status ("Done") while the goal tree was
cancelled, and scheduled runs flooded the Goals overview.
- Runs table: derived status from the root goal (ListWorkflowRuns joins
agent_goal; WorkflowRun gains root_lifecycle/block_reason/done_reason),
run # links, inputs summary, in-page pagination instead of the
goals/all?workflow_id jump
- Goal page: run roots get a back link and lineage badge pointing to the
workflow detail page instead of the goals list
- Goals overview: terminal workflow run roots collapse to one row per
workflow with a run-count badge, linking to the workflow
- Schedule button on the workflow detail page creates a workflow
scheduler job (name, cron/every/at picker, inputs, allow-replan for
partially frozen workflows)
- Dialog fix: Form in Run/Save-as-workflow dialogs now uses
display:contents so the footer stays inside the popup card; input
defaults render an em dash instead of "No data"; frozen-plan edges
render as readable sentences with node titles
Refs #594
UI de-fragmentation round (a4a150e)
User-reported problem: goals and workflows read as two disconnected systems: workflow runs lived over in the Goals area, and the two surfaces disagreed about what a run's status was. Findings from a full UI walkthrough (all verified live against seeded data):
Fixes:
Gates: format, build, tests,
…ial tasks scope
DefaultSandboxScopes never got workflows:* when the workflows resource
landed, so every in-sandbox call to /api/workflows and
/api/goals/{id}/save-as-workflow was denied by MatchScope — the
"save this goal as a workflow" flow the skill teaches could not work
from inside a sandbox.
While in the catalog, drop the "tasks" resource: no /api/tasks route
exists, so the scope was grantable but useless. Validation only runs at
PAT-creation / OAuth-authorization time, so stored tokens carrying
tasks:* keep working (the scope simply matches nothing); an OAuth
client registered with tasks:* would fail new authorizations, but no
such client can exist meaningfully since the scope never had routes.
One noun per concept everywhere users read: Goal/目标 for tracked work,
Workflow/工作流 for a frozen reusable definition, Run/运行 for one
execution, Schedule/定时任务 for the time trigger. "Task" survives only
as internal wire values (session kind, stella task CLI alias) and the
docs slug.
- i18n: delete 64 orphaned keys (dead automations.* surface, unused
scheduler task toasts, hub task-mixing copy); reword live keys
(hub.kindSchedule, hub.deleteConfirm, subtasks -> child goals); drop
the "Automations" kicker over the Goals and Workflows pages; the
session inspector sr-only title now says Workspace, matching its
toggle.
- API spec: Goal description no longer calls a child "a former task";
the session kind enum documents "task" as the goal-worker wire value.
- Inbox: failed scheduler runs deep-link to the schedule detail page
instead of the dead /agents/{id}/tasks path.
- CLI: root help says "manage goals, schedules".
- Docs: core-concepts merges the orphaned Task definition into Goal;
scheduling docs point at the Goals tab (the tab's actual name);
EN/ZH index and ZH titles updated.
- Skill/system prompt: "Tasks tab" -> "Goals tab";
references/tasks.md -> references/goals.md.
Concept-unification round (62dde14, 6fd9f64), following the naming audit across UI / API / CLI / docs / skill: External vocabulary is now one noun per concept: Goal (tracked work), Workflow (frozen reusable definition), Run (one execution), Schedule (time trigger). "Task" survives only as internal wire values (session kind,
…al model V's call: Goal and Workflow as sibling top-level tabs is one concept too many. The merge is navigational, not ontological: a workflow is still a frozen definition and a goal still an execution, but workflow surfaces are now facets of the single Goals tab:
- The Workflows facet tab is gone; the Goals tab stays active on /workflows/* routes, and the old list URL redirects to the overview.
- The Goals overview gains a "Repeatable (workflows)" section (name, version, run count, partly-frozen badge) that renders nothing until the first workflow exists, so a user who never saved one never meets the concept.
- The workflow detail page is unchanged and reachable from the section, run lineage badges, and schedule rows; deleting a workflow now lands on the goals overview instead of the removed list page.
The nouns were cleanly separated but the choice was untaught: for a repeat request the agent had three roads (scheduler chat job, new goal, workflow run) and no rule. Wrong picks don't error — they replan drift or mint duplicate goals. Rule now in SKILL.md and the system prompt: same plan with only text inputs changing -> workflow schedule; re-think each time -> chat job; "run it again" -> workflow run, never a duplicate goal.
Navigational merge (1d44e41, 0eb9b1c): Goal and Workflow as sibling top-level tabs was one concept too many (V). The merge is navigational, not ontological: the Workflows facet tab is gone; the Goals overview gains a "Repeatable (workflows)" section that renders nothing until the first workflow exists, so a user who never saved one never meets the concept. The workflow detail page is unchanged (reached via the section, run lineage badges, and schedule rows); the old list URL redirects to the goals overview. Verified live: tab bar, section rendering, detail page with Goals tab kept active. Also taught the agent the goal-vs-workflow decision rule in SKILL.md + system prompt (same plan with only inputs changing → workflow schedule; re-think each time → chat job; "run it again" →
Deleting a one-time job after it fires cascaded away its sched_job_run rows (ON DELETE CASCADE), wiping the freshly written run record — and with it the root_goal_id attribution for one-shot workflow jobs — and breaking "run now" on the fired job. Retire the job by marking it disabled instead: run history, workflow run attribution, and run-now survive. A disabled past-timestamp job can never re-fire — startup only arms enabled jobs, the River worker guards on Enabled at fire time, and re-enabling is rejected while the timestamp is in the past.
One-shot CASCADE attribution fix (1caab87): fired one-time (
Why disable is safe against re-arming: startup only schedules
No schema change needed: the CASCADE FK now only fires on explicit user deletion, where dropping run history with the job is the intended semantics (
Retired one-time jobs sit in the list as disabled rows with a past timestamp, so toggling one back on is now a routine user action; it hit UpdateUserJob's reschedule path and surfaced as a 500. Export ErrOneTimeJobPast and map it to a 400 telling the user to set a new time.
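The guard and its status mapping can be sketched like this. `ErrOneTimeJobPast` is the sentinel named in the commit; the `Job` struct, `Enable`, `StatusFor`, and the `at` helper are hypothetical and stand in for the real scheduler and handler code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// ErrOneTimeJobPast is the exported sentinel named in the commit message.
var ErrOneTimeJobPast = errors.New("one-time job timestamp is in the past")

// Job is a hypothetical one-time scheduler job.
type Job struct {
	RunAt   time.Time
	Enabled bool
}

// Enable re-arms a job only if its fire time is still in the future.
func Enable(j *Job, now time.Time) error {
	if !j.RunAt.After(now) {
		return fmt.Errorf("enable job: %w", ErrOneTimeJobPast)
	}
	j.Enabled = true
	return nil
}

// StatusFor maps the error to an HTTP status; errors.Is sees through wrapping.
func StatusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrOneTimeJobPast):
		return http.StatusBadRequest // caller must set a new time
	default:
		return http.StatusInternalServerError
	}
}

// at builds a UTC timestamp from Unix seconds (helper for the example).
func at(sec int64) time.Time { return time.Unix(sec, 0).UTC() }

func main() {
	retired := &Job{RunAt: at(100)}
	fmt.Println(StatusFor(Enable(retired, at(200))), retired.Enabled)
}
```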
TestExecutorWaitsForOneShotSessionClose raced callbackCalled against returned in one select after releasing the close: both channels can be ready by the time the test goroutine is scheduled, and a random pick of returned misreported correct ordering as "Execute returned before sandbox callback" (flaked on CI under -race). The callback runs synchronously on the chat goroutine before the event channel closes, so it happens-before Execute returns — wait for the return, then assert the callback fired with a non-blocking check.
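The fixed assertion pattern can be reduced to a few lines. In this sketch `runWithCallback` is a hypothetical stand-in for the executor: it invokes the callback synchronously and only then signals that it returned, which is the ordering the commit message describes.

```go
package main

import "fmt"

// runWithCallback stands in for Execute: the callback runs on the worker
// goroutine before the "returned" channel is closed.
func runWithCallback(callback func()) <-chan struct{} {
	returned := make(chan struct{})
	go func() {
		callback()
		close(returned)
	}()
	return returned
}

// CallbackFiredBeforeReturn waits for the return first, then checks the
// callback with a non-blocking receive. Selecting on both channels at once
// would let the runtime pick either ready case at random.
func CallbackFiredBeforeReturn() bool {
	callbackCalled := make(chan struct{})
	returned := runWithCallback(func() { close(callbackCalled) })

	<-returned // everything the worker did before close(returned) is now visible
	select {
	case <-callbackCalled:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(CallbackFiredBeforeReturn())
}
```

The non-blocking check is deterministic here because closing `callbackCalled` happens before closing `returned` on the same goroutine, and the receive from `returned` is ordered after that close.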
What
Implements workflow-as-entity (#594): an accepted composite goal can be saved as a reusable, versioned workflow and re-run — manually or on a schedule — as a fresh goal tree with the same frozen plan.
- agent_workflow (immutable version rows: name, version, inputs, payload_format='frozen/v0', FrozenPlan payload, fully_frozen), agent_workflow_run (instantiation ledger, UNIQUE(workflow_id, idempotency_key)), workflow_id/workflow_version attribution columns on agent_goal, sched_job.dispatch_kind (chat|workflow), sched_job_run.root_goal_id.
- FrozenPlan snapshot/validation, text-level inputs ({{inputs.name}} substitution over a closed allowlist: title / intent / judgment prompt / rubric, never deterministic commands; unresolved placeholder = hard error), SaveGoalAsWorkflow, and claim→materialize→done Instantiate with full crash-resume.
- workflows credential scope), stella workflow save|list|show|run.
- stella scheduler add --workflow <id> --cron ...; the scheduler owns time, the workflow owns structure. Kind-aware job validation, fully_frozen gate (partially frozen requires explicit allow_replan), and a status-aware overlap policy.
Why
Users ask "keep this goal and run it every morning." That request decomposes into definition / trigger / instance, none of which existed: goals were one-shot trees, and re-running meant re-planning from scratch with planner drift. This PR adds the missing entity (definition), reuses the scheduler as the trigger, and makes every run a fresh, attributable goal tree (instance). Done goals are never reopened.
How
Key invariants and mechanics:
- fully_frozen (every composite has a saved sub-plan) is required for scheduled runs by default, so the same structure replays without planner drift. A nil sub-plan composite stays draft with planned_at IS NULL so the planner picks it up at runtime.
- Instantiate claims a run row via INSERT ON CONFLICT (+ xmax = 0 to detect the claim), sets the root via CAS (WHERE root_goal_id IS NULL); a concurrent loser deletes its own draft orphan and continues the walk on the winner's root. The layer walk is idempotent (planned_at fence + deterministic child IDs), and resume re-substitutes from the inputs stored on the run row.
- idempotency_key = sched_job_run.id.
- WorkflowRunner interface; the adapter is wired in the composition root, so internal/scheduler does not import internal/workflow.
Three bugs were caught during adversarial review + live e2e and fixed in-branch: a double-root race in Instantiate, a permanent-skip deadlock in the scheduler after a mid-claim crash, and the run root not inheriting the workflow's agent / not substituting inputs into the root intent.
Verified:
mise run format && mise run build && mise run test green; live e2e (CLI save/run, idempotent replay, scope enforcement 403, scheduler validation) plus browser verification of the list page, lineage badge, and filtered run history.
Refs
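The decoupling named under How (a WorkflowRunner interface with the adapter wired in the composition root) might look roughly like the single-file sketch below. Only the `WorkflowRunner` name and the use of the scheduler run id as the idempotency key come from the PR; the method signature, `Scheduler`, `WorkflowService`, `runnerAdapter`, and `FireTwice` are assumptions made for illustration, and the three sections stand in for separate packages.

```go
package main

import (
	"context"
	"fmt"
)

// --- scheduler side: knows only the interface ---

// WorkflowRunner is the seam; this method signature is a guess.
type WorkflowRunner interface {
	RunWorkflow(ctx context.Context, workflowID string, inputs map[string]string, idempotencyKey string) (rootGoalID string, err error)
}

type Scheduler struct{ Runner WorkflowRunner }

// Fire uses the scheduler run id as the instantiation idempotency key.
func (s *Scheduler) Fire(ctx context.Context, jobRunID, workflowID string, inputs map[string]string) (string, error) {
	return s.Runner.RunWorkflow(ctx, workflowID, inputs, jobRunID)
}

// --- workflow side: knows nothing about the scheduler ---

type WorkflowService struct{ roots map[string]string }

// Instantiate returns the existing root for a replayed key, else mints one.
func (w *WorkflowService) Instantiate(_ context.Context, workflowID, key string, _ map[string]string) (string, error) {
	ledgerKey := workflowID + "/" + key
	if root, ok := w.roots[ledgerKey]; ok {
		return root, nil
	}
	root := fmt.Sprintf("goal-%d", len(w.roots)+1)
	w.roots[ledgerKey] = root
	return root, nil
}

// --- composition root: the only place that sees both sides ---

type runnerAdapter struct{ svc *WorkflowService }

func (a runnerAdapter) RunWorkflow(ctx context.Context, workflowID string, inputs map[string]string, key string) (string, error) {
	return a.svc.Instantiate(ctx, workflowID, key, inputs)
}

// FireTwice fires the same scheduler run twice and returns both root ids.
func FireTwice() (string, string) {
	sched := &Scheduler{Runner: runnerAdapter{svc: &WorkflowService{roots: map[string]string{}}}}
	ctx := context.Background()
	first, _ := sched.Fire(ctx, "job-run-7", "wf-1", nil)
	second, _ := sched.Fire(ctx, "job-run-7", "wf-1", nil)
	return first, second
}

func main() {
	first, second := FireTwice()
	fmt.Println(first, second)
}
```

Firing the same scheduler run twice resolves to the same root, which is the property a re-fired or resumed job depends on.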