fix: correct workflow token counters #718

Open
ArshAnan wants to merge 4 commits into pensarai:canary from ArshAnan:ioTokenCounterNotPopulate

@ArshAnan ArshAnan commented May 6, 2026

What does this PR do?

Fixes #671.

This PR fixes token counter behavior for /pentest workflow runs. The bug had three related symptoms:

  1. When /pentest launched workflow-managed agents, the footer token counters could stay at zero because usage from those workflow agents was not reliably flowing back into the same counter path used by normal operator messages.
  2. Token totals could be inaccurate because the UI always calculated total tokens as inputTokens + outputTokens, even when the AI SDK/provider reported an explicit totalTokens value.
  3. Token counts could carry over or be overwritten incorrectly across sessions because the operator context keeps token usage in shared UI state, while workflow metrics are also persisted separately to session files.
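
To make the second symptom concrete, here is a minimal sketch of the difference between a derived total and a provider-reported total. The `StepUsage` and `TokenTotals` shapes and the `accumulateStep` helper are hypothetical names used only for illustration; they are not the actual signatures in this branch.

```typescript
// Hypothetical shapes for illustration; the real types in the repo may differ.
interface StepUsage {
  inputTokens?: number;
  outputTokens?: number;
  totalTokens?: number; // explicit total reported by the SDK/provider, if any
}

interface TokenTotals {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Adds one step's usage to the running totals.
// Prefers the reported total and falls back to input + output when it is absent.
function accumulateStep(prev: TokenTotals, step: StepUsage): TokenTotals {
  const input = step.inputTokens ?? 0;
  const output = step.outputTokens ?? 0;
  const total = step.totalTokens ?? (input + output);
  return {
    inputTokens: prev.inputTokens + input,
    outputTokens: prev.outputTokens + output,
    totalTokens: prev.totalTokens + total,
  };
}

const zero: TokenTotals = { inputTokens: 0, outputTokens: 0, totalTokens: 0 };

// No explicit total: the fallback derives 100 + 20.
const derived = accumulateStep(zero, { inputTokens: 100, outputTokens: 20 });

// Explicit total of 150: the reported value wins over the derived 120.
const reported = accumulateStep(zero, {
  inputTokens: 100,
  outputTokens: 20,
  totalTokens: 150,
});
```

A provider may report a total that differs from input + output, which is why always deriving the total could show an inaccurate number.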

This branch fixes those paths in a few places:

  • src/core/agents/offSecAgent/tools/types.ts

    • Adds onStepFinish and onCacheMetrics to the tool context so tools that spawn internal agents can forward usage and cache metrics back to the parent operator run.
  • src/core/agents/offSecAgent/offensiveSecurityAgent.ts

    • Passes the parent onStepFinish and onCacheMetrics callbacks into tool creation.
    • This lets tools such as run_pentest_workflow report token usage from nested workflow agents instead of isolating that usage inside the tool call.
  • src/core/agents/offSecAgent/tools/runPentestWorkflow.ts

    • Passes ctx.onStepFinish and ctx.onCacheMetrics into the deterministic pentest workflow.
    • This is the key bridge from the /pentest tool call back into the operator dashboard’s token accounting.
  • src/core/workflows/pentest.ts

    • Adds workflow-local token accumulation from every workflow step.
    • Wraps onStepFinish so each discovery/swarm agent step contributes to workflow metrics and still forwards to the parent UI callback.
    • Threads the wrapped callback through blackbox discovery, whitebox discovery, plan agents, and pentest swarm agents.
    • Persists final execution metrics at the end of the workflow.
    • Uses preserveLargerTokenUsage: true when writing workflow metrics so a workflow-local total cannot overwrite larger totals already recorded by the operator UI.
  • src/tui/context/agent.tsx

    • Extends addTokenUsage(input, output) to accept an optional total.
    • When totalTokens is provided by the SDK, the footer now uses that reported value instead of always deriving total as input + output.
  • src/tui/components/operator-dashboard/logic.ts

    • Updates accumulateTokenUsage to accept an optional step-level totalTokens.
    • Keeps the previous fallback behavior when no explicit total is provided.
  • src/tui/components/operator-dashboard/index.tsx

    • Resets token usage when starting a brand-new operator session, so counters do not leak from a prior session.
    • Keeps resumed-session behavior separate: resumed sessions hydrate from persisted execution metrics.
    • Passes event.usage.totalTokens through both the local ref accumulator and the shared AgentProvider counter.
  • src/core/session/execution-metrics.ts

    • Adds preserveLargerTokenUsage to writeExecutionMetrics.
    • When enabled, persisted metrics keep the larger existing inputTokens, outputTokens, and totalTokens values instead of replacing them with smaller workflow-local totals.
    • This prevents cached/resumed workflow phases from writing zero or partial totals over a session that already has live UI usage recorded.
  • src/tui/components/operator-dashboard/logic.test.ts

    • Adds coverage for preserving SDK-reported totalTokens.
  • src/core/session/execution-metrics.test.ts

    • Adds coverage for the new execution-metric write behavior:
      • preserving larger token totals when requested
      • retaining default replace behavior when the preserve flag is not used
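
The callback wrapping described for src/core/workflows/pentest.ts can be sketched roughly as follows. This is an illustration under assumptions, not the code in this branch: `StepEvent`, `WorkflowTotals`, and `wrapOnStepFinish` are hypothetical names, and the real step event from the AI SDK carries more fields than shown here.

```typescript
// Hypothetical, simplified event shape (assumption for this sketch).
interface StepEvent {
  usage?: { inputTokens?: number; outputTokens?: number; totalTokens?: number };
}

type StepCallback = (event: StepEvent) => void;

interface WorkflowTotals {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Returns a callback that records usage in workflow-local totals first,
// then forwards the same event to the parent callback, if one was provided.
function wrapOnStepFinish(
  totals: WorkflowTotals,
  parent?: StepCallback,
): StepCallback {
  return (event) => {
    const input = event.usage?.inputTokens ?? 0;
    const output = event.usage?.outputTokens ?? 0;
    totals.inputTokens += input;
    totals.outputTokens += output;
    totals.totalTokens += event.usage?.totalTokens ?? (input + output);
    parent?.(event); // the parent UI counter still sees every step
  };
}

// Usage: one wrapped callback is shared by every agent the workflow spawns.
const totals: WorkflowTotals = { inputTokens: 0, outputTokens: 0, totalTokens: 0 };
const seenByParent: StepEvent[] = [];
const onStepFinish = wrapOnStepFinish(totals, (event) => {
  seenByParent.push(event);
});

onStepFinish({ usage: { inputTokens: 50, outputTokens: 10 } });
onStepFinish({ usage: { inputTokens: 30, outputTokens: 5, totalTokens: 40 } });
```

Because the wrapper forwards unconditionally, workflow-local accounting and the footer counters are fed by the same events rather than by two separate paths.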

Together, these changes make workflow token usage follow the same accounting path as normal operator usage, reset cleanly for new sessions, hydrate correctly for resumed sessions, and avoid metric overwrites from workflow-local bookkeeping.
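
For reviewers, the merge rule behind preserveLargerTokenUsage can be sketched as below. This is a simplified illustration, assuming a per-field maximum; `TokenUsage` and `mergeTokenUsage` are hypothetical names, and the actual writeExecutionMetrics also handles reading and writing the session file, which is omitted here.

```typescript
// Hypothetical shape for the persisted token fields (assumption for this sketch).
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Decides which token values end up persisted.
// Default: incoming values replace whatever was stored.
// preserveLarger: each field keeps the larger of the stored and incoming value.
function mergeTokenUsage(
  existing: TokenUsage | undefined,
  incoming: TokenUsage,
  preserveLarger = false,
): TokenUsage {
  if (!preserveLarger || existing === undefined) {
    return { ...incoming };
  }
  return {
    inputTokens: Math.max(existing.inputTokens, incoming.inputTokens),
    outputTokens: Math.max(existing.outputTokens, incoming.outputTokens),
    totalTokens: Math.max(existing.totalTokens, incoming.totalTokens),
  };
}

// Stored by the live UI earlier in the session.
const stored: TokenUsage = { inputTokens: 900, outputTokens: 120, totalTokens: 1020 };

// A cached or resumed workflow phase reports only partial usage.
const workflowLocal: TokenUsage = { inputTokens: 200, outputTokens: 30, totalTokens: 230 };

const preserved = mergeTokenUsage(stored, workflowLocal, true); // keeps 900 / 120 / 1020
const replaced = mergeTokenUsage(stored, workflowLocal); // becomes 200 / 30 / 230
```

Keeping replace as the default leaves existing callers unchanged; only the workflow's final write opts into the preserve behavior.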

The attached screenshot shows /pentest running with populated footer counters for input, output, cached, and total tokens.

How did you verify your code works?

  • Ran focused tests for token accumulation and execution metric persistence:

    bun test src/tui/components/operator-dashboard/logic.test.ts src/core/session/execution-metrics.test.ts
    
  • Ran the TypeScript checks

  • Ran the full test suite

Screenshot 2026-05-06 at 10 30 28 AM

ArshAnan and others added 4 commits May 6, 2026 10:09
… and cache metrics

- Added `onStepFinish` and `onCacheMetrics` callbacks to the `PentestSwarmInput` interface for better tracking of agent performance and resource usage.
- Updated related components to propagate these new callbacks, improving the overall observability of the pentesting process.
- Included new dependencies in `package.json` and `bun.lock` for enhanced functionality.

Made-with: Cursor
- Added `preserveLargerTokenUsage` option to `WriteExecutionMetricsInput` to allow for retaining larger token usage values across sessions.
- Implemented `preserveLargerTokenUsage` logic in `writeExecutionMetrics` to conditionally merge token usage.
- Updated `addTokenUsage` method in `AgentContextValue` to accept an optional total token count.
- Enhanced `accumulateTokenUsage` function to handle total tokens, ensuring accurate tracking during workflow execution.
- Adjusted `OperatorDashboard` to reset token usage on new sessions and accommodate the new total tokens feature.

Co-authored-by: Cursor

Development

Successfully merging this pull request may close these issues.

Input and output token counters don't populate when you use workflows