Feature request: "Stop generating" — ability to stop run_async() from outside the agent
Problem
There is no supported way to stop an ongoing run_async() generation from outside the agent. This is a critical gap for any chat UI where users expect a "Stop generating" button.
Current state (ADK v1.25.1)
| Mechanism | Can stop from outside? | Stops mid-stream? | Session stays consistent? |
| --- | --- | --- | --- |
| `InvocationContext.end_invocation` | No — only settable from callbacks/tools | No — only checked between steps | Yes |
| `break` from `async for` | Yes — Python async generator protocol | Partially — waits for next yield | Unknown — undocumented path |
| `asyncio.Task.cancel()` | Yes | Partially — at next `await` | Unknown — `CancelledError` only handled in MCP cleanup |
| `RunConfig.max_llm_calls` | No — safety limit, not real-time | No | Yes |
None of these are designed as a "stop generation" mechanism for the consumer of run_async().
What I'd like
A way to signal "stop" from the code consuming run_async(), so the agent stops generating as soon as possible:
```python
# Option A: stop() on a run handle
run = runner.start_run(user_id="u1", session_id="s1", new_message=content)
async for event in run:
    yield event

# From another coroutine (e.g., user presses "Stop generating"):
run.stop()
```

```python
# Option B: pass a stop signal to run_async()
stop_signal = StopSignal()
async for event in runner.run_async(
    user_id="u1",
    session_id="s1",
    new_message=content,
    stop_signal=stop_signal,
):
    yield event

# From another coroutine:
stop_signal.stop()
```
Expected behavior
- The agent should stop as soon as possible: between streaming chunks in SSE mode, before the next tool call, or before the next LLM call.
- Already-yielded events and their `state_delta` should be persisted normally — the session must remain consistent for the next user message.
- A final event should be yielded to let the consumer know the generation was stopped (e.g., `interrupted=True` on the last event, or a dedicated stop event). The `interrupted` field already exists on `LlmResponse` for bidi streaming voice interruptions — the same concept applies here.
- After stopping, the next call to `run_async()` on the same session should work normally.
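To make the expected semantics concrete, here is a toy sketch. `StopSignal` is the hypothetical Option B handle (modeled on `asyncio.Event`), and `toy_run_async` is a stand-in for the real run loop — none of this is existing ADK API:

```python
import asyncio


class StopSignal:
    """Hypothetical stop handle (Option B), modeled on asyncio.Event."""

    def __init__(self):
        self._event = asyncio.Event()

    def stop(self):
        self._event.set()

    @property
    def stopped(self) -> bool:
        return self._event.is_set()


async def toy_run_async(stop_signal: StopSignal):
    """Stand-in run loop: checks the signal at each safe checkpoint."""
    for chunk in ["Hel", "lo ", "wor", "ld"]:
        if stop_signal.stopped:
            # Final event so the consumer can tell "stopped" from "completed";
            # already-yielded events (and their state_delta) stay persisted.
            yield {"text": "", "interrupted": True}
            return
        yield {"text": chunk, "interrupted": False}
        await asyncio.sleep(0)  # checkpoint between streaming chunks


async def main():
    stop = StopSignal()
    events = []
    async for event in toy_run_async(stop):
        events.append(event)
        if len(events) == 2:
            stop.stop()  # user pressed "Stop generating" after two chunks
    return events


events = asyncio.run(main())
print([e["interrupted"] for e in events])  # -> [False, False, True]
```

The key point of the sketch: the stop is cooperative (checked at chunk/tool/LLM-call boundaries), and the consumer always receives a terminal event it can distinguish from normal completion.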
Use case
We run an ADK agent behind a WebSocket in a chat UI. The agent uses SSE streaming (`StreamingMode.SSE`). When the user presses "Stop generating":
- The LLM should stop producing tokens
- No further tool calls should be executed
- The session should remain usable for the next message
This is standard behavior in any LLM-powered chat (ChatGPT, Gemini web, Claude) and is expected by users.
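A hedged sketch of the wiring we need, with an `asyncio.Queue` standing in for the WebSocket inbox, a list standing in for `websocket.send_json`, and an `asyncio.Event` standing in for the proposed stop mechanism (none of this is real ADK API):

```python
import asyncio


async def fake_stream(stop: asyncio.Event):
    """Stand-in for runner.run_async() that checks a stop flag per chunk."""
    for i in range(1000):
        if stop.is_set():
            yield {"interrupted": True}
            return
        yield {"chunk": i}
        await asyncio.sleep(0.001)


async def main():
    inbox = asyncio.Queue()  # stand-in for WebSocket messages from the UI
    stop = asyncio.Event()   # stand-in for the proposed stop signal
    sent = []                # stand-in for websocket.send_json(...)

    async def generate():
        async for event in fake_stream(stop):
            sent.append(event)

    async def listen():
        while True:
            msg = await inbox.get()
            if msg["type"] == "stop":
                stop.set()  # user pressed "Stop generating"
                return

    gen_task = asyncio.create_task(generate())
    listener = asyncio.create_task(listen())
    await asyncio.sleep(0.01)          # let a few chunks stream first
    await inbox.put({"type": "stop"})  # UI sends the stop message
    await asyncio.gather(gen_task, listener)
    return sent


sent = asyncio.run(main())
```

With a supported stop signal, this pattern would be all a WebSocket server needs: one task streams events to the client, another listens for the stop message and fires the signal.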
Current workaround
We `break` from the `async for event in runner.run_async(...)` loop. This triggers `aclose()` → `GeneratorExit` propagation through the `Aclosing` wrappers in the chain (`Runner` → `BaseAgent` → `BaseLlmFlow` → `_run_one_step_async`). It works in practice, but:
- It is not documented as a supported pattern
- There is no guarantee ADK handles `GeneratorExit` gracefully in all code paths (mid-tool-execution, mid-auth-flow, mid-MCP-call)
- There is no final event emitted — the consumer cannot distinguish between "generation completed" and "generation was stopped"
- The session consistency after a forced generator close is not guaranteed
Additional context
- `end_invocation` on `InvocationContext` is close to what's needed, but it's only accessible from inside the agent (callbacks/tools), not from the external consumer.
- Other frameworks provide this: LangGraph has cancellation tokens, OpenAI's API supports `abort()` on streams, Anthropic's SDK supports `controller.abort()`.