[BUG]: langgraph integration marks GraphInterrupt as error (should be treated as control flow like ParentCommand) #16614

@nickcatal

Description

Tracer Version(s)

4.4.0

Python Version(s)

3.13

Pip Version(s)

uv 0.7.2

Bug Report

The langgraph integration correctly treats ParentCommand as a control-flow exception (skipping span.set_exc_info()), but does not do the same for GraphInterrupt. This causes all LangGraph human-in-the-loop interrupt points to appear as errors in APM traces.

GraphInterrupt is the core mechanism for human-in-the-loop workflows in LangGraph. Calling interrupt() inside a node raises GraphInterrupt to pause execution, save state to the checkpointer, and return a payload to the caller. The graph is later resumed with Command(resume=...). This is normal control flow — not an error.

Current behavior in ddtrace/contrib/internal/langgraph/patch.py:

The exception handling checks for ParentCommand only:

if LangGraphParentCommandError is None or not isinstance(e, LangGraphParentCommandError):
    span.set_exc_info(*sys.exc_info())

When GraphInterrupt is raised, it falls through to span.set_exc_info() and the span is marked as errored.
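The fall-through can be shown in isolation with stand-in classes. This is a sketch under assumptions, not the actual ddtrace code: `FakeSpan`, `traced_call`, and `error_flag_after` are hypothetical, the two exception classes are local stand-ins for the LangGraph ones, and the surrounding try/except shape is assumed. Only the `if` condition is copied from the snippet above.

```python
import sys


class ParentCommand(Exception):
    """Local stand-in for langgraph.errors.ParentCommand."""


class GraphInterrupt(Exception):
    """Local stand-in for langgraph.errors.GraphInterrupt."""


class FakeSpan:
    """Hypothetical minimal span that only records whether an error was set."""

    def __init__(self):
        self.error = 0

    def set_exc_info(self, exc_type, exc_val, exc_tb):
        self.error = 1


LangGraphParentCommandError = ParentCommand


def traced_call(func, span):
    """Assumed wrapper shape: record the exception on the span, then re-raise."""
    try:
        return func()
    except Exception as e:
        # Same condition as the current check: only ParentCommand is exempt.
        if LangGraphParentCommandError is None or not isinstance(e, LangGraphParentCommandError):
            span.set_exc_info(*sys.exc_info())
        raise


def error_flag_after(exc_cls):
    """Raise exc_cls inside traced_call and return the span's error flag."""
    span = FakeSpan()

    def raiser():
        raise exc_cls()

    try:
        traced_call(raiser, span)
    except exc_cls:
        pass
    return span.error
```

With these stand-ins, `error_flag_after(ParentCommand)` leaves the flag at 0, while `error_flag_after(GraphInterrupt)` sets it to 1, the same as a genuine failure such as `ValueError`.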

Expected behavior:

GraphInterrupt should be handled identically to ParentCommand — the span should not be marked as an error when GraphInterrupt is raised.

Suggested fix:

Import GraphInterrupt alongside ParentCommand and add it to the isinstance check:

try:
    from langgraph.errors import GraphInterrupt as LangGraphGraphInterrupt
except ImportError:
    LangGraphGraphInterrupt = None

# Then in each traced function's except block:
if (
    (LangGraphParentCommandError is None or not isinstance(e, LangGraphParentCommandError))
    and (LangGraphGraphInterrupt is None or not isinstance(e, LangGraphGraphInterrupt))
):
    span.set_exc_info(*sys.exc_info())
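An equivalent way to write the same check, which may scale better if more control-flow exceptions need exempting later, is to collect the successfully imported classes into a tuple once. This is only a sketch: `_CONTROL_FLOW_ERRORS` and `should_record_error` are hypothetical names, and the two exception classes are local stand-ins so the example runs without langgraph installed.

```python
class ParentCommand(Exception):
    """Local stand-in for langgraph.errors.ParentCommand."""


class GraphInterrupt(Exception):
    """Local stand-in for langgraph.errors.GraphInterrupt."""


# In the integration these would come from the guarded imports and may be None.
LangGraphParentCommandError = ParentCommand
LangGraphGraphInterrupt = GraphInterrupt

# Keep only the classes that were importable; an empty tuple is valid for isinstance.
_CONTROL_FLOW_ERRORS = tuple(
    cls
    for cls in (LangGraphParentCommandError, LangGraphGraphInterrupt)
    if cls is not None
)


def should_record_error(exc):
    """Return True only for exceptions that are not LangGraph control flow."""
    return not isinstance(exc, _CONTROL_FLOW_ERRORS)
```

Each except block would then reduce to `if should_record_error(e): span.set_exc_info(*sys.exc_info())`. If neither import succeeds, the tuple is empty, `isinstance` returns False, and every exception is recorded, which matches the `is None` branches in the condition above.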

Workaround:

We're currently using a TraceFilter to clear the error flag after the fact:

from ddtrace import tracer
from ddtrace.trace import TraceFilter

class _IgnoreGraphInterrupt(TraceFilter):
    def process_trace(self, trace):
        for span in trace:
            if span.get_tag("error.type") == "langgraph.errors.GraphInterrupt":
                span.set_tag("error", 0)
        return trace

tracer.configure(trace_processors=[_IgnoreGraphInterrupt()])

Reproduction Code

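from typing import TypedDict
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import START, StateGraph
from langgraph.types import interrupt

class State(TypedDict):
    messages: list

def node(state: State):
    # Standard human-in-the-loop pattern
    # See: https://docs.langchain.com/oss/python/langgraph/interrupts
    answer = interrupt({"question": "What are your symptoms?"})
    return {"messages": [answer]}

builder = StateGraph(State)
builder.add_node("node", node)
builder.add_edge(START, "node")
graph = builder.compile(checkpointer=MemorySaver())  # interrupts need a checkpointer

# When this graph runs and hits the interrupt, the resulting
# GraphInterrupt exception is marked as an error in the trace
graph.invoke({"messages": []}, {"configurable": {"thread_id": "1"}})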

Error Logs

Traces in Datadog APM show langgraph.errors.GraphInterrupt as an error on every interrupt point, even though this is normal control flow for human-in-the-loop workflows.

Libraries in Use

ddtrace==4.4.0
langgraph==0.3.34

Operating System

macOS (Darwin 25.2.0)
