[DNM] OTEL: Instrument ctx.sample() / sampling loop #3892
strawgate wants to merge 3 commits into PrefectHQ:main
Wrap attribute-setting blocks in server_span, delegate_span, and client_span with `if span.is_recording():` to avoid unnecessary work on non-recording spans. Add error.type attribute using `type(e).__qualname__` for proper exception class identification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
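The guard and the `error.type` change described in this commit can be illustrated without the real tracing library. The snippet below is a minimal sketch, not the PR's code: `FakeSpan`, `Sampler`, and `record_error` are hypothetical names, and `FakeSpan` only mimics the `is_recording()` / `set_attribute()` pair that OpenTelemetry spans expose.

```python
class FakeSpan:
    """Hypothetical stand-in for a tracing span (not the OpenTelemetry class)."""

    def __init__(self, recording: bool) -> None:
        self._recording = recording
        self.attributes: dict[str, str] = {}

    def is_recording(self) -> bool:
        return self._recording

    def set_attribute(self, key: str, value: str) -> None:
        self.attributes[key] = value


class Sampler:
    """Hypothetical outer class, used only to obtain a nested exception type."""

    class ValidationError(Exception):
        pass


def record_error(span: FakeSpan, exc: BaseException) -> None:
    # Guard: skip all attribute work when the span is not recording.
    if span.is_recording():
        # __qualname__ keeps the enclosing class ("Sampler.ValidationError"),
        # whereas __name__ would give only "ValidationError".
        span.set_attribute("error.type", type(exc).__qualname__)


recording, dropped = FakeSpan(True), FakeSpan(False)
err = Sampler.ValidationError("bad result")
record_error(recording, err)
record_error(dropped, err)
```

After this runs, only the recording span carries an `error.type` attribute; the non-recording span's attribute dict stays empty, which is the work the guard is meant to avoid.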
Add OTEL tracing to the sampling pipeline:

- `sampling/createMessage` span wrapping sample_impl() with attributes for temperature, max_tokens, tool_count, result_type, and iteration count
- `sampling/createMessage step` child spans per loop iteration with iteration index and stop reason
- `sampling.execute_tool <name>` spans for tool calls within sampling, including error status on failures
- Validation failure and text response retry events on the parent span
- Opt-in content capture via OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT

Also adds is_recording() guards to server_span, delegate_span, and client_span to avoid unnecessary attribute serialization when sampling is disabled, and sets error.type using __qualname__ for better nested class names.

Closes PrefectHQ#3891

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
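To make the span layout in this commit message easier to picture, here is a rough sketch of the nesting using a toy in-memory tracer. Everything except the three span names is an assumption: `RecordingTracer`, `run_sampling_loop`, the attribute keys (`iterations`, `iteration`, `stop_reason`), and the choice to parent tool spans under the step span are all hypothetical, since the commit message does not spell them out.

```python
from contextlib import contextmanager


class RecordingTracer:
    """Hypothetical in-memory tracer; records span names, parents, attributes."""

    def __init__(self):
        self.finished = []  # list of (name, parent_name, attributes)
        self._stack = []

    @contextmanager
    def span(self, name, **attributes):
        parent = self._stack[-1][0] if self._stack else None
        entry = (name, parent, dict(attributes))
        self._stack.append(entry)
        try:
            yield entry[2]  # mutable attribute dict for values set later
        finally:
            self._stack.pop()
            self.finished.append(entry)


def run_sampling_loop(tracer, steps):
    """steps: one (stop_reason, tool_names) pair per iteration (illustrative)."""
    with tracer.span("sampling/createMessage") as attrs:
        for index, (stop_reason, tool_names) in enumerate(steps):
            with tracer.span("sampling/createMessage step", iteration=index) as step:
                step["stop_reason"] = stop_reason
                for tool_name in tool_names:
                    # Assumption: tool spans are children of the step span.
                    with tracer.span(f"sampling.execute_tool {tool_name}"):
                        pass  # a real loop would invoke the tool here
        # The iteration count is only known once the loop has finished.
        attrs["iterations"] = len(steps)


tracer = RecordingTracer()
run_sampling_loop(tracer, [("toolUse", ["search"]), ("endTurn", [])])
```

Spans are appended to `tracer.finished` as they close, so inner spans appear before the `sampling/createMessage` parent that encloses them.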
Thanks for the report. This issue goes beyond what our contributor guidelines ask for; we just need a short problem description and an MRE. Please see our contributing guidelines and condense this issue. We'll triage it once it's trimmed down.
Test Failure Analysis

Summary: The

Root Cause:

Suggested Solution: Run ruff format locally and commit the result: `uv run ruff format src/fastmcp/server/sampling/run.py`. Then commit the reformatted file.

Detailed Analysis

CI output:

Diff applied by
```python
# Before (fails format)
_CAPTURE_CONTENT = os.environ.get(
    "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT", "false"
).lower() == "true"

# After (passes format)
_CAPTURE_CONTENT = (
    os.environ.get(
        "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT", "false"
    ).lower()
    == "true"
)
```
Related Files
🤖 Triage by Marvin (updated to reflect latest run: ty errors resolved, ruff format still failing)
🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Instrument the `ctx.sample()` / sampling loop pipeline (`sample_impl`, `sample_step_impl`, `execute_tools`) with `sampling/createMessage`, `sampling/createMessage step`, and `sampling.execute_tool` spans
- Opt-in content capture via the `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` env var
- Add `is_recording()` guards to `server_span`, `delegate_span`, and `client_span` to skip attribute serialization on non-recording spans
- Use `type(e).__qualname__` for the `error.type` attribute for better nested class identification

Closes #3891
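The opt-in content capture listed in the summary can be sketched roughly as follows. This is an illustration under stated assumptions rather than the PR's implementation: the helper names and the attribute keys (`temperature`, `max_tokens`, `prompt`) are hypothetical, and the flag is read per call here so it is easy to exercise, whereas the snippet quoted in the CI analysis reads it once into a module-level `_CAPTURE_CONTENT`.

```python
import os

ENV_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def capture_content_enabled(environ=None):
    """Return True only when the env var is set to "true" (case-insensitive)."""
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR, "false").lower() == "true"


def build_span_attributes(temperature, max_tokens, prompt, environ=None):
    """Build span attributes; message content is included only on opt-in."""
    # Hypothetical attribute keys, chosen for illustration only.
    attributes = {"temperature": temperature, "max_tokens": max_tokens}
    if capture_content_enabled(environ):
        attributes["prompt"] = prompt
    return attributes


default_attrs = build_span_attributes(0.2, 256, "hello", environ={})
opted_in_attrs = build_span_attributes(0.2, 256, "hello", environ={ENV_VAR: "TRUE"})
```

With this parsing, values such as `1` or `yes` do not enable capture; only `true` in any letter case does, which matches the comparison used in the quoted snippet.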
Test plan
- `tests/server/telemetry/test_sampling_tracing.py` covering:
  - `sampling/createMessage` span with correct attributes and iteration count
  - `sampling/createMessage step` child spans per loop iteration
  - `sampling.execute_tool` spans for tool calls within sampling
- `ruff check src/` passes

🤖 Generated with Claude Code