
[FEATURE]: LLMObs decorators should serialize Pydantic model outputs as readable JSON #16864

@VivekChittamuri

Description

Describe the goal of the feature

This issue covers two related gaps in LLMObs output serialization:

1. Pydantic model outputs render as repr() instead of JSON (@workflow, @task, and others). When a decorated function returns a tuple or list containing Pydantic v2 BaseModel instances, safe_json() handles a top-level model but not nested ones. json.dumps then hands each nested model to the default= handler, _unserializable_default_repr, which falls back to str(), so traces show unreadable repr() output.

2. @llm-decorated functions produce no output in traces at all. Unlike @workflow and @task, the @llm decorator does not call LLMObs.annotate() on the return value, so outputs are silently dropped and never appear in traces, regardless of the return type.

Notes

  • PR #12416 (fix(llmobs): serialize inputs and outputs to valid json) is related to item 1 above. This request is a natural follow-on that targets the default= handler and the @llm output path, both of which that PR left unaddressed.
  • Addressing both together would bring the full decorator suite to a consistent level of output observability.

Is your feature request related to a problem?

Yes. Our organization is trying to use LLM traces in Datadog. When trace inputs and outputs for large payloads are not readable, debugging and triaging become extremely difficult, so we are reaching out for assistance.

Describe alternatives you've considered

We considered making manual LLMObs.annotate() calls instead of relying on the decorators, but that usage is hard to enforce and defeats the purpose of the decorators in the first place.
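For reference, the manual alternative could look roughly like the sketch below: convert the return value to plain data before annotating. The helper name to_jsonable and the FakeSummary stand-in class are hypothetical and not part of ddtrace or Pydantic. The commented LLMObs.annotate(output_data=...) call assumes the existing annotate API is used inside the decorated function.

```python
from typing import Any


def to_jsonable(value: Any) -> Any:
    """Recursively replace Pydantic-style models with plain JSON-compatible data."""
    if hasattr(value, "model_dump") and callable(value.model_dump):
        return value.model_dump(mode="json")  # Pydantic v2 API
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    return value


class FakeSummary:
    """Hypothetical stand-in for a Pydantic v2 model, used only for this demo."""

    def __init__(self, method: str) -> None:
        self.method = method

    def model_dump(self, mode: str = "python") -> dict:
        return {"method": self.method}


result = ([FakeSummary("espresso")], {"count": 1})
plain = to_jsonable(result)
print(plain)  # [[{'method': 'espresso'}], {'count': 1}]

# Inside a decorated function, one could then call, for example:
# LLMObs.annotate(output_data=plain)
```

Every decorated function would need to repeat this step, which is the enforcement burden described above.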

Additional context

Affected Versions

  • ddtrace >= 3.0 with LLMObs enabled (verified against 4.x)

Steps to Reproduce

from pydantic import BaseModel
from ddtrace.llmobs.decorators import workflow
from enum import Enum

class RoastLevel(str, Enum):
    LIGHT = "light"
    MEDIUM = "medium"
    DARK = "dark"

class CoffeeRating(BaseModel):
    bean_origin: str
    roast: RoastLevel
    flavor_notes: list[str]
    score: float

class BrewSummary(BaseModel):
    method: str
    brew_time_seconds: int
    notes: str

@workflow
async def evaluate_coffee_batch(inputs) -> tuple[list[CoffeeRating], list[BrewSummary]]:
    ratings = [
        CoffeeRating(bean_origin="Ethiopia", roast=RoastLevel.LIGHT, flavor_notes=["blueberry", "jasmine", "citrus"], score=9.2),
        CoffeeRating(bean_origin="Colombia", roast=RoastLevel.MEDIUM, flavor_notes=["caramel", "walnut", "dark chocolate"], score=8.7),
        CoffeeRating(bean_origin="Guatemala", roast=RoastLevel.DARK, flavor_notes=["smoky", "molasses", "cedar", "dark fruit"], score=7.1),
    ]
    summaries = [
        BrewSummary(method="pour-over", brew_time_seconds=240, notes="bright and clean"),
        BrewSummary(method="french-press", brew_time_seconds=480, notes="full body, low acidity"),
        BrewSummary(method="espresso", brew_time_seconds=30, notes="rich crema, intense"),
    ]
    return ratings, summaries
Actual trace output (hard to read):
([CoffeeRating(bean_origin='Ethiopia', roast=<RoastLevel.LIGHT: 'light'>, flavor_notes=['blueberry',
'jasmine', 'citrus'], score=9.2), CoffeeRating(bean_origin='Colombia', roast=<RoastLevel.MEDIUM:
'medium'>, flavor_notes=['caramel', 'walnut', 'dark chocolate'], score=8.7), CoffeeRating(bean_origin=
'Guatemala', roast=<RoastLevel.DARK: 'dark'>, flavor_notes=['smoky', 'molasses', 'cedar', 'dark fruit',
'bittersweet cocoa'], score=7.1), CoffeeRating(bean_origin='Kenya', roast=<RoastLevel.LIGHT: 'light'>,
flavor_notes=['blackcurrant', 'tomato', 'grapefruit zest', 'raw sugar'], score=8.9), ...],
[BrewSummary(method='pour-over', brew_time_seconds=240, notes='bright and clean'), ...])
Expected output:
[
  [
    {"bean_origin": "Ethiopia", "roast": "light", "flavor_notes": ["blueberry", "jasmine", "citrus"], "score": 9.2},
    {"bean_origin": "Colombia", "roast": "medium", "flavor_notes": ["caramel", "walnut", "dark chocolate"], "score": 8.7},
    {"bean_origin": "Guatemala", "roast": "dark", "flavor_notes": ["smoky", "molasses", "cedar", "dark fruit"], "score": 7.1}
  ],
  [
    {"method": "pour-over", "brew_time_seconds": 240, "notes": "bright and clean"},
    {"method": "french-press", "brew_time_seconds": 480, "notes": "full body, low acidity"},
    {"method": "espresso", "brew_time_seconds": 30, "notes": "rich crema, intense"}
  ]
]

Potential Root Causes
  1. The @llm decorator never received the serialization changes made in PR #12416 (fix(llmobs): serialize inputs and outputs to valid json), so serialization would need to be added there.
  2. For all the other decorators, the issue with Pydantic models is:

safe_json() checks model_dump on the top-level object only. When the return type is tuple[list[BaseModel], ...], json.dumps recurses into the list items, hits a nested Pydantic model, and delegates to the default= handler — _unserializable_default_repr

(ddtrace/llmobs/_utils.py, lines 287–291):

  def _unserializable_default_repr(obj):
      try:
          return str(obj)   # ← Pydantic model lands here → repr() in trace
      except Exception:
          log.warning("I/O object is neither JSON serializable nor string-able. Defaulting to placeholder value instead.")
          return "[Unserializable object: {}]".format(repr(obj))
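The fallback described above can be reproduced with the standard library alone. The snippet below is a minimal sketch, not the ddtrace implementation: FakeModel is a hypothetical stand-in for a Pydantic v2 model, and top_level_only only mimics the top-level model_dump check attributed to safe_json().

```python
import json
from dataclasses import dataclass


@dataclass
class FakeModel:
    """Hypothetical stand-in for a Pydantic v2 model (not from pydantic or ddtrace)."""

    name: str

    def model_dump(self, mode: str = "python") -> dict:
        return {"name": self.name}


def top_level_only(obj):
    # Mimics the described behavior: only the outermost object is checked for model_dump.
    if hasattr(obj, "model_dump"):
        obj = obj.model_dump()
    # Anything json.dumps cannot encode on its own is passed to default=, here str().
    return json.dumps(obj, default=str)


single = top_level_only(FakeModel("Ethiopia"))
nested = top_level_only(([FakeModel("Ethiopia")],))

print(single)  # {"name": "Ethiopia"}
print(nested)  # the nested model shows up as a quoted repr string, not as a JSON object
```

The top-level model is encoded as a JSON object, while the same model nested inside a tuple and list is flattened into a string by the default= handler.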

Potential Approach/Fix (untested)
  def _unserializable_default_repr(obj):
      try:
          if hasattr(obj, "model_dump") and callable(obj.model_dump):
              # Pydantic v2: mode="json" returns JSON-native types, so no further default= calls are needed:
              # https://docs.pydantic.dev/latest/api/base_model/#pydantic.BaseModel.model_dump
              return obj.model_dump(mode="json")
          if hasattr(obj, "dict") and callable(obj.dict):
              # Pydantic v1: no mode="json" equivalent
              return obj.dict()
      except Exception:
          pass  # model_dump()/dict() failed, fall through to str()
      try:
          return str(obj)
      except Exception:
          log.warning("I/O object is neither JSON serializable nor string-able. Defaulting to placeholder value instead.")
          return "[Unserializable object: {}]".format(repr(obj))
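To illustrate how a handler of this shape would behave with json.dumps, here is a small standalone sketch. It does not use ddtrace or Pydantic: FakeRating is a hypothetical class that exposes a Pydantic v2-style model_dump(mode=...), and default_handler is a trimmed copy of the approach above (v2 branch only), not the library's code.

```python
import json
from enum import Enum


class Roast(str, Enum):
    LIGHT = "light"


class FakeRating:
    """Hypothetical stand-in exposing a Pydantic v2-style model_dump()."""

    def __init__(self, origin: str, roast: Roast) -> None:
        self.origin = origin
        self.roast = roast

    def model_dump(self, mode: str = "python") -> dict:
        # In "json" mode, return only JSON-native values (enum -> its string value).
        roast = self.roast.value if mode == "json" else self.roast
        return {"origin": self.origin, "roast": roast}


def default_handler(obj):
    # Same shape as the proposed fix: try model_dump first, then fall back to str().
    try:
        if hasattr(obj, "model_dump") and callable(obj.model_dump):
            return obj.model_dump(mode="json")
    except Exception:
        pass
    return str(obj)


payload = ([FakeRating("Ethiopia", Roast.LIGHT)], ["pour-over"])
encoded = json.dumps(payload, default=default_handler)
print(encoded)
# [[{"origin": "Ethiopia", "roast": "light"}], ["pour-over"]]
```

Because json.dumps calls default= for each object it cannot encode, wherever that object sits in the structure, handling models in the handler covers arbitrarily nested tuples and lists without a separate recursive walk.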
