Add blog post on OpenSearch Agent Health for AI observability #4088

Merged
kolchfa-aws merged 6 commits into opensearch-project:main from goyamegh:blog/agent-health-observability on Mar 6, 2026

goyamegh (Contributor) commented Feb 28, 2026

Summary

Adds a blog post announcing the experimental launch of OpenSearch Agent Health, an open-source evaluation and observability framework for AI agents, available as a zero-install NPX tool.

Resolves #4085

Changes

  • Blog post: _posts/2026-02-28-opensearch-agent-health.md
    • Covers the three core challenges (reasoning gap, cost-latency spiral,
      evaluation paradox) and how Agent Health addresses each
    • Includes a step-by-step walkthrough: quick start with sample data,
      connecting your own agent, creating benchmarks, running evaluations, and
      iterating on results
    • CLI and UI workflows with code examples
  • Author profiles: _community_members/goyamegh.md,
    _community_members/thottan.md
  • Author photos: assets/media/community/members/goyamegh.jpg,
    thottan.jpg
  • Blog images (2): Before/after diagram, benchmark comparison
    screenshot

Checklist

  • Blog post follows frontmatter template (layout: post, authors,
    date, categories)
  • No H1 heading in post body (title from frontmatter)
  • Author profiles created in _community_members/ with author
    persona
  • Author photos added to assets/media/community/members/
  • Blog images in
    assets/media/blog-images/2026-02-28-opensearch-agent-health/
  • All image paths use absolute site-root paths (/assets/...)
  • External links verified (GitHub repo, RFC, OTel docs, Observability
    Stack)
  • meta_keywords / meta_description — to be filled by marketing team

Authors

@goyamegh @Thottan

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: Megha Goyal <goyamegh@amazon.com>

github-actions commented:

Thank you for submitting a blog post!

The blog post review process is: Submit a PR -> (Optional) Peer review -> Doc review -> Marketing review -> Published.

github-actions commented:

Hi @goyamegh,

It looks like you're adding a new blog post but don't have an issue mentioned. Please link this PR to an open issue using one of these keywords in the PR description:

  • Closes #issue-number
  • Fixes #issue-number
  • Resolves #issue-number

If an issue hasn't been created yet, please create one and then link it to this PR.

kolchfa-aws self-assigned this on Mar 3, 2026
kolchfa-aws added the "Doc review" label (The blog is under doc review) on Mar 3, 2026

goyamegh (Contributor, Author) commented Mar 3, 2026

@kolchfa-aws can you help review this?

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

kolchfa-aws (Collaborator) commented:

@goyamegh @pajuric Doc review complete. The blog is ready for marketing review/publishing.

kolchfa-aws added the "Done and ready to publish" label (The blog is approved and ready to publish) and removed the "Doc review" label (The blog is under doc review) on Mar 4, 2026
Signed-off-by: Rekha Thottan <rjthotan@amazon.com>
Thottan force-pushed the blog/agent-health-observability branch from 74d6927 to 2d5cad4 on March 4, 2026 23:43
Signed-off-by: Rekha Thottan <rjthotan@amazon.com>

Thottan (Contributor) commented Mar 5, 2026

@kolchfa-aws, we have made changes to the "Built for Developer Workflows" and "What's Next" sections. Could you please review these changes?

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

kolchfa-aws (Collaborator) commented:

@Thottan Done


**3. Solving the evaluation paradox: Real-time agent evaluation**.

Agent Health uses the _golden path_ trajectory comparison, in which an LLM judge scores agent actions against expected outcomes. You define what _good_ looks like for your agent (the expected steps, tool calls, and outcomes) and Agent Health measures _how well_ your agent performs against these criteria. Using your preferred LLM provider as your judge gives you flexibility to choose the evaluation model that fits your needs and budget.

Suggest rewrite to explain "golden path":
Agent Health uses the golden path trajectory comparison to evaluate agent performance. In this approach, you define the ideal sequence of steps, tool calls, and outcomes your agent should follow as the golden path. An LLM judge then scores your agent's actual behavior against that expected trajectory, flagging deviations that indicate errors or regressions. Agent Health measures how well your agent performs against these criteria, and using your preferred LLM provider as your judge gives you the flexibility to choose the evaluation model that fits your needs and budget.
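
For illustration, the golden path comparison described above could be sketched roughly as follows. This is a minimal sketch and not Agent Health's implementation: the `Step` type, the `compare_to_golden_path` function, and the tool names are all hypothetical, and the LLM judge is replaced by a deterministic stand-in (Python's `difflib.SequenceMatcher`) that only aligns tool-call order.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Step:
    """One agent action (hypothetical shape): the tool called and its outcome."""
    tool: str
    outcome: str


def compare_to_golden_path(golden, actual):
    """Score an actual trajectory against the golden path by tool-call order.

    Returns a similarity score in [0, 1] and a list of deviations.
    A real LLM judge would also assess outcomes semantically; this
    stand-in only aligns the two sequences of tool names.
    """
    golden_tools = [step.tool for step in golden]
    actual_tools = [step.tool for step in actual]
    matcher = SequenceMatcher(a=golden_tools, b=actual_tools, autojunk=False)

    deviations = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Expected steps that the agent skipped or replaced.
        if tag in ("delete", "replace"):
            deviations.append(("missing", golden_tools[i1:i2]))
        # Steps the agent took that are not on the golden path.
        if tag in ("insert", "replace"):
            deviations.append(("unexpected", actual_tools[j1:j2]))
    return matcher.ratio(), deviations


# Hypothetical golden path and an agent run that deviates in the middle step.
golden_path = [
    Step("search_logs", "found error spike"),
    Step("fetch_metrics", "CPU saturated"),
    Step("summarize", "root cause reported"),
]
agent_run = [
    Step("search_logs", "found error spike"),
    Step("restart_service", "service restarted"),
    Step("summarize", "root cause reported"),
]

score, deviations = compare_to_golden_path(golden_path, agent_run)
print(f"score={score:.2f}")
for kind, tools in deviations:
    print(kind, tools)
```

The sequence alignment here only stands in for the judge so that the example is runnable without a model provider; in the approach described above, the scoring step would be delegated to the LLM judge of your choice.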

@pajuric, done

- community
meta_keywords: AI agents, agent observability, OpenTelemetry, LLM evaluation, agent tracing, AI agent testing, OpenSearch, agentic AI
meta_description: OpenSearch Agent Health provides open-source observability and evaluation for AI agents. Ship production-ready agents faster with real-time tracing, systematic benchmarking, and LLM-based evaluation.
---

meta_keywords: OpenSearch Agent Health, AI agents, observability, LLM evaluation, AI agent testing, OpenTelemetry, agentic AI, trace visualization, agent debugging, AI agent observability, LLM agent evaluation, agentic AI debugging, automated LLM benchmarking tool, open-source LLM observability

meta_description: Discover why AI agents fail in silence and how OpenSearch Agent Health solves it with open-source trace observability, automated benchmarking, and LLM judge evaluation.

@pajuric, done

…nation

Signed-off-by: Rekha Thottan <rjthotan@amazon.com>

pajuric commented Mar 5, 2026

kolchfa-aws merged commit c3b18c9 into opensearch-project:main on Mar 6, 2026
5 checks passed
Labels

Done and ready to publish (The blog is approved and ready to publish)

Development

Successfully merging this pull request may close these issues.

[BLOG] OpenSearch Agent Health: Open-Source Observability and Evaluation for AI Agents

4 participants