Add blog post on OpenSearch Agent Health for AI observability by goyamegh · Pull Request #4088 · opensearch-project/project-website

goyamegh · 2026-02-28T23:54:54Z

Summary

Adds blog post announcing the experimental launch of OpenSearch Agent Health — an open-source evaluation and observability framework for AI agents, available as a zero-install NPX tool.

Resolves #4085

Changes

- Blog post: _posts/2026-02-28-opensearch-agent-health.md
  - Covers the three core challenges (reasoning gap, cost-latency spiral, evaluation paradox) and how Agent Health addresses each
  - Includes a step-by-step walkthrough: quick start with sample data, connecting your own agent, creating benchmarks, running evaluations, and iterating on results
  - CLI and UI workflows with code examples
- Author profiles: _community_members/goyamegh.md, _community_members/thottan.md
- Author photos: assets/media/community/members/goyamegh.jpg, thottan.jpg
- Blog images (2): Before/after diagram, benchmark comparison screenshot

Checklist

- Blog post follows frontmatter template (layout: post, authors, date, categories)
- No H1 heading in post body (title from frontmatter)
- Author profiles created in _community_members/ with author persona
- Author photos added to assets/media/community/members/
- Blog images in assets/media/blog-images/2026-02-28-opensearch-agent-health/
- All image paths use absolute site-root paths (/assets/...)
- External links verified (GitHub repo, RFC, OTel docs, Observability Stack)
- meta_keywords / meta_description: to be filled by marketing team

Authors

@goyamegh @Thottan

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: Megha Goyal <goyamegh@amazon.com>

github-actions · 2026-02-28T23:55:02Z

Thank you for submitting a blog post!

The blog post review process is: Submit a PR -> (Optional) Peer review -> Doc review -> Marketing review -> Published.

github-actions · 2026-02-28T23:55:03Z

Hi @goyamegh,

It looks like you're adding a new blog post but don't have an issue mentioned. Please link this PR to an open issue using one of these keywords in the PR description:

Closes #issue-number
Fixes #issue-number
Resolves #issue-number

If an issue hasn't been created yet, please create one and then link it to this PR.

goyamegh · 2026-03-03T18:00:17Z

@kolchfa-aws, can you help review this?

kolchfa-aws · 2026-03-04T15:30:16Z

@goyamegh @pajuric Doc review complete. The blog is ready for marketing review/publishing.

Thottan · 2026-03-05T09:33:37Z

@kolchfa-aws, we have made changes to the Built for Developer Workflows and What's Next sections. Could you please review these changes?

_posts/2026-02-28-opensearch-agent-health.md

kolchfa-aws · 2026-03-05T12:44:47Z

@Thottan Done

pajuric · 2026-03-05T14:46:07Z

_posts/2026-02-28-opensearch-agent-health.md

+
+**3. Solving the evaluation paradox: Real-time agent evaluation**.
+
+  Agent Health uses the _golden path_ trajectory comparison, in which an LLM judge scores agent actions against expected outcomes. You define what _good_ looks like for your agent (the expected steps, tool calls, and outcomes) and Agent Health measures _how well_ your agent performs against these criteria. Using your preferred LLM provider as your judge gives you flexibility to choose the evaluation model that fits your needs and budget.


Suggest rewrite to explain "golden path":
Agent Health uses the golden path trajectory comparison to evaluate agent performance. In this approach, you define the ideal sequence of steps, tool calls, and outcomes your agent should follow as the golden path. An LLM judge then scores your agent's actual behavior against that expected trajectory, flagging deviations that indicate errors or regressions. Agent Health measures how well your agent performs against these criteria, and using your preferred LLM provider as your judge gives you the flexibility to choose the evaluation model that fits your needs and budget.

@pajuric, done
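
For anyone reading this thread later, the golden path idea discussed above can be illustrated with a minimal sketch. This is not Agent Health's implementation: the post describes an LLM judge, whereas the sketch below swaps in a deterministic sequence diff from Python's standard `difflib` module to show the shape of the comparison. The `Step` type, the `compare_to_golden_path` function, and the tool names are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)  # frozen makes Step hashable, which SequenceMatcher requires
class Step:
    """One agent action: the tool called and a short label for its outcome."""
    tool: str
    outcome: str


def compare_to_golden_path(golden: list[Step], actual: list[Step]) -> dict:
    """Score an actual trajectory against the golden path and list deviations.

    Stand-in for the LLM judge: a plain sequence diff, not a semantic evaluation.
    """
    matcher = SequenceMatcher(a=golden, b=actual, autojunk=False)
    deviations = []
    for kind, g_start, g_end, a_start, a_end in matcher.get_opcodes():
        if kind == "equal":
            continue  # this stretch of the run followed the golden path
        deviations.append({
            "kind": kind,  # "delete" = skipped step, "insert" = extra step, "replace" = wrong step
            "expected": golden[g_start:g_end],
            "observed": actual[a_start:a_end],
        })
    return {"score": matcher.ratio(), "deviations": deviations}


# Hypothetical golden path: search, fetch the document, then summarize it.
golden_path = [
    Step("search", "ok"),
    Step("fetch_document", "ok"),
    Step("summarize", "ok"),
]
# Hypothetical run in which the agent skipped the fetch step.
observed_run = [
    Step("search", "ok"),
    Step("summarize", "ok"),
]

result = compare_to_golden_path(golden_path, observed_run)
for deviation in result["deviations"]:
    print(deviation["kind"], [step.tool for step in deviation["expected"]])
```

An exact-match diff like this only catches structural deviations such as skipped or extra steps. Judging whether a different sequence of steps still reaches an acceptable outcome is the part the post assigns to the LLM judge.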

pajuric · 2026-03-05T15:29:30Z

_posts/2026-02-28-opensearch-agent-health.md

+  - community
+meta_keywords: AI agents, agent observability, OpenTelemetry, LLM evaluation, agent tracing, AI agent testing, OpenSearch, agentic AI
+meta_description: OpenSearch Agent Health provides open-source observability and evaluation for AI agents. Ship production-ready agents faster with real-time tracing, systematic benchmarking, and LLM-based evaluation.
+---


meta_keywords: OpenSearch Agent Health, AI agents, observability, LLM evaluation, AI agent testing, OpenTelemetry, agentic AI, trace visualization, agent debugging, AI agent observability, LLM agent evaluation, agentic AI debugging, automated LLM benchmarking tool, open-source LLM observability

meta_description: Discover why AI agents fail in silence and how OpenSearch Agent Health solves it with open-source trace observability, automated benchmarking, and LLM judge evaluation.

@pajuric, done

pajuric · 2026-03-05T22:19:29Z

@kolchfa-aws - Please final merge and close. The blog is published here: https://opensearch.org/blog/opensearch-agent-health-open-source-observability-and-evaluation-for-ai-agents/

Add blog post on OpenSearch Agent Health for AI observability

6f7f4a0

Signed-off-by: Megha Goyal <goyamegh@amazon.com>

goyamegh requested review from AMoo-Miki, CEHENKLE, elfisher, kolchfa-aws, krisfreedain, natebower, nateynateynate, nknize and peterzhuamazon as code owners February 28, 2026 23:54

kolchfa-aws self-assigned this Mar 3, 2026

kolchfa-aws added the "Doc review" label (The blog is under doc review) Mar 3, 2026

Doc review

ce676ae

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>

kolchfa-aws added the "Done and ready to publish" label (The blog is approved and ready to publish) and removed the "Doc review" label (The blog is under doc review) Mar 4, 2026

kolchfa-aws assigned pajuric Mar 4, 2026

Update section Built for Developer Workflows

2d5cad4

Signed-off-by: Rekha Thottan <rjthotan@amazon.com>

Thottan force-pushed the blog/agent-health-observability branch from 74d6927 to 2d5cad4 March 4, 2026 23:43

Add Agentic AI Eval Platform RFC reference to What's next section

3ece44f

Signed-off-by: Rekha Thottan <rjthotan@amazon.com>

kolchfa-aws reviewed Mar 5, 2026

_posts/2026-02-28-opensearch-agent-health.md (outdated, resolved)

kolchfa-aws reviewed Mar 5, 2026

_posts/2026-02-28-opensearch-agent-health.md (outdated, resolved)

Apply suggestions from code review

57a838f

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>

pajuric reviewed Mar 5, 2026

Update meta keywords/description and improve evaluation paradox expla…

3cc0069

…nation Signed-off-by: Rekha Thottan <rjthotan@amazon.com>

kolchfa-aws approved these changes Mar 6, 2026

kolchfa-aws merged commit c3b18c9 into opensearch-project:main Mar 6, 2026
5 checks passed

kolchfa-aws merged 6 commits into opensearch-project:main from goyamegh:blog/agent-health-observability

4 participants

