feat: evals page — per-domain dashboard with trend lines and curation recommendations #32

Description

@rajnavakoti

Once multi-domain support and run comparison are in place, the Evals page should become a per-domain health dashboard that tracks coverage over time and recommends what to curate next. This is the north-star vision for the eval surface.
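
To make "coverage over time" concrete, here is a minimal sketch of how the trend-line data could be derived from stored eval runs. It assumes coverage means the share of topics marked CLEAN in a run; the `EvalRun` record, its fields, and both function names are hypothetical and not taken from the existing codebase.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EvalRun:
    """Hypothetical record of one eval run for one domain."""
    domain: str
    run_date: date
    # Topic counts per status, e.g. {"CLEAN": 8, "INCOMPLETE": 1, "MISSING": 1}
    status_counts: dict[str, int]


def coverage_pct(run: EvalRun) -> float:
    """Percentage of topics marked CLEAN (assumed definition of coverage)."""
    total = sum(run.status_counts.values())
    if total == 0:
        return 0.0
    return 100.0 * run.status_counts.get("CLEAN", 0) / total


def coverage_trend(runs: list[EvalRun], domain: str) -> list[tuple[date, float]]:
    """Return (run date, coverage %) points for one domain, oldest first."""
    domain_runs = sorted(
        (r for r in runs if r.domain == domain),
        key=lambda r: r.run_date,
    )
    return [(r.run_date, coverage_pct(r)) for r in domain_runs]
```

Each returned pair is one point on the domain's trend line, so the dashboard would only need to plot the list as-is.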

Acceptance Criteria

  • Per-domain coverage dashboard with trend lines (coverage % over time)
  • "What to curate next" recommendations driven by eval gap analysis (highest-impact MISSING/INCOMPLETE topics)
  • Coverage threshold configuration per domain (e.g., "cash-payments needs 80% CLEAN to be operationally useful")
  • Export eval report as shareable artifact (PDF or markdown)
  • Curation effort estimation based on gap analysis (how many entities/hours to reach target coverage)
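
The recommendation, threshold, and effort criteria above could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed implementation: the `TopicResult` type, the `impact` weight, the `THRESHOLDS_PCT` mapping, and the flat `hours_per_topic` rate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicResult:
    """Hypothetical per-topic eval outcome."""
    topic: str
    status: str    # "CLEAN", "INCOMPLETE", or "MISSING"
    impact: float  # assumed weight, e.g. how often the topic is queried


# Hypothetical per-domain thresholds: percent of topics that must be CLEAN.
THRESHOLDS_PCT = {"cash-payments": 80}


def recommend_next(results: list[TopicResult], limit: int = 3) -> list[TopicResult]:
    """Return the highest-impact MISSING/INCOMPLETE topics, best first."""
    gaps = [r for r in results if r.status in ("MISSING", "INCOMPLETE")]
    return sorted(gaps, key=lambda r: r.impact, reverse=True)[:limit]


def topics_needed(results: list[TopicResult], threshold_pct: int) -> int:
    """Count how many more topics must become CLEAN to reach the threshold."""
    total = len(results)
    clean = sum(1 for r in results if r.status == "CLEAN")
    required = -(-threshold_pct * total // 100)  # integer ceiling division
    return max(0, required - clean)


def estimate_hours(
    results: list[TopicResult], threshold_pct: int, hours_per_topic: float
) -> float:
    """Rough effort estimate, assuming a flat cost per curated topic."""
    return topics_needed(results, threshold_pct) * hours_per_topic
```

For a domain with 10 topics of which 6 are CLEAN, an 80% threshold means 2 more topics must be curated. A real estimate would likely weight MISSING topics more heavily than INCOMPLETE ones; the flat rate is only a placeholder.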

Out of Scope

  • Automated curation (human reviews and approves all changes)
  • Integration with external task management (Jira, Linear)
