feat(import): add evaluator and online eval config import subcommands #780
Merged
jesseturner21 merged 18 commits into main on Apr 7, 2026
Add `agentcore import evaluator` to import existing AWS evaluators into CLI projects. Refactor import types and utilities for extensibility so future resource types require minimal new code.
Changes:
- Add import-evaluator.ts handler with toEvaluatorSpec mapping (LLM-as-a-Judge and code-based evaluators), duplicate detection, and CDK import pipeline
- Enhance getEvaluator API wrapper to extract full evaluatorConfig (model, instructions, ratingScale) and tags from SDK tagged unions
- Add listAllEvaluators pagination helper filtering out built-in evaluators
- Widen ImportableResourceType union and shared utilities for evaluator support
- Add evaluator to TUI import flow (select, ARN input, progress screens)
- Add 17 unit tests covering spec conversion, template lookup, and error cases
Tested end-to-end against real AWS evaluator (bugbash_eval_1775226567-zrDxm7Gpcw) with verified field mapping for all config fields, tags, and deployed state.
The TUI import wizard hardcoded importType as 'memory' for all non-runtime resources, causing evaluator imports to fail with "ARN resource type evaluator does not match expected type memory". Use flow.resourceType instead so the correct handler is dispatched.
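The fix described above can be pictured with a small sketch. The `ImportFlow` shape and both function names are hypothetical (the real TUI flow state is not shown on this page); only `flow.resourceType` and the `'memory'` hardcoding come from the commit message.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
type ImportableResourceType = 'runtime' | 'memory' | 'evaluator';

interface ImportFlow {
  resourceType: ImportableResourceType;
  arn: string;
}

// Before the fix: every non-runtime flow was dispatched as a memory import.
function importTypeBefore(flow: ImportFlow): ImportableResourceType {
  return flow.resourceType === 'runtime' ? 'runtime' : 'memory';
}

// After the fix: the type chosen in the wizard is passed through unchanged.
function importTypeAfter(flow: ImportFlow): ImportableResourceType {
  return flow.resourceType;
}
```

With the old mapping, an evaluator ARN would be checked against the memory handler, which is consistent with the mismatch error quoted in the commit message.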
Add `agentcore import online-eval` to import existing online evaluation
configs from AWS into CLI-managed projects. Follows the same pattern as
runtime, memory, and evaluator imports.
The command extracts the agent reference from the config's service names
(pattern: {agentName}.DEFAULT), maps evaluator IDs to local names or
ARN fallbacks, and runs the full CDK import pipeline.
Also removes incorrect project-prefix stripping from evaluator and
runtime imports — imported resources come from outside the project
and won't have the project prefix.
Constraint: Agent must exist in project runtimes[] before import (schema enforces cross-reference)
Constraint: Evaluators not in project fall back to ARN format to bypass schema validation
Rejected: Loose agent validation | schema writeProjectSpec() enforces runtimes[] cross-reference
Confidence: high
Scope-risk: moderate
Add 'Online Eval Config' option to the interactive import flow so users can import online evaluation configs via the TUI, not just the CLI. Follows the same ARN-only pattern as evaluator and memory imports: select type → enter ARN → import progress → success/error.
Screenshots captured from the TUI import flow showing:
- Import type selection menu with Online Eval Config option
- ARN input screen for online eval config
- ARN input with a real config ARN filled in
This reverts commit cb4c675.
… pattern
Reduce ~1,400 lines of duplicated orchestration across four import handlers (runtime, memory, evaluator, online-eval) to ~600 lines by extracting shared logic into executeResourceImport(). Each resource type now provides a thin descriptor declaring its specific behavior.
Constraint: Public handleImport* function signatures unchanged (TUI depends on them)
Constraint: Factory functions needed for runtime/online-eval to share mutable state between hooks
Rejected: Strategy class hierarchy | descriptor objects are simpler and more composable
Confidence: high
Scope-risk: moderate
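A minimal sketch of the descriptor idea follows. It is not the PR's `executeResourceImport` or `ResourceImportDescriptor`: every name and field here (`ImportDescriptorSketch`, `executeImportSketch`, `createCountingDescriptor`) is hypothetical, and the real orchestrator covers many more steps. It only shows how a shared routine can drive per-resource behavior, and how a factory lets hooks share mutable state through a closure.

```typescript
// Hypothetical descriptor: each resource type declares only what differs.
interface ImportDescriptorSketch<TDetail> {
  resourceType: string;
  fetchDetail(id: string): Promise<TDetail>;
  toSpec(detail: TDetail, localName: string): Record<string, unknown>;
  // Optional hook run before the project config would be written.
  beforeConfigWrite?(detail: TDetail): Promise<void>;
}

type ImportOutcome =
  | { success: true; spec: Record<string, unknown> }
  | { success: false; error: string };

// Shared orchestration: fetch, convert, run the optional hook, report the outcome.
async function executeImportSketch<TDetail>(
  descriptor: ImportDescriptorSketch<TDetail>,
  id: string,
  localName: string
): Promise<ImportOutcome> {
  try {
    const detail = await descriptor.fetchDetail(id);
    const spec = descriptor.toSpec(detail, localName);
    await descriptor.beforeConfigWrite?.(detail);
    return { success: true, spec };
  } catch (err) {
    return { success: false, error: err instanceof Error ? err.message : String(err) };
  }
}

// Factory form: the hook and the accessor share `calls` through the closure.
function createCountingDescriptor(): ImportDescriptorSketch<{ name: string }> & {
  hookCalls(): number;
} {
  let calls = 0;
  return {
    resourceType: 'example',
    fetchDetail: async (id) => ({ name: `remote-${id}` }),
    toSpec: (detail, localName) => ({ name: localName, source: detail.name }),
    beforeConfigWrite: async () => {
      calls += 1;
    },
    hookCalls: () => calls,
  };
}
```

Plain objects keep each resource type's differences visible in one place, which matches the stated reason for rejecting a strategy class hierarchy.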
…-control
Deduplicates identical pagination loops across 4 listAll* functions and identical tag-fetching try/catch blocks across 3 getDetail functions. Also adds optional client param to listEvaluators and listOnlineEvaluationConfigs for connection reuse during pagination. Addresses deferred review feedback from PR #763.
Constraint: evaluator listAll still filters out Builtin.* entries
Confidence: high
Scope-risk: narrow
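One way to deduplicate such loops is a generic helper that takes the page-fetching call as a parameter. The sketch below is an assumption-laden illustration: `Page`, `paginateAll`, `EvaluatorSummarySketch`, and `listAllCustomEvaluators` are hypothetical names, and filtering on an `evaluatorId` that starts with `Builtin.` is a guess at how the "Builtin.*" constraint is applied.

```typescript
// Hypothetical page shape; real SDK responses use operation-specific field names.
interface Page<T> {
  items: T[];
  nextToken?: string;
}

// Generic loop: keep fetching until the service stops returning a token.
async function paginateAll<T>(
  fetchPage: (nextToken?: string) => Promise<Page<T>>
): Promise<T[]> {
  const all: T[] = [];
  let nextToken: string | undefined;
  do {
    const page = await fetchPage(nextToken);
    all.push(...page.items);
    nextToken = page.nextToken;
  } while (nextToken);
  return all;
}

interface EvaluatorSummarySketch {
  evaluatorId: string;
}

// Evaluator-specific wrapper: same loop, plus the built-in filter (assumed prefix check).
async function listAllCustomEvaluators(
  fetchPage: (nextToken?: string) => Promise<Page<EvaluatorSummarySketch>>
): Promise<EvaluatorSummarySketch[]> {
  const all = await paginateAll(fetchPage);
  return all.filter((e) => !e.evaluatorId.startsWith('Builtin.'));
}
```

Passing the fetch function in also leaves room for the caller to reuse a single client across pages, which is what the optional client parameter in the commit is for.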
…rted evaluators
resolveEvaluatorReferences used string-contains matching (evaluatorId.includes(localName)) which only works when the evaluator was deployed by the same project. Imported evaluators with renamed local names never matched, falling back to raw ARNs in the config. Now reads deployed-state.json to build an evaluatorId → localName reverse map and checks it first, before the string-contains heuristic.
Constraint: Deployed state may not exist yet (first import); .catch() handles this gracefully
Rejected: Passing deployed state through descriptor interface | only online-eval needs this
Confidence: high
Scope-risk: narrow
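The lookup order described in this commit can be sketched as below. The shape of the deployed state (`DeployedEvaluatorsSketch`) and the function name are hypothetical; the actual layout of deployed-state.json is not shown here. The sketch only demonstrates checking a reverse map first, then the string-contains heuristic, then falling back to the ARN.

```typescript
// Hypothetical slice of deployed state: local name -> deployed evaluator ID.
type DeployedEvaluatorsSketch = Record<string, { evaluatorId: string }>;

function resolveEvaluatorReferenceSketch(
  evaluatorId: string,
  evaluatorArn: string,
  localNames: string[],
  deployed?: DeployedEvaluatorsSketch // may be absent on a first import
): string {
  // 1. Reverse map built from deployed state: evaluatorId -> localName.
  const reverse = new Map<string, string>();
  for (const [localName, entry] of Object.entries(deployed ?? {})) {
    reverse.set(entry.evaluatorId, localName);
  }
  const fromState = reverse.get(evaluatorId);
  if (fromState !== undefined) return fromState;

  // 2. Heuristic: works only when the ID embeds the local name.
  const fromHeuristic = localNames.find((name) => evaluatorId.includes(name));
  if (fromHeuristic !== undefined) return fromHeuristic;

  // 3. Fall back to the raw ARN.
  return evaluatorArn;
}
```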
…ring import
Evaluators referenced by ENABLED online eval configs are locked by the service (lockedForModification=true), causing CFN import to fail when it tries to apply stack-level tags. Now the evaluator import detects the lock, temporarily disables referencing online eval configs, performs the import, then re-enables them.
Constraint: Re-enable runs in finally block so configs are restored on both success and failure
Constraint: Only disables configs that actually reference this specific evaluator
Rejected: Refuse import with manual guidance | user can't pause configs not yet in project
Confidence: high
Scope-risk: moderate
…ators during import" This reverts commit 5839391.
…se ARN-only references
Evaluators locked by an online eval config cannot be CFN-imported because CloudFormation triggers a post-import TagResource call that the resource handler rejects. Instead of stripping tags from the import template, block the import with a clear error and suggestion to use import online-eval. Online eval config import now always references evaluators by ARN rather than resolving to local names, since the evaluators cannot be imported into the project alongside the config.
Constraint: CFN IMPORT triggers TagResource which fails on locked evaluators
Rejected: Strip Tags from import template | still fails on some resource types
Confidence: high
Scope-risk: narrow
…ime has custom name extractAgentName() derives the AWS runtime name from the OEC service name pattern, but this fails to match when the runtime was imported with --name since the project spec stores the local name. Now falls back to listing runtimes to find the runtime ID, then looks up the local name in deployed-state.json.
…lving agent
CDK constructs set the OEC service name as "{projectName}_{agentName}.DEFAULT".
extractAgentName() strips ".DEFAULT" but not the project prefix, so the
lookup fails against local runtime names. Now strips the prefix as a fast
path before falling back to the deployed-state API lookup.
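The fast path described here amounts to two string operations. This is a minimal sketch assuming the `{projectName}_{agentName}.DEFAULT` pattern quoted in the commit; the function name and signature are illustrative and the deployed-state fallback is left out.

```typescript
// Sketch only: strips ".DEFAULT" and, when present, the "{projectName}_" prefix.
function extractAgentNameSketch(serviceName: string, projectName: string): string | undefined {
  const suffix = '.DEFAULT';
  if (!serviceName.endsWith(suffix)) return undefined;

  const base = serviceName.slice(0, -suffix.length);
  const prefix = `${projectName}_`;

  // Service names created outside this project carry no prefix, so keep them as-is.
  return base.startsWith(prefix) ? base.slice(prefix.length) : base;
}
```

When the result still does not match a local runtime name (for example, a runtime imported with `--name`), the commits above fall back to looking the runtime up via deployed state.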
getEvaluator() now catches ResourceNotFoundException and ValidationException from the SDK and rethrows a clear message instead of exposing the raw regex validation error.
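A sketch of this kind of error translation is below. It relies on service exceptions exposing their type through `Error.name`; the helper name and the message wording are hypothetical, not the text the CLI actually prints.

```typescript
// Sketch: map two known SDK exception names to one readable message.
function toFriendlyEvaluatorError(err: unknown, evaluatorId: string): Error {
  const name = err instanceof Error ? err.name : '';
  if (name === 'ResourceNotFoundException' || name === 'ValidationException') {
    // Hypothetical wording; hides the raw regex validation text from the user.
    return new Error(`Evaluator "${evaluatorId}" was not found or the ID is not valid.`);
  }
  // Anything else is passed through untouched.
  return err instanceof Error ? err : new Error(String(err));
}
```

A caller would wrap the SDK call in `try`/`catch` and `throw toFriendlyEvaluatorError(err, id)` from the `catch` branch.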
import online-eval used a naive regex to extract the config ID from the ARN, skipping resource type, region, and account validation. Now uses parseAndValidateArn like all other import commands. Added an ARN resource type mapping to handle the online-eval vs online-evaluation-config mismatch between ImportableResourceType and the ARN format.
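To illustrate the checks a validating parser adds over a bare regex, here is a self-contained sketch. It is not the project's `parseAndValidateArn`: the function name, signature, and error wording are hypothetical, and only the `online-eval` to `online-evaluation-config` mapping is stated in the commit (the `runtime` entry in particular is assumed).

```typescript
type ImportableResourceType = 'runtime' | 'memory' | 'evaluator' | 'online-eval';

// CLI-facing type -> ARN resource segment. Entries other than online-eval are assumptions.
const ARN_RESOURCE_TYPE_SKETCH: Record<ImportableResourceType, string> = {
  runtime: 'runtime',
  memory: 'memory',
  evaluator: 'evaluator',
  'online-eval': 'online-evaluation-config',
};

function parseAndValidateArnSketch(
  arn: string,
  expected: ImportableResourceType,
  target: { region: string; account: string }
): string {
  // arn:partition:service:region:account:resource-type/resource-id
  const match = /^arn:[^:]+:[^:]+:([^:]+):(\d{12}):([^/]+)\/(.+)$/.exec(arn);
  if (!match) throw new Error(`Not a valid ARN: ${arn}`);

  const [, region, account, resourceType, resourceId] = match;
  if (resourceType !== ARN_RESOURCE_TYPE_SKETCH[expected]) {
    throw new Error(`ARN resource type ${resourceType} does not match expected type ${expected}`);
  }
  if (region !== target.region) throw new Error(`ARN region ${region} does not match ${target.region}`);
  if (account !== target.account) throw new Error(`ARN account ${account} does not match ${target.account}`);
  return resourceId;
}
```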
Package Tarball
How to install:
npm install https://github.com/aws/agentcore-cli/releases/download/pr-780-tarball/aws-agentcore-0.6.0.tgz
Hweinstock previously approved these changes on Apr 7, 2026 and left a comment:
Nice! I like the refactor, and I think it will make it easier to add more import logic. A few nit comments, and one edge case that seems unlikely.
- Add `red` to ANSI constants, replace inline escape codes
- Type GetEvaluatorResult.level as EvaluationLevel at boundary
- Combine ARN_RESOURCE_TYPE_MAP, collectionKeyMap, idFieldMap into single RESOURCE_TYPE_CONFIG to prevent drift
- Export IMPORTABLE_RESOURCES as const array, derive type from it, replace || chains with .includes()
- Fix samplingPercentage === 0 false positive (use == null)
- Document closure state sequencing contract on descriptor hooks
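Two of the items above are small TypeScript idioms that a sketch makes concrete. The array contents follow the four resource types named elsewhere in this PR; `isImportableResource` and `isSamplingMissing` are hypothetical helper names, not functions from the change.

```typescript
// Single source of truth: the union type is derived from the array.
const IMPORTABLE_RESOURCES = ['runtime', 'memory', 'evaluator', 'online-eval'] as const;
type ImportableResourceType = (typeof IMPORTABLE_RESOURCES)[number];

// Replaces chains like `t === 'runtime' || t === 'memory' || ...`.
function isImportableResource(value: string): value is ImportableResourceType {
  return (IMPORTABLE_RESOURCES as readonly string[]).includes(value);
}

// `== null` is true only for null and undefined, so 0 counts as a provided value.
function isSamplingMissing(samplingPercentage: number | null | undefined): boolean {
  return samplingPercentage == null;
}
```

A truthiness check such as `!samplingPercentage` would treat a 0% sampling rate as missing, which is the false positive the fix targets.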
The test exercised a defensive fallback in toEvaluatorSpec for an empty level string, but now that GetEvaluatorResult.level is typed as EvaluationLevel, the boundary cast in getEvaluator prevents this case from ever reaching toEvaluatorSpec.
Hweinstock approved these changes on Apr 7, 2026 and left a comment:
LGTM! Thanks for addressing nits.
Description
Add `agentcore import evaluator` and `agentcore import online-eval` commands to import existing AWS evaluators (LLM-as-a-Judge and code-based) and online evaluation configs into CLI projects. Includes TUI import wizard support (select screen, ARN input, progress tracking).
Extracts a generic import orchestrator (`executeResourceImport`) using a descriptor pattern, reducing ~1,400 lines of duplicated orchestration across import handlers to ~600 lines. Each resource type provides a thin descriptor declaring its specific behavior (AWS APIs, CFN types, spec conversion, hooks).
New features
- `import-evaluator.ts`: `toEvaluatorSpec` mapping, evaluator descriptor
- `import-online-eval.ts`
- `agentcore-control.ts`: `getEvaluator` to extract full config; add `listAllEvaluators`, `listAllOnlineEvaluationConfigs`, `getOnlineEvaluationConfig`
- `import-evaluator.test.ts`
- `import-online-eval.test.ts`
Refactoring: generic import orchestrator
- `resource-import.ts`: `executeResourceImport<TDetail, TSummary>()` generic orchestrator owning the full 10-step import sequence
- `types.ts`: `ResourceImportDescriptor` interface and `BeforeWriteContext` hook type
- `constants.ts`: `NAME_REGEX` and `ANSI` constants (previously copy-pasted in each handler)
- `import-memory.ts`
- `import-evaluator.ts`
- `import-online-eval.ts`: `beforeConfigWrite` state sharing (~170 lines)
- `import-runtime.ts`: `beforeConfigWrite` + `rollbackExtra` hooks (~130 lines)
- `command.ts`, `import-utils.ts`
Related Issue
Documentation PR
Type of Change
Testing
How have you tested the change?
- `npm run test:unit` and `npm run test:integ`
- `npm run typecheck`
- `npm run lint`
- If I modified `src/assets/`, I ran `npm run test:update-snapshots` and committed the updated snapshots
Additional E2E testing:
- `import evaluator`: LLM-as-a-Judge with rating scale, tags
- `import memory`: 3 strategies (semantic, summarization, user_preference), tags
- `import runtime`: CodeZip Python 3.12, lifecycle config, execution role, source copy
- `import online-eval`: agent reference resolved, evaluator local name resolved via deployed state, sampling 50%, enableOnCreate
- `agentcore status` shows all imported resources with correct state
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.