Skip to content

test(e2e): add A/B test and all-gateway-target lifecycle suites#1584

Open
jariy17 wants to merge 7 commits into
mainfrom
test/e2e-ab-test-and-gateway-targets
Open

test(e2e): add A/B test and all-gateway-target lifecycle suites#1584
jariy17 wants to merge 7 commits into
mainfrom
test/e2e-ab-test-and-gateway-targets

Conversation

@jariy17

@jariy17 jariy17 commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

What

Adds two e2e suites covering gaps in the AWS-boundary coverage (per e2e-tests/README.md's targets list).

ab-test-lifecycle.test.ts

A/B test lifecycle in config-bundle mode:
create → gateway → config-bundle v1 → deploy → config-bundle v2 → deploy → online-eval (Builtin evaluator) → deploy → run ab-test → view (RUNNING) → pause (PAUSED) → resume (RUNNING) → promote (STOPPED).

Asserts the live execution state from AWS via view ab-test --json after each control-plane call (the execution status surfaces in lifecycleStatus). A/B tests are fire-and-forget jobs, not project resources, so cleanup archives the test explicitly before teardown.

httpgateway-all-targets.test.ts

One protocolType: None (HTTP) gateway — a superset that can host every target type — carrying all deployable gateway-target types in a single stack:

http-runtime, mcp-server, lambda-function-arn, api-gateway, open-api-schema, smithy-model, web-search.

Verifies config (agentcore.json), deployed-state.json, and status. Omitted: connector (private-beta FMKB CFN resource type) and passthrough (unreleased gated feature).

Fixtures

fixtures/gateway-targets/setup_target_prereqs.py idempotently provisions the external resources targets can't self-provision (a Lambda + a REST API), mirroring the existing fixtures/import/ boto3 pattern. Static schema fixtures (openapi.json, smithy.json, lambda-tools.json) are copied into the project at test time.

Notes

  • The api-key credential provider is deleted in afterAll — it's created by a pre-deploy SDK call (not CloudFormation), so it survives stack teardown and would otherwise leak.
  • The api-gateway REST API method sets an operationName (→ operationId), required by the AgentCore target; the Lambda gets a resource-based invoke policy for the gateway service.

Testing

Both suites pass end-to-end against AWS: 22/22 (10 ab-test + 12 gateway), with clean teardown and no leaked stacks or credential providers.

Add two e2e suites covering gaps in the AWS-boundary coverage:

- ab-test-lifecycle.test.ts: A/B test in config-bundle mode — create
  project + gateway + two config-bundle versions + online-eval, deploy,
  run ab-test, then pause/resume/promote asserting live execution state
  from AWS via `view ab-test`. Archives the test (a job, not a project
  resource) before teardown.

- httpgateway-all-targets.test.ts: one protocolType:None (HTTP) gateway
  hosting every deployable target type (http-runtime, mcp-server,
  lambda-function-arn, api-gateway, open-api-schema, smithy-model,
  web-search), deployed in a single stack. Omits connector (private-beta
  FMKB) and passthrough (unreleased gated feature).

External resources targets can't self-provision (Lambda, REST API) are
created idempotently by fixtures/gateway-targets/setup_target_prereqs.py,
mirroring the fixtures/import/ boto3 pattern. The api-key credential
provider is deleted in afterAll since it is created by a pre-deploy SDK
call, not CloudFormation, and so survives stack teardown.

Both suites pass against AWS (22/22).
@jariy17 jariy17 requested a review from a team June 18, 2026 16:41
@github-actions github-actions Bot added the size/l PR size: L label Jun 18, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 18, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Package Tarball

aws-agentcore-0.20.0.tgz

How to install

gh release download pr-1584-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.20.0.tgz

Comment thread e2e-tests/fixtures/gateway-targets/setup_target_prereqs.py Fixed
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 36.92% 13501 / 36559
🔵 Statements 36.21% 14356 / 39642
🔵 Functions 31.49% 2310 / 7335
🔵 Branches 30.69% 8917 / 29052
Generated in workflow #3716 for commit 6a6287a by the Vitest Coverage Report Action

@agentcore-cli-automation agentcore-cli-automation left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look well-structured overall — clean sequential flows, real AWS assertions, and no excessive mocking. Two main issues worth addressing before merge, both around the shared-resource design in setup_target_prereqs.py.

The e2e-tests-full.yml workflow runs a cli-build: [preview, ga] matrix in parallel against the same AWS account. Since the prereq script uses deterministic resource names (e2e-gwtarget-fn, e2e-gwtarget-api-v2) for idempotent reuse, both matrix jobs can race on the create paths. See inline comments for specifics.

print(f" role not ready, retrying ({attempt + 1})...")
time.sleep(5)
continue
raise

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race condition with parallel CI runs.

e2e-tests-full.yml runs cli-build: [preview, ga] in parallel against the same AWS account. If both matrix jobs reach ensure_lambda() before either has created the function, both will pass get_function (NotFound), both will call create_function, and one will fail with ResourceConflictException. The retry loop here only catches role-propagation errors ("cannot be assumed" / "InvalidParameterValue") — a conflict will be re-raised and fail the e2e setup.

Options:

  1. Catch lam.exceptions.ResourceConflictException (or check e.response['Error']['Code'] == 'ResourceConflictException') and re-fetch the function via get_function to recover the ARN.
  2. Suffix the resource name with something like the GitHub run ID / matrix value so the two jobs don't share resources (matches the import-resources.test.ts fixture pattern, but loses the cross-run reuse).
  3. Serialize the two matrix jobs (e.g. use a workflow-level concurrency group) so they don't run simultaneously.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e5980bd (option 1). ensure_lambda now catches ResourceConflictException from a concurrent create_function and recovers the ARN via get_function, so the loser of the race reuses the winner's function instead of failing. Cross-run reuse is preserved.

for item in api.get_rest_apis(limit=500).get("items", []):
if item.get("name") == REST_API_NAME:
print(f"REST API exists: {item['id']}")
return item["id"], REST_API_STAGE

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same race produces duplicate REST APIs (silent leak).

Unlike Lambda, API Gateway happily lets you create multiple REST APIs with the same name. If two parallel CI jobs both find no API matching REST_API_NAME and both call create_rest_api, you end up with two REST APIs named e2e-gwtarget-api-v2 in the account. Subsequent runs will pick whichever one get_rest_apis returns first — so this won't necessarily fail tests, but it leaks duplicates over time and the count grows without bound.

Options (same as for Lambda, pick one consistently):

  1. Suffix the API name with a CI-job-stable identifier so each parallel runner gets its own.
  2. Serialize the matrix jobs (workflow concurrency group).
  3. Detect duplicates here and clean them up before returning — uglier than the other two.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e5980bd. ensure_rest_api now reconciles to the lowest-id API and deletes any duplicates — both before the create path (reuse) and after it (in case two jobs both created one) — so parallel jobs converge on a single API and duplicates never accumulate, while keeping idempotent cross-run reuse. Verified against AWS: after a run, exactly one e2e-gwtarget-api-v2 remains.

@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 18, 2026
…racefully

- ab-test-target-based.test.ts: A/B test in target-based mode — two
  http-runtime gateway-targets on named runtime endpoints (control,
  treatment), each scoped by its own online-eval. run → pause → resume →
  promote (promote version-bumps the control endpoint to treatment's).
  Complements the existing config-bundle-mode suite. Passes AWS 11/11.

- setup_target_prereqs.py + httpgateway-all-targets.test.ts: catch
  AccessDenied per resource and emit null so a restricted IAM role (e.g.
  the e2e-github-actions CI role, which lacks lambda:*/apigateway:*)
  skips the lambda-function-arn / api-gateway targets instead of failing
  the whole suite.
@github-actions github-actions Bot added size/xl PR size: XL and removed size/l PR size: L labels Jun 18, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
Mirror the ab-test-target-based.test.ts naming now that both A/B test
modes have their own suite; the old "lifecycle" name was ambiguous.
@github-actions github-actions Bot added size/xl PR size: XL and removed size/xl PR size: XL labels Jun 18, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
Add the passthrough target type (gated behind ENABLE_GATED_FEATURES, with
gateway-iam-role outbound auth + signing service). add/deploy now run with
the gate enabled. The CDK fully synthesizes passthrough targets, so it
deploys in the single stack alongside the other 7 types. Suite now 13/13
against AWS. Only connector (private-beta FMKB) remains omitted.
@github-actions github-actions Bot added size/xl PR size: XL and removed size/xl PR size: XL labels Jun 18, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
Addresses PR review on the preview/ga matrix racing on shared deterministic
resource names:

- Remove unused `import sys`.
- ensure_lambda: catch ResourceConflictException from a concurrent
  create_function and recover the ARN via get_function instead of failing.
- ensure_rest_api: API Gateway allows duplicate names, so parallel jobs
  could leak duplicate REST APIs. Reconcile to the lowest-id API (delete the
  rest) both before and after the create path, so runs converge on one API
  and duplicates never accumulate — while keeping cross-run reuse.
@github-actions github-actions Bot added size/xl PR size: XL and removed size/xl PR size: XL labels Jun 18, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
print(f"Deleting duplicate REST API created by a parallel job: {dupe}")
try:
api.delete_rest_api(restApiId=dupe)
except api.exceptions.ClientError:
Previously `abTestId` was assigned on the last line of the `run ab-test`
retry callback, after the mode/variants assertions. If `run ab-test`
created the test (exit 0) but a later assertion in that block threw,
abTestId stayed unset and afterAll skipped `archive ab-test` — leaking the
test (which global-setup does not reap). The retry wrapper could also
re-create it up to 3×.

Capture abTestId immediately after asserting json.id (inside retry, before
any later throw), and move the deterministic mode/variants checks outside
the retry so a mismatch can't trigger a re-create. Now any post-create
failure still archives the test in afterAll. Both suites still pass (21/21).
@github-actions github-actions Bot added size/xl PR size: XL and removed size/xl PR size: XL labels Jun 18, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
The CI e2e job failed: under the restricted e2e-github-actions role,
ensure_lambda raised AccessDeniedException as expected, but _try()'s
`except botocore.exceptions.ClientError` threw `NameError: name 'botocore'
is not defined` because only boto3 was imported — crashing the fixture
instead of gracefully skipping the lambda/api-gateway targets.

Local runs missed this because the Admin role never triggers AccessDenied,
so the except clause never executed. Verified the fix by running the
fixture under the actual e2e-github-actions role: both targets now skip
cleanly (null) and the script exits 0.
@github-actions github-actions Bot added size/xl PR size: XL and removed size/xl PR size: XL labels Jun 18, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
@agentcore-devx-automation

Copy link
Copy Markdown
Contributor

Claude Security Review: no high-confidence findings. (run)

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/xl PR size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants