
refactor: split engine serve bootstrap from CLI #322

Closed
tonyxiao wants to merge 747 commits into main from tx/engine-bin-split

Conversation

@tonyxiao
Collaborator

Summary

  • split the engine into dedicated sync-engine and sync-engine-serve binaries with shared bootstrap and shared API server startup
  • make apps/engine/src/api/index.ts side-effect-free and route Docker/dev/test callsites to the new dist/bin/serve.js and src/bin/* entrypoints
  • keep dynamic connector discovery on the full CLI path while forcing sync-engine-serve onto bundled-only connectors

Test plan

  • pnpm build
  • pnpm lint
  • pnpm exec vitest run src/api/index.test.ts src/__tests__/bin-serve.test.ts
  • node apps/engine/dist/bin/sync-engine.js --help
  • node apps/engine/dist/bin/serve.js on ports 3000 and 4000, verified with /health
  • pnpm --filter @stripe/sync-engine test (fails in this environment because src/__tests__/sync.test.ts requires Docker and the Docker daemon is unavailable)

tonyxiao and others added 30 commits March 24, 2026 22:08
The plugin loading test installs from local tarballs, not GitHub
Packages. The global STRIPE_NPM_REGISTRY env causes .npmrc to redirect
@stripe/* to GitHub Packages, breaking resolution. Override to npmjs.org
so the committed .npmrc is effectively a no-op for this test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
…uild

- e2e-plugin-loading.sh: write blank .npmrc + unset STRIPE_NPM_REGISTRY
  in temp dir so local tarballs resolve without hitting GitHub Packages
- docker-image.yml: fall back to local Docker build when ghcr.io image
  not found (workflow_dispatch may run before build.yml)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
…ilure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Replace all Commander usage with citty across packages/ts-cli,
packages/protocol, apps/sync-engine, apps/stateless, and apps/stateful.

- `createCliFromSpec` now returns `CommandDef` with `meta`/`rootArgs` options
- `createConnectorCli` returns `CommandDef` with typed subcommands
- `--no-state` / `--no-connectors-from-path` → `noState` / `noConnectorsFromPath`
  boolean args (citty doesn't support Commander-style negation syntax)
- Default serve action moved to root `run()` handler (replaces `isDefault: true`)
- All tests updated: `parseAsync` → `runCommand`, Commander introspection
  APIs replaced with citty's `subCommands`/`meta`/`args` properties

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
bin entries now point at source (./src/*.ts) for dev, with
publishConfig.bin pointing at dist for npm publish. This fixes
pnpm install warnings about missing dist/ files before build.

Affected: source-stripe, destination-postgres, destination-google-sheets,
sync-engine, stateless, stateful.

Also: E2E Docker test now fails hard if ghcr.io image is missing
instead of silently building locally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
publishConfig.bin is pnpm-only — npm ignores it and drops .ts bin
entries as invalid, leaving published packages with no bin at all.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
The npm job now packs each @stripe/* package from GitHub Packages
and republishes the tarballs to npmjs.org. No checkout, no install,
no build — just artifact promotion between registries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Sparse-checkout packages/ and apps/ at the target SHA, then scan
package.json files to find publishable (non-private) @stripe/* names.
No hardcoded package list.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Query /orgs/{owner}/packages?package_type=npm to find all npm packages
linked to this repo. Zero checkout, zero build — pure registry-to-registry
artifact promotion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
- Replace `pnpm -r publish || true` (silently swallowed all errors) with an
  explicit script that skips already-published versions gracefully but fails
  hard on real errors (auth, network, malformed package)
- Delete outdated scripts/release-package.sh (superseded by release.yml)
- Update scripts/README.md to reference the new script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
…engine

- Delete packages/stateful-sync (@stripe/sync-lib-stateful) and
  apps/stateful (@stripe/sync-engine-stateful) — multi-tenant machinery
  not needed for single-tenancy
- Merge packages/stateless-sync (@stripe/sync-lib-stateless) into
  apps/engine/src/lib/ (engine, pipeline, ndjson, resolver, exec helpers)
- Merge apps/stateless (@stripe/sync-engine-stateless) API + CLI into
  apps/engine/src/ (api/app.ts, cli/index.ts, version.ts)
- Move apps/sync-engine to apps/engine
- Rename packages/store-postgres to packages/state-postgres
  (@stripe/sync-state-postgres)
- Add StateStore interface + file/memory implementations
  (apps/engine/src/lib/state-store.ts)
- Update all imports across tests, supabase, docs, and scripts
- Delete stateful integration test and webhook e2e test (relied on
  deleted StatefulSync class)

Before: 13 workspace packages (protocol, stateless-sync, stateful-sync,
  store-postgres, util-postgres, ts-cli, source-stripe,
  destination-postgres, destination-google-sheets, sync-engine,
  stateless, stateful, supabase)

After: 9 workspace packages (protocol, sync-engine, state-postgres,
  util-postgres, ts-cli, source-stripe, destination-postgres,
  destination-google-sheets, supabase)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
…REGISTRY

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Moves the 35-line inline script out of release.yml into a standalone
script. The workflow now sparse-checkouts just the script and runs it.
Testable locally: GITHUB_TOKEN=... NPM_TOKEN=... bash scripts/promote-to-npmjs.sh

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Both release promote steps now delegate to scripts. Use jq for JSON
parsing in promote-to-npmjs.sh (jq is pre-installed on GitHub runners).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
…from package.json

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Refactor Temporal worker from 8 fine-grained activities (healthCheck,
sourceSetup, destinationSetup, backfillPage, writeBatch, processEvent,
sourceTeardown, destinationTeardown) to 3 that match the stateless API
surface: setup, sync, teardown. The sync activity calls /run which
handles the full read→write pipeline, with NDJSON streaming and
heartbeats.

Add DEFAULT_WORKFLOW_ID env var to webhook-bridge so it can signal a
known workflow directly, bypassing account-based routing (needed for
non-Connect Stripe test keys that produce events without an account field).

Add tests/e2e-temporal-webhook.sh — a worker-agnostic shell script that
tests the full production topology: stripe listen → webhook bridge →
Temporal workflow → stateless API → Postgres. Verifies both backfill
and live webhook event processing end-to-end.

Fix e2e test paths from deleted apps/stateless to apps/engine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
The engine calls it `state` everywhere (SyncParams.state, StateMessage).
The Temporal layer was using `cursors` for the same concept, creating a
gratuitous naming mismatch. Align to `state` throughout: SyncConfig.state,
SyncResult.state, WorkflowStatus.state, and all workflow/activity code.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
- Change source shebangs from #!/usr/bin/env tsx → #!/usr/bin/env node.
  tsc 5.9 preserves shebangs verbatim, so dist output is correct
  without post-processing.
- Delete fix-shebangs.mjs (no longer needed)
- Delete add-js-extensions.mjs (one-time codemod, already run)
- Delete link-bins.sh (obsoleted by package.json bin entries)
- Delete scripts/README.md (stale, references moved files)
- Move d2.mjs to docs/ (diagram utility, not a build script)
- Remove fix-shebangs from all package.json build scripts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Restores the stateful sync layer (deleted in consolidation) as a new
@stripe/sync-service package at apps/service/. Includes:

- Credential CRUD and sync config CRUD with referential integrity
- SyncService class with push_event webhook fan-out to N syncs
- Auth retry with credential refresh (up to 2 retries)
- Shared-resource teardown logic (only removes webhook if no other syncs share the credential)
- 17 OpenAPI routes via Hono (health, credentials, syncs, sync ops, webhooks)
- File-system store implementations (credentials.json, syncs.json, state.json, logs.ndjson)
- Static Zod schemas with .passthrough() instead of dynamic buildSchemas()
- Pure exports from api/ and cli/ — only bin/ scripts have side effects
- 8 passing tests covering CRUD, referential integrity, and webhook ingress

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
The engine's full read→write pipeline endpoint is a sync operation,
not a generic "run". Aligns the route name with the Temporal activity
name and the domain language.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
Each store now uses a directory with one JSON file per entity
(e.g., credentials/cred_abc.json) instead of a single monolithic
JSON file. Adds 16 dedicated unit tests for all four FS store
implementations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
- Extract CLI definition into src/cli/command.ts (pure export, no side effects)
- Slim src/cli/index.ts down to a 4-line bin wrapper calling runMain()
- Replace './app' with './api', drop './source-test' and './destination-test'
- Update conformance test to import test connectors statically from '.'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
… e2e

- Move 6 demo/utility scripts from tests/ → demo/
  (read-from-stripe, write-to-postgres, write-to-sheets,
   stripe-to-postgres, stripe-to-google-sheets, reset-postgres)
- Delete tests/e2e-temporal-webhook.sh (Temporal not deployed)
- Delete tests/integration/ (empty — no test files, just config)
- Remove integration test step from e2e workflow

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
Zod v4 strips unknown keys by default; .catchall(z.unknown()) is the
replacement for .passthrough() to preserve connector-specific fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
tonyxiao and others added 29 commits April 13, 2026 21:59
Made-with: Cursor
Committed-By-Agent: cursor
publish_npm → publish_github_registry (Publish to GitHub Registry)
publish_npmjs → promote_to_npm_registry (Promote to npm Registry)

Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor
* feat: hard time limit cutoff + client disconnect cancellation

Two-phase time limit in takeLimits: soft deadline (1s before) for
graceful return, hard deadline (1s after) via Promise.race to force-cut
blocked operations. AbortSignal threaded from HTTP handler through
engine, source connectors, and all the way to individual fetch() calls.

- EofPayload: add 'aborted' reason, cutoff ('soft'|'hard'), elapsed_ms
- takeLimits: manual iterator + Promise.race for hard deadline + signal
- ndjsonResponse: cancel() hook for Bun.serve() disconnect detection
- HTTP handlers: wireDisconnect() for Node outgoing.close + onCancel
- Stripe client: AbortSignal.any() combines pipeline + per-request timeout
- Subprocess source: child.kill() on signal abort
- Distinct log lines: SYNC_TIME_LIMIT_SOFT/HARD, SYNC_CLIENT_DISCONNECT, SYNC_ABORTED
- Unit tests for soft/hard cutoff, abort signal, elapsed_ms
- Black-box e2e test parameterized over Node/Bun/Docker runtimes
- CI: disconnect test steps for Node + Bun
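The two-phase cutoff above can be sketched as a wrapper over an async iterator. This is a minimal illustration, not the PR's actual `takeLimits` implementation; the name `takeWithTimeLimit` and the `onCutoff` callback shape are hypothetical:

```typescript
// Soft deadline: checked between items, so a responsive source gets a
// graceful return. Hard deadline: Promise.race force-cuts a next() call
// that never resolves (e.g. a blocked upstream fetch).
async function* takeWithTimeLimit<T>(
  source: AsyncIterator<T>,
  softMs: number,
  hardMs: number,
  onCutoff: (cutoff: "soft" | "hard", elapsedMs: number) => void,
): AsyncGenerator<T> {
  const start = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const hardTimer = new Promise<"hard">((resolve) => {
    timer = setTimeout(() => resolve("hard"), hardMs);
  });
  try {
    while (true) {
      if (Date.now() - start >= softMs) {
        onCutoff("soft", Date.now() - start);
        await source.return?.(); // graceful teardown
        return;
      }
      const winner = await Promise.race([source.next(), hardTimer]);
      if (winner === "hard") {
        onCutoff("hard", Date.now() - start);
        source.return?.(); // best effort; the source may be stuck
        return;
      }
      if (winner.done) return;
      yield winner.value;
    }
  } finally {
    clearTimeout(timer);
  }
}
```

A responsive source hits the soft path; a source blocked in `next()` can only be cut by the hard race, which is why both phases exist.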

Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor

* fix: add hono to e2e deps, exclude disconnect test from e2e_stripe job

Made-with: Cursor
Committed-By-Agent: cursor

* fix: improve disconnect test robustness and add error logging

- Use syntactically valid postgres URL in pipeline header
- Add response body logging on non-200 for CI debugging
- Add process exit detection in waitForServer
- Increase poll interval for server health checks

Made-with: Cursor
Committed-By-Agent: cursor

* fix: add required schema field to postgres destination config in test

Made-with: Cursor
Committed-By-Agent: cursor

* fix: capture stdout (not just stderr) from engine process

Pino logger writes to stdout by default, not stderr. The test was only
capturing stderr which was empty, causing all log assertions to fail.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: relax hard time limit e2e assertion for concurrent requests

The Stripe connector launches concurrent backfill requests, so the mock
request count can be high even with a short time limit. Assert that no
new requests arrive after the hard deadline fires instead.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: skip docker disconnect test unless DISCONNECT_TEST_DOCKER is set

Docker test builds an image from scratch (~2min+) which exceeds the
beforeAll timeout in the main CI job. Only run when explicitly opted in.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: address review findings — retry leak, listener leak, remote signal, docker target

1. AbortError is no longer retryable in withHttpRetry — only TimeoutError
   is. Previously a pipeline abort would trigger up to 5 retries with
   exponential backoff, defeating the hard cutoff.

2. takeLimits abort listener is created once outside the loop instead of
   per-iteration, preventing O(messages) listener accumulation on the
   AbortSignal.

3. createRemoteEngine now forwards opts.signal to the underlying HTTP
   fetch so remote-engine callers get real cancellation.

4. Docker test helper uses --target engine (not the default service image)
   and accepts ENGINE_IMAGE env var for pre-built images.
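Finding 1 can be sketched as follows. This is a simplified stand-in for the PR's `withHttpRetry` (the options shape here is assumed): only timeouts are retried with exponential backoff, while an abort surfaces on the first throw so it cannot defeat the hard cutoff:

```typescript
// Hypothetical error class standing in for the real timeout error type.
class TimeoutError extends Error {}

async function withHttpRetry<T>(
  fn: () => Promise<T>,
  { retries = 5, baseDelayMs = 10 } = {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Only TimeoutError is retryable; AbortError (and everything else)
      // propagates immediately instead of burning up to `retries` backoffs.
      const retryable = err instanceof TimeoutError;
      if (!retryable || attempt >= retries) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```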

Made-with: Cursor
Committed-By-Agent: cursor

* test: wire disconnect coverage to explicit Node/Bun/Docker runtimes

Run the disconnect suite in the intended CI jobs only, wait for log lines
instead of checking the buffer synchronously, and normalize the Docker mock
URL for container reachability. The Docker job now runs the disconnect suite
against the prebuilt engine image.

Made-with: Cursor
Committed-By-Agent: cursor

* test: run docker disconnect coverage with host networking in CI

The Docker disconnect suite now uses host networking in CI so the engine
container can reach the host mock server via localhost. Non-CI Docker runs
still use bridge mode with host.docker.internal.

Made-with: Cursor
Committed-By-Agent: cursor

* ci: wire esbuild binary path shell test into main test job

Rebased v2 added e2e/esbuild-binary-path.test.sh. Hook it back into the
main CI test job so the workflow sanity check passes and the shell test
actually runs in fresh-install CI.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: move signal out of Zod schema into type extension

AbortSignal is not serializable and doesn't belong in a Zod schema
that could be used for JSON schema generation or wire validation.
Keep it as a TypeScript-only type intersection on SourceReadOptions.

Made-with: Cursor
Committed-By-Agent: cursor

* refactor: move signal out of SourceReadOptions and Source.read params

signal is a runtime-only cancellation concern, not a serializable option.
It now lives as a separate positional argument on:
  - Engine.pipeline_read(pipeline, opts?, input?, signal?)
  - Engine.pipeline_sync(pipeline, opts?, input?, signal?)
  - Source.read(params, $stdin?, signal?)

wireDisconnect renamed to createConnectionAbort — returns the controller
instead of accepting startedAt and logging internally. The HTTP handler
owns the logging via its own onDisconnect callback.

Made-with: Cursor
Committed-By-Agent: cursor

* feat: wire signal through pipeline_write and Destination.write

pipeline_write now respects client disconnect the same way pipeline_read
and pipeline_sync do. Signal is threaded as a separate positional arg
through Engine.pipeline_write, Destination.write, and the HTTP handler.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: propagate iterator teardown through async iterable helpers

- channel: add return() and onReturn hook
- split: propagate branch return() back to the source iterator
- merge: call return() on child iterators when consumer stops
- add direct teardown tests for channel, split, and merge
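The `merge` case above can be illustrated with a deliberately naive merge (the real helper races pending `next()` calls; this round-robin version only demonstrates the teardown contract):

```typescript
// When the consumer stops early (break / return()), the generator's
// finally block runs and propagates return() to every child iterator,
// releasing upstream resources. Teardown rejections are swallowed.
async function* merge<T>(...sources: AsyncIterable<T>[]): AsyncGenerator<T> {
  const iters = sources.map((s) => s[Symbol.asyncIterator]());
  try {
    let open = iters.length;
    const finished = new Set<number>();
    while (open > 0) {
      for (let i = 0; i < iters.length; i++) {
        if (finished.has(i)) continue;
        const r = await iters[i].next();
        if (r.done) { finished.add(i); open--; }
        else yield r.value;
      }
    }
  } finally {
    // Runs on normal completion AND when the consumer breaks early.
    await Promise.all(iters.map((it) => it.return?.().catch(() => {})));
  }
}
```

`for await … break` calls `return()` on the merged iterator, which is exactly the path the fix covers.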

Made-with: Cursor
Committed-By-Agent: cursor

* fix: restore disconnect log on Node close path and update exec arity test

- createConnectionAbort now invokes a pure logging callback before aborting
  so Node close-driven disconnects emit SYNC_CLIENT_DISCONNECT again
- update exec arity test for createSourceFromExec.read(params, stdin?, signal?)

Made-with: Cursor
Committed-By-Agent: cursor
* feat: real-time per-table sync status

Refactor state into SyncState { source, destination, engine } sections,
add trace/progress for global aggregates and enriched stream_status for
per-stream progress, enrich EOF with combined snapshot, and update the
dashboard with live per-table status.

Protocol:
- SectionState + SyncState with source/destination/engine sections
- TraceStreamStatus extended with cumulative/run/window record counts
  and throughput rates (all optional, backward compat)
- TraceProgress for global aggregates (elapsed, throughput, checkpoints)
- EofPayload with nested global_progress + stream_progress sections

Engine:
- trackProgress() pipeline stage counts records (via data-path tap),
  emits periodic enriched stream_status + trace/progress, enriches EOF
- x-source-state header renamed to x-state with backward compat
- SourceReadOptions.state now SyncState; extracts .source for connectors

Service:
- sourceState -> syncState (SyncState) in Temporal workflow
- drainMessages captures engine state from enriched EOF

Dashboard:
- Per-stream table shows status badge + rows synced column
- Global stats bar: total rows, throughput, instantaneous rate, elapsed
- NDJSON streaming for live updates during active sync

Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor

* fix: address review findings -- compat shim, dashboard safety, demo state

- Add sourceState -> syncState backward compat shim in pipeline-workflow.ts
  so in-flight Temporal workflows don't lose cursor state on deploy
- Remove dashboard auto-sync (POST /pipeline_sync on page view was starting
  duplicate syncs); replace with polling for now
- Update demo scripts and workflow test stubs to pass SyncState shape

Made-with: Cursor
Committed-By-Agent: cursor

* fix: raise confidence with safe progress source and compat

- persist latest EOF progress snapshot into the service pipeline record so the
dashboard can safely render read-only progress via polling
- normalize legacy state shapes inside the engine library, not just over HTTP
- accept deprecated X-Source-State as a runtime alias while keeping x-state as
  the primary header
- add focused progress tests and update remaining direct callers/tests to the
  new SyncState expectations
- align dashboard dev config with esnext optimizeDeps so local browser
  validation works on this machine

Made-with: Cursor
Committed-By-Agent: cursor

* chore: update plans and changelog

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* chore: regenerate openapi specs after v2 merge

Adds cutoff, elapsed_ms fields and aborted reason to EofPayload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* chore: format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…sages (#287)

- Move success log inside try block so it only fires on actual success
- Re-throw after yielding error trace so engine sees the failure
- Fall back to err.code when err.message is empty (Bun pg driver emits
  ECONNREFUSED in code with no message)
- Check err.code in isTransient so connection errors classify correctly
- Suppress duplicate error trace in logApiStream when destination already
  yielded one


Committed-By-Agent: claude

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: include aggregated source state in EOF payload

Consumers can now read the final accumulated state directly from the EOF
message instead of tracking individual source_state messages. The new
`state` field (SectionState) aggregates per-stream and global state with
last-write-wins semantics, matching the existing persistState behavior.
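The last-write-wins aggregation can be sketched like this (the `SourceStateMsg` and `SectionState` shapes here are assumptions for illustration, not the protocol's exact types):

```typescript
interface SourceStateMsg {
  stream?: string;                 // absent -> global state
  state: Record<string, unknown>;
}

interface SectionState {
  global: Record<string, unknown>;
  streams: Record<string, Record<string, unknown>>;
}

// Fold every source_state message into one snapshot; a later message for
// the same stream (or for global) simply overwrites the earlier one.
function aggregateState(messages: SourceStateMsg[]): SectionState {
  const out: SectionState = { global: {}, streams: {} };
  for (const msg of messages) {
    if (msg.stream === undefined) out.global = msg.state;
    else out.streams[msg.stream] = msg.state;
  }
  return out;
}
```

Consumers read this one snapshot off the EOF message instead of replaying the individual `source_state` messages themselves.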

Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor

* fix: preserve full sync state in EOF payload

The EOF state now starts from the incoming SyncState and applies run-time source
and engine updates on top, so resumed and no-op runs return a persistable final
state instead of dropping untouched sections. Add regression coverage for state
merging, no-op resumes, and engine-only count updates.

Made-with: Cursor
Committed-By-Agent: cursor

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
#281)

* probe segment count, make estimate, remove backfill concurrency setting

* address review: withRateLimit wrapper, smooth segments, probe-as-first-page

Change 1: Extract withRateLimit(listFn, rateLimiter) wrapper so rate-limit
boilerplate is applied once at construction time. paginateSegment,
sequentialBackfillStream, and the probe no longer take rateLimiter
as a parameter — structurally impossible to forget.

Change 2: Replace discrete 1/10/50 segment tiers with smooth
segmentCountFromDensity(timeProgress) = ceil(1/timeProgress), capped at
MAX_SEGMENTS=50. Decouple concurrency from segment count — mergeAsync
now always uses MAX_CONCURRENCY=15 regardless of segment count. Also
adds division-by-zero guard for degenerate ranges.
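The smooth curve in Change 2 is small enough to state directly. A sketch, assuming the constants named above:

```typescript
const MAX_SEGMENTS = 50;

// timeProgress: fraction of the backfill time range covered by the probe
// page. Denser streams (small fraction) get more segments, capped at
// MAX_SEGMENTS; a degenerate (zero-width) range falls back to the cap.
function segmentCountFromDensity(timeProgress: number): number {
  if (!(timeProgress > 0)) return MAX_SEGMENTS; // division-by-zero guard
  return Math.min(MAX_SEGMENTS, Math.ceil(1 / timeProgress));
}
```

Unlike the old 1/10/50 tiers, this varies continuously: a probe covering half the range yields 2 segments, a tenth yields 10, and anything past 1/50 saturates at the cap.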

Change 3a: Replace probeSegmentCount with probeAndBuildSegments that
returns the first page data alongside the segments. The probe call now
uses a created filter (forward-compatible if range narrows later). The
caller yields probe records directly — zero wasted API calls. For sparse
streams the entire dataset comes from the probe with no re-fetch.

Also: export ListResult from openapi barrel, remove buildSegments default
parameter, assert probe call shape in coexist test.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: only yield probe data for single-segment streams

The probe fetches from the full time range (newest-first), so its items
may span multiple segment time ranges. Only yield probe data directly
when numSegments === 1 (sparse/single-page streams). For multi-segment
backfills, the probe is used purely for density estimation — each segment
fetches its own data independently to avoid cursor/range mismatches.

Made-with: Cursor
Committed-By-Agent: cursor

* consolidate backfill tuning constants in rate-limiter.ts

Move MAX_SEGMENTS and MAX_CONCURRENCY alongside DEFAULT_MAX_RPS so all
three throughput/concurrency knobs live in one file with comments
explaining what each controls and how they relate to each other.

Made-with: Cursor
Committed-By-Agent: cursor

---------

Co-authored-by: Tony Xiao <tonyx.ca@gmail.com>
The per-stream record counts were already available in the
stream_progress section (via run_record_count). Drop the redundant
legacy field from the protocol, pipeline, and progress tracking.

Made-with: Cursor
Committed-By-Agent: cursor
Replace the dead `incomplete` stream status with the actual failure_type
values (transient_error, system_error, config_error, auth_error) in both
TraceStreamStatus and EofStreamProgress.

Source now emits source_state with the error status when a stream fails,
so the caller can persist it. On the next run:
- transient_error → retried (same as pending)
- system_error / config_error / auth_error → skipped (permanent)
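The resume-time decision above reduces to a small classifier (the helper name `shouldRunStream` is hypothetical; the status values are the ones listed above):

```typescript
type StreamStatus =
  | "pending" | "complete"
  | "transient_error" | "system_error" | "config_error" | "auth_error";

function shouldRunStream(status: StreamStatus | undefined): boolean {
  // undefined: the stream has never reported status -> run it.
  if (status === undefined || status === "pending") return true;
  if (status === "transient_error") return true; // retried, same as pending
  // complete, system_error, config_error, auth_error -> skipped (permanent)
  return false;
}
```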

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
DANGEROUSLY_VERBOSE_LOGGING captured response bodies but included them
in the info-level "request end" log. Now logged separately at debug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
The full EOF payload (global_progress + stream_progress) is now logged
at info level when the sync completes. Response bodies remain at debug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
Shows stream counts (complete/in-progress/errored/pending), per-stream
row counts for active streams, and error details for failed streams.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
Info level now only has: request start, request end, EOF summary + payload.
Stream-level started/completed logs moved to debug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
- Initialize streamStatus from persisted engine state on resume so
  streams completed in prior runs show as complete, not running
- Save stream status alongside cumulative_record_count in engine state
- Never default to 'running' — streams without a known status are
  excluded from stream_progress and stream_status traces
- source_state without prior stream_status infers 'started' not 'running'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
- Engine derives stream status from trace/error failure_type (only
  overrides non-complete streams) — no extra source messages needed
- Seed streamStatus from source state on resume so streams the source
  skips (already complete/errored) show correct status in progress

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude
Add an alternative to passing config via x-pipeline/x-state headers:
callers can now send Content-Type: application/json with a body like
{pipeline, state?, body?} containing native JSON objects (no string
escaping). The content-type header determines which mode is used —
application/json for the new path, anything else for the existing
header + NDJSON behavior. Purely additive, fully backward compatible.
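The dispatch can be sketched as below. The function names and the injected `getHeader`/`readJsonBody` accessors are illustrative, not the engine's real handler signature:

```typescript
interface PipelineRequest {
  pipeline: unknown;
  state?: unknown;
  body?: unknown;
}

function isJsonBodyMode(contentType: string | null): boolean {
  if (!contentType) return false;
  // Compare the bare MIME type: parameters and case are irrelevant, and
  // near-misses like "application/json-seq" must NOT match.
  const mime = contentType.split(";")[0].trim().toLowerCase();
  return mime === "application/json";
}

async function readPipelineConfig(
  getHeader: (name: string) => string | null,
  readJsonBody: () => Promise<unknown>,
): Promise<PipelineRequest> {
  if (isJsonBodyMode(getHeader("content-type"))) {
    // New path: native JSON objects in the body, no string escaping.
    return (await readJsonBody()) as PipelineRequest;
  }
  // Legacy path: JSON-encoded strings in headers, NDJSON in the body.
  return {
    pipeline: JSON.parse(getHeader("x-pipeline") ?? "null"),
    state: JSON.parse(getHeader("x-state") ?? "null"),
  };
}
```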

Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor
* refactor: move stream cancellation to iterator return

Make AsyncIterator.return() the public cancellation contract across the engine, protocol helpers, and Stripe source so early teardown interrupts in-flight waits without threading a public AbortSignal through stream APIs.

Constraint: Keep HTTP disconnect handling bridged through ndjson response teardown
Rejected: Continue exposing public stream AbortSignal parameters | it duplicates the iterator cancellation surface
Confidence: high
Scope-risk: broad
Made-with: Cursor
Committed-By-Agent: cursor

* cleanup: drop onCancel from ndjsonResponse and remove DOMException

onCancel was redundant — ReadableStream.cancel() already calls
iterator.return() for Bun, and signal handles Node.js disconnects.
Unify on a single mechanism.

Replace the three duplicated getAbortError/DOMException helpers with
plain signal.reason (now always a plain Error set by withAbortOnReturn).

Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor
Made-with: Cursor
Committed-By-Agent: cursor

* fix: suppress unhandled rejections from fire-and-forget iterator.return()

Add .catch(() => {}) to void iterator.return() calls in ndjsonResponse
abort listener and pipeline closeIteratorInBackground, preventing
unhandled rejections if the upstream iterator rejects during teardown.

Made-with: Cursor
Committed-By-Agent: cursor
* feat: restore JSON body config mode without breaking CLI paths

Reintroduce application/json config bodies for the engine API while keeping
legacy header and NDJSON flows working. Fix OpenAPIHono and the OpenAPI-based
CLI generation so typed header schemas, streaming endpoints, and connector
loading all continue to work with the new spec.

Made-with: Cursor
Committed-By-Agent: cursor

* refactor: centralize mode-specific pipeline access

Move JSON-vs-header requiredness into shared helpers so handlers stop
merging two conditional code paths into nullable locals.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: tighten JSON content-type validation

Keep JSON validation strict for JSON routes while avoiding false positives for NDJSON and JSON-header alternatives.

Made-with: Cursor
Committed-By-Agent: cursor

* fix: align isJsonBody with middleware content-type logic

Export isApplicationJsonContentType from hono-zod-openapi and reuse it
in app.ts so handler and middleware agree on what counts as JSON body
mode. Fixes mixed-case and json-seq false positives/negatives.

Made-with: Cursor
Committed-By-Agent: cursor
…provements (#290)

* fix: improve retry logging, fault-tolerant getAccount, curl trace format

Add labels to retry log messages for easier debugging, include error
messages in retry output, make getAccount gracefully fall back on failure,
and format HTTP trace logs as reproducible curl commands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Committed-By-Agent: claude

* Adding temp backfill command

* scripts: add mitmweb intercept env setup and test

Sources mitmweb-env.sh to route Node/Bun/curl traffic through mitmweb
on localhost:8080, auto-starting it with the internal upstream proxy if
needed. mitmweb-env.test.sh verifies all three runtimes in parallel.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* source-stripe: assert --use-env-proxy when proxy env vars are set

Node's built-in fetch (undici) silently bypasses HTTP_PROXY/HTTPS_PROXY
without the --use-env-proxy flag. assertUseEnvProxy() throws at startup
if a proxy is configured but the flag is absent, preventing silent
proxy bypass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* Revert "source-stripe: assert --use-env-proxy when proxy env vars are set"

This reverts commit eb4e061.

* scripts: standalone assert-use-env-proxy with tests

Fails fast if HTTPS_PROXY/HTTP_PROXY is set without --use-env-proxy,
preventing silent proxy bypass in Node's built-in fetch (undici).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* scripts: convert assert-use-env-proxy to TypeScript with bun:test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* ts-cli: add assertUseEnvProxy with unit and subprocess tests

Throws when HTTPS_PROXY/HTTP_PROXY is set but --use-env-proxy is absent
(Node's built-in fetch silently bypasses the proxy without it). Bun is
exempt since it always respects proxy env natively.

Subprocess tests spawn real node/bun processes to verify the flag check
works end-to-end, not just in unit logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
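A minimal sketch of that check, with the env/argv sources made injectable for testing (the real `assertUseEnvProxy` may read them differently — the signature here is an assumption):

```typescript
// Hypothetical sketch of assertUseEnvProxy: throw at startup when a proxy
// env var is set but the Node process was not started with --use-env-proxy.
// Bun is exempt because it honors proxy env vars natively.
export function assertUseEnvProxy(
  env: Record<string, string | undefined> = process.env,
  execArgv: string[] = process.execArgv,
  isBun: boolean = typeof (globalThis as {Bun?: unknown}).Bun !== "undefined",
): void {
  if (isBun) return;
  const proxy = env.HTTPS_PROXY ?? env.https_proxy ?? env.HTTP_PROXY ?? env.http_proxy;
  if (!proxy) return;
  if (!execArgv.includes("--use-env-proxy")) {
    throw new Error(
      `Proxy env var is set (${proxy}) but Node was started without ` +
        `--use-env-proxy; built-in fetch would silently bypass the proxy.`,
    );
  }
}
```

Failing fast here is the whole point: a silently bypassed proxy looks like success until traffic shows up on the wrong egress path.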

* ts-cli: rename proxy -> env-proxy, move test next to source

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* ts-cli: move ndjson and config tests next to source files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* scripts: improve mitmweb-env.sh; add mitmweb test to CI

- Auto-install mitmproxy via pip if not found
- Auto-detect upstream proxy from http_proxy/https_proxy env vars
  instead of hardcoding the Stripe egress proxy URL; falls back to
  direct mode in CI and clean environments
- Add mitmweb proxy intercept test step to CI test job (httpbin.org)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* ci: move mitmweb test to its own parallel job

Avoids adding latency to the main test job.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* ts-cli: skip bun subprocess test gracefully when bun is not installed

Fixes CI failure in the test job where bun isn't available yet.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* scripts: add ws WebSocket test to mitmweb intercept suite

Tests that ws connections route through mitmweb via explicit
HttpsProxyAgent (ws ignores HTTP_PROXY and --use-env-proxy).
CI mitmweb job now installs pnpm deps so ws/https-proxy-agent resolve.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

* source-stripe: replace fetchWithProxy with native fetch + tracedFetch

Node 24 --use-env-proxy makes undici/fetch respect HTTPS_PROXY natively,
so the manual ProxyAgent dispatcher injection in fetchWithProxy is
redundant. Replace all call sites with tracedFetch (curl trace logging
only, no proxy logic) and remove withFetchProxy/fetchWithProxy entirely.

getHttpsProxyAgentForTarget is kept — ws does not respect --use-env-proxy
and still needs an explicit HttpsProxyAgent for WebSocket connections.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
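The curl-style trace format mentioned here (and in the earlier retry-logging commit) can be sketched as follows — a hypothetical formatter, not the actual `tracedFetch` internals; the helper name and quoting rules are assumptions:

```typescript
// Hypothetical sketch: format a request as a copy-pasteable curl command so
// any traced HTTP call can be reproduced from the logs.
export function toCurl(
  url: string,
  init: {method?: string; headers?: Record<string, string>; body?: string} = {},
): string {
  const parts = [`curl -X ${init.method ?? "GET"}`];
  for (const [name, value] of Object.entries(init.headers ?? {})) {
    parts.push(`-H '${name}: ${value}'`);
  }
  if (init.body) parts.push(`--data '${init.body}'`);
  parts.push(`'${url}'`);
  return parts.join(" ");
}
```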

* fix: restore getAccount retry and update getAccountCreatedTimestamp test

- Restore getAccount to use requestWithRetry (e0fb332 unintentionally
  removed retry by switching to request directly)
- Update test that expected getAccount failure to halt backfill: since
  getAccountCreatedTimestamp now swallows errors (fault-tolerant), backfill
  proceeds with STRIPE_LAUNCH_TIMESTAMP fallback instead of failing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude
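The fault-tolerant fallback behavior being tested can be sketched like this — a hypothetical shape for `getAccountCreatedTimestamp`; the `STRIPE_LAUNCH_TIMESTAMP` name comes from the commit above, but its value and the function signature here are assumptions:

```typescript
// Assumed fallback value for illustration only; the real constant lives in
// the source-stripe package.
const STRIPE_LAUNCH_TIMESTAMP = 1_283_299_200;

// Hypothetical sketch: swallow getAccount failures and fall back to the
// launch timestamp so backfill proceeds instead of halting.
export async function getAccountCreatedTimestamp(
  getAccount: () => Promise<{created: number}>,
): Promise<number> {
  try {
    return (await getAccount()).created;
  } catch {
    return STRIPE_LAUNCH_TIMESTAMP;
  }
}
```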

* fix: add --use-env-proxy to Docker entrypoints and assertUseEnvProxy at startup

- Add --use-env-proxy flag to both engine and service Docker ENTRYPOINT so
  fetch/undici respects HTTPS_PROXY env var in container deployments
- Call assertUseEnvProxy() at startup in both CLI entry points so proxy
  misconfiguration (proxy set without --use-env-proxy) is caught immediately
  with a clear error rather than silently bypassing the proxy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Committed-By-Agent: claude

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* avoid duplicates in sheets

* fix

Brings the v2 monorepo rewrite into main, completely replacing the
deprecated single-package repo. The resulting tree is identical to v2.

Committed-By-Agent: claude

* add more context to sync-engine responses

* fix: extract only debug headers and sanitize non-JSON previews

Avoid materializing all response headers via Object.fromEntries — use
pickDebugHeaders to pluck only the diagnostically relevant keys directly
from the Headers object. Escape newlines in non-JSON response previews
so HTML error pages don't blow up log lines. Update test assertion to
match the new enriched error message format.

Made-with: Cursor
Committed-By-Agent: cursor
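A minimal sketch of the two helpers described above — the allow-list contents and helper names are assumptions for illustration:

```typescript
// Assumed allow-list of diagnostically relevant headers.
const DEBUG_HEADER_NAMES = ["request-id", "stripe-version", "content-type"];

// Hypothetical sketch of pickDebugHeaders: read only the allow-listed keys
// from the Headers object instead of materializing every header with
// Object.fromEntries.
export function pickDebugHeaders(headers: Headers): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of DEBUG_HEADER_NAMES) {
    const value = headers.get(name);
    if (value !== null) picked[name] = value;
  }
  return picked;
}

// Hypothetical sketch of the preview sanitizer: truncate and escape
// newlines so a multi-line HTML error page stays on one log line.
export function sanitizePreview(body: string, maxLength = 200): string {
  return body.slice(0, maxLength).replace(/\r/g, "\\r").replace(/\n/g, "\\n");
}
```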

---------

Co-authored-by: Tony Xiao <tonyx.ca@gmail.com>

* fix: raise sync-engine serve header limit

Committed-By-Agent: codex
Co-authored-by: codex <noreply@openai.com>

* ci: run docker header size regression

Committed-By-Agent: codex
Co-authored-by: codex <noreply@openai.com>

* test: avoid blocking vitest worker in docker header test

Committed-By-Agent: codex
Co-authored-by: codex <noreply@openai.com>

* refactor: type header size server options

Committed-By-Agent: codex
Co-authored-by: codex <noreply@openai.com>

* ci: avoid duplicate header size test runs

Committed-By-Agent: codex
Co-authored-by: codex <noreply@openai.com>

---------

Co-authored-by: codex <noreply@openai.com>

Move the server startup path into dedicated bin and API server modules so Docker and local dev can start the bundled-only HTTP server without paying the citty/OpenAPI CLI bootstrap cost. Keep the interactive CLI on its own binary while making the engine API export surface side-effect-free.

Made-with: Cursor
Committed-By-Agent: cursor
@tonyxiao tonyxiao closed this Apr 18, 2026