diff --git a/.agent/specs/sqlite-vfs-staging-cache-ttl.md b/.agent/specs/sqlite-vfs-staging-cache-ttl.md
new file mode 100644
index 0000000000..6316bbcde0
--- /dev/null
+++ b/.agent/specs/sqlite-vfs-staging-cache-ttl.md
@@ -0,0 +1,82 @@
+# SQLite VFS Staging Cache TTL Plan
+
+Date: 2026-05-03
+
+This plan changes the SQLite VFS page cache from a broad second-level pager cache into a short-lived staging cache for speculative pages. Demand pages fetched for `xRead` should be handed to SQLite and then forgotten by the VFS.
+
+## Goals
+
+- Avoid retaining pages in VFS memory after SQLite has already received them through `xRead`.
+- Keep startup preload and read-ahead useful by retaining speculative pages briefly.
+- Evict speculative pages on first successful target read so TTL is only the fallback for unused preloads.
+- Keep lazy loading correct when all cache and preload features are disabled.
+- Treat page 1 as staging data after `xRead` while keeping parsed page-size and database-size metadata.
+
+## Non-Goals
+
+- Do not change the remote `get_pages` protocol.
+- Do not change SQLite pager settings.
+- Do not add read pools back.
+- Do not implement persisted preload hints in this branch.
+
+## Current Behavior
+
+- `resolve_pages` classifies fetched pages as `Target` when SQLite requested them and `Prefetch` when they were predicted.
+- `fetch_initial_pages_for_registration` seeds startup pages as `Startup`.
+- `should_cache_page` allows target, prefetch, and startup caching based on `SqliteVfsPageCacheMode`.
+- Page 1 is always cacheable.
+- Early protected pages live in `protected_page_cache`, which is an `scc::HashMap` with no TTL.
+
+## Proposed Behavior
+
+- Target pages should not be inserted into the VFS page cache by default.
+- Target reads should remove speculative read pages from the cache after bytes are copied to the caller.
+- Prefetch pages should be inserted into a TTL cache.
+- Startup preload pages should be inserted into the same TTL cache.
+- Commit completion should stage dirty pages in a separate TTL cache so SQLite can reread its own writes without retaining them permanently.
+- Page 1 should follow the same staging rule as other pages after `xRead`. The VFS keeps parsed page-size and database-size metadata, and it can synthesize the empty page-1 header again before the first commit when depot has no database yet.
+- Protected cache should no longer protect speculative pages forever. It should be removed or left unused in favor of the TTL cache.
+
+## Configuration
+
+- Add `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS`.
+- Default to a short TTL such as `30000` ms.
+- A value of `0` disables speculative retention while preserving lazy target fetches.
+- Keep `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off` as the stronger kill switch for all non-page-1 VFS caching.
+- Do not use `RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES` to pin VFS page bytes beyond `xRead`.
+
+## Implementation Plan
+
+1. Extend `SqliteOptimizationFlags` and `VfsConfig` with a bounded staging TTL field.
+2. Build `page_cache` with `time_to_live(Duration::from_millis(ttl_ms))` when TTL is nonzero.
+3. Split cache insertion semantics so `PageCacheInsertKind::Target` is not retained by default.
+4. Add an explicit `evict_pages_after_target_read` helper that removes every consumed page from both normal and protected speculative caches.
+5. Call that helper after `io_read` copies returned bytes into SQLite's buffer.
+6. Evict dirty page numbers from the staging cache after commit completion.
+7. Rework `protected_page_cache` so it cannot pin speculative pages forever.
+8. Keep `seed_main_page` behavior intact for parsed page 1 metadata.
+9. Update metrics naming only if needed. `page_cache_entries` can continue to report retained VFS entries.
+
+## Expected Cache Matrix
+
+| Page source | Retained after fetch | Evicted on target read | TTL fallback |
+| --- | --- | --- | --- |
+| Target `xRead` miss | No | Not needed | No |
+| Read-ahead prefetch | Yes | Yes | Yes |
+| Startup preload | Yes | Yes | Yes |
+| Page 1 | Yes during bootstrap or preload | Yes | Yes when retained |
+| Dirty write buffer | Existing behavior | Existing behavior | No |
+
+## Tests
+
+- Add a VFS test proving a target read miss does not increase retained VFS cache entries.
+- Add a VFS test proving prefetch pages are retained before use and removed after target read.
+- Add a VFS test proving startup preload pages are retained briefly and removed after target read.
+- Add a VFS test proving `VFS_STAGING_CACHE_TTL_MS=0` still lazily fetches pages.
+- Add a VFS test proving `VFS_PAGE_CACHE_MODE=off` still lazily fetches pages and does not retain non-page-1 pages.
+- If practical, use Tokio time pause/advance to verify TTL expiry deterministically instead of sleeping.
+
+## Open Questions
+
+- Should target retention remain available as an explicit benchmark mode, or should we remove target caching from the shipped matrix?
+- Should `VFS_PROTECTED_CACHE_PAGES` be deprecated now that VFS pages are staging-only?
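Reviewer note: the retention rules in the spec above (speculative pages are staged with a TTL, a target read consumes and evicts them, a target miss is never retained, and a TTL of `0` disables staging) can be sketched with the standard library alone. This is a minimal illustration, not the patch's implementation: the patch builds its caches with `moka`, and `StagingCache`, `stage`, `take`, `purge_expired`, and `read_page` are hypothetical names introduced here. The clock is passed in explicitly so expiry can be checked without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the VFS staging cache (the patch itself uses moka).
pub struct StagingCache {
    ttl: Duration,
    /// Page number -> (expiry deadline, page bytes).
    pages: HashMap<u32, (Instant, Vec<u8>)>,
}

impl StagingCache {
    pub fn new(ttl_ms: u64) -> Self {
        Self {
            ttl: Duration::from_millis(ttl_ms),
            pages: HashMap::new(),
        }
    }

    /// Stage a speculative (prefetch or startup) page.
    /// A zero TTL disables retention entirely.
    pub fn stage(&mut self, pgno: u32, bytes: Vec<u8>, now: Instant) {
        if self.ttl.is_zero() {
            return;
        }
        self.pages.insert(pgno, (now + self.ttl, bytes));
    }

    /// Consume a staged page for a target read. The entry is always removed;
    /// the bytes are returned only if the deadline has not passed.
    pub fn take(&mut self, pgno: u32, now: Instant) -> Option<Vec<u8>> {
        let (deadline, bytes) = self.pages.remove(&pgno)?;
        if now < deadline {
            Some(bytes)
        } else {
            None
        }
    }

    /// TTL fallback for staged pages that were never read.
    pub fn purge_expired(&mut self, now: Instant) {
        self.pages.retain(|_, (deadline, _)| now < *deadline);
    }

    pub fn len(&self) -> usize {
        self.pages.len()
    }
}

/// Read-path sketch: serve from staging when possible, otherwise fetch lazily.
/// A fetched target page is handed to the caller and not retained.
pub fn read_page(
    cache: &mut StagingCache,
    pgno: u32,
    now: Instant,
    fetch: impl FnOnce(u32) -> Vec<u8>,
) -> Vec<u8> {
    match cache.take(pgno, now) {
        Some(bytes) => bytes,
        None => fetch(pgno),
    }
}
```

In this sketch the `fetch` closure stands in for the remote `get_pages` call. Removing the entry inside `take` keeps "evict on first target read" and "TTL as fallback" in one place, which mirrors the cache matrix rows for read-ahead and startup pages.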
diff --git a/.github/workflows/rust.yml b/.github/workflows/rust.yml index ee8af6ae99..75bdc11780 100644 --- a/.github/workflows/rust.yml +++ b/.github/workflows/rust.yml @@ -90,10 +90,10 @@ jobs: run: rivetkit-rust/packages/rivetkit-core/scripts/check-event-driven-drains.sh - name: Check - run: cargo check --all-targets --all-features + run: cargo check --workspace --exclude rivetkit-wasm env: # Deny warnings - RUSTFLAGS: --cfg tokio_unstable -D warnings + RUSTFLAGS: --cfg tokio_unstable -D warnings -A unsafe-op-in-unsafe-fn # test: # name: Test diff --git a/Cargo.lock b/Cargo.lock index cea8c345bd..2f99fef242 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -6073,6 +6073,8 @@ dependencies = [ "tokio", "tokio-util", "tracing", + "tracing-logfmt", + "tracing-stackdriver", "tracing-subscriber", "vbare", ] diff --git a/docker/build/darwin-arm64.Dockerfile b/docker/build/darwin-arm64.Dockerfile index a5737bd9f9..73dac27a8a 100644 --- a/docker/build/darwin-arm64.Dockerfile +++ b/docker/build/darwin-arm64.Dockerfile @@ -10,6 +10,7 @@ ARG BUILD_MODE=release ARG BUILD_FRONTEND=false ARG VITE_APP_API_URL=__SAME__ ARG VITE_FEATURE_FLAGS= +ARG RUST_TOOLCHAIN=1.91.1 ENV BINDGEN_EXTRA_CLANG_ARGS_aarch64_apple_darwin="--sysroot=/root/osxcross/target/SDK/MacOSX11.3.sdk -isystem /root/osxcross/target/SDK/MacOSX11.3.sdk/usr/include" \ CFLAGS_aarch64_apple_darwin="-B/root/osxcross/target/bin" \ @@ -32,6 +33,10 @@ ENV RUSTC_WRAPPER=sccache \ WORKDIR /build COPY . . 
+RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" && \ + rustup target add aarch64-apple-darwin + RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \ export NODE_OPTIONS="--max-old-space-size=8192" && \ export SKIP_NAPI_BUILD=1 && \ diff --git a/docker/build/darwin-x64.Dockerfile b/docker/build/darwin-x64.Dockerfile index dbd2819ec4..bdcded8d0e 100644 --- a/docker/build/darwin-x64.Dockerfile +++ b/docker/build/darwin-x64.Dockerfile @@ -10,6 +10,7 @@ ARG BUILD_MODE=release ARG BUILD_FRONTEND=false ARG VITE_APP_API_URL=__SAME__ ARG VITE_FEATURE_FLAGS= +ARG RUST_TOOLCHAIN=1.91.1 ENV BINDGEN_EXTRA_CLANG_ARGS_x86_64_apple_darwin="--sysroot=/root/osxcross/target/SDK/MacOSX11.3.sdk -isystem /root/osxcross/target/SDK/MacOSX11.3.sdk/usr/include" \ CFLAGS_x86_64_apple_darwin="-B/root/osxcross/target/bin" \ @@ -32,6 +33,10 @@ ENV RUSTC_WRAPPER=sccache \ WORKDIR /build COPY . . +RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" && \ + rustup target add x86_64-apple-darwin + RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \ export NODE_OPTIONS="--max-old-space-size=8192" && \ export SKIP_NAPI_BUILD=1 && \ diff --git a/docker/build/linux-arm64-gnu.Dockerfile b/docker/build/linux-arm64-gnu.Dockerfile index 6c2c3dae61..bf8c3b4b1c 100644 --- a/docker/build/linux-arm64-gnu.Dockerfile +++ b/docker/build/linux-arm64-gnu.Dockerfile @@ -10,6 +10,7 @@ ARG BUILD_MODE=release ARG BUILD_FRONTEND=false ARG VITE_APP_API_URL=__SAME__ ARG VITE_FEATURE_FLAGS= +ARG RUST_TOOLCHAIN=1.91.1 ENV RUSTFLAGS="--cfg tokio_unstable" ENV RUSTC_WRAPPER=sccache \ @@ -19,6 +20,10 @@ ENV RUSTC_WRAPPER=sccache \ WORKDIR /build COPY . . 
+RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" && \ + rustup target add aarch64-unknown-linux-gnu + RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \ export NODE_OPTIONS="--max-old-space-size=8192" && \ export SKIP_NAPI_BUILD=1 && \ diff --git a/docker/build/linux-arm64-musl.Dockerfile b/docker/build/linux-arm64-musl.Dockerfile index a54a908db3..344a7c7743 100644 --- a/docker/build/linux-arm64-musl.Dockerfile +++ b/docker/build/linux-arm64-musl.Dockerfile @@ -10,6 +10,7 @@ ARG BUILD_MODE=release ARG BUILD_FRONTEND=false ARG VITE_APP_API_URL=__SAME__ ARG VITE_FEATURE_FLAGS= +ARG RUST_TOOLCHAIN=1.91.1 ENV OPENSSL_DIR=/musl-aarch64 \ OPENSSL_INCLUDE_DIR=/musl-aarch64/include \ @@ -25,6 +26,10 @@ ENV RUSTC_WRAPPER=sccache \ WORKDIR /build COPY . . +RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" && \ + rustup target add aarch64-unknown-linux-musl + RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \ export NODE_OPTIONS="--max-old-space-size=8192" && \ export SKIP_NAPI_BUILD=1 && \ diff --git a/docker/build/linux-x64-gnu.Dockerfile b/docker/build/linux-x64-gnu.Dockerfile index 6137632a51..328184a859 100644 --- a/docker/build/linux-x64-gnu.Dockerfile +++ b/docker/build/linux-x64-gnu.Dockerfile @@ -17,6 +17,7 @@ ARG BUILD_MODE=release ARG BUILD_FRONTEND=false ARG VITE_APP_API_URL=__SAME__ ARG VITE_FEATURE_FLAGS= +ARG RUST_TOOLCHAIN=1.91.1 ENV RUSTFLAGS="--cfg tokio_unstable" @@ -27,6 +28,10 @@ ENV RUSTC_WRAPPER=sccache \ WORKDIR /build COPY . . +RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" && \ + rustup target add x86_64-unknown-linux-gnu + # Build frontend if building engine with frontend enabled. 
RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \ export NODE_OPTIONS="--max-old-space-size=8192" && \ diff --git a/docker/build/linux-x64-musl.Dockerfile b/docker/build/linux-x64-musl.Dockerfile index 48ed2fab3c..1151e49072 100644 --- a/docker/build/linux-x64-musl.Dockerfile +++ b/docker/build/linux-x64-musl.Dockerfile @@ -10,6 +10,7 @@ ARG BUILD_MODE=release ARG BUILD_FRONTEND=false ARG VITE_APP_API_URL=__SAME__ ARG VITE_FEATURE_FLAGS= +ARG RUST_TOOLCHAIN=1.91.1 ENV OPENSSL_DIR=/musl-x86_64 \ OPENSSL_INCLUDE_DIR=/musl-x86_64/include \ @@ -24,6 +25,10 @@ ENV RUSTC_WRAPPER=sccache \ WORKDIR /build COPY . . +RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" && \ + rustup target add x86_64-unknown-linux-musl + RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \ export NODE_OPTIONS="--max-old-space-size=8192" && \ export SKIP_NAPI_BUILD=1 && \ diff --git a/docker/build/windows-x64.Dockerfile b/docker/build/windows-x64.Dockerfile index 5e2061cfbe..b3fc8ed1a0 100644 --- a/docker/build/windows-x64.Dockerfile +++ b/docker/build/windows-x64.Dockerfile @@ -16,6 +16,7 @@ ARG BUILD_MODE=release ARG BUILD_FRONTEND=false ARG VITE_APP_API_URL=__SAME__ ARG VITE_FEATURE_FLAGS= +ARG RUST_TOOLCHAIN=1.91.1 # Windows-specific build flags: # - lld linker is ~5x faster than MinGW's default ld for big Rust binaries. @@ -32,6 +33,10 @@ ENV RUSTC_WRAPPER=sccache \ WORKDIR /build COPY . . 
+RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" && \ + rustup target add x86_64-pc-windows-gnu + RUN if [ "$BUILD_TARGET" = "engine" ] && [ "$BUILD_FRONTEND" = "true" ]; then \ export NODE_OPTIONS="--max-old-space-size=8192" && \ export SKIP_NAPI_BUILD=1 && \ diff --git a/docker/builder-base/engine-builder.Dockerfile b/docker/builder-base/engine-builder.Dockerfile index cdea807f0a..59d3913e5d 100644 --- a/docker/builder-base/engine-builder.Dockerfile +++ b/docker/builder-base/engine-builder.Dockerfile @@ -24,8 +24,8 @@ RUN apt-get update -y && \ openssl \ pkg-config \ wget && \ - rustup toolchain install 1.91.0 && \ - rustup default 1.91.0 && \ + rustup toolchain install 1.91.1 && \ + rustup default 1.91.1 && \ curl -fsSL https://deb.nodesource.com/setup_22.x | bash - && \ apt-get install -y --no-install-recommends nodejs && \ corepack enable && \ diff --git a/docker/builder-base/linux-gnu.Dockerfile b/docker/builder-base/linux-gnu.Dockerfile index 7b79ec53c0..3639418c48 100644 --- a/docker/builder-base/linux-gnu.Dockerfile +++ b/docker/builder-base/linux-gnu.Dockerfile @@ -7,7 +7,7 @@ # and the aarch64 cross-compiler. # # Build & push: scripts/docker-builder-base/build-push.sh linux-gnu -FROM rust:1.89.0-bullseye +FROM rust:1.91.1-bullseye # Install base packages. Bullseye ships clang 11; we pull clang 14 from the # official LLVM apt repo (https://apt.llvm.org) for modern bindgen support diff --git a/docker/builder-base/linux-musl.Dockerfile b/docker/builder-base/linux-musl.Dockerfile index f5b234b57c..7ef887e531 100644 --- a/docker/builder-base/linux-musl.Dockerfile +++ b/docker/builder-base/linux-musl.Dockerfile @@ -8,7 +8,7 @@ # Pre-bakes Rust, Node.js 22, napi-rs CLI. 
# # Build & push: scripts/docker-builder-base/build-push.sh linux-musl -FROM rust:1.89.0-bookworm +FROM rust:1.91.1-bookworm RUN apt-get update && apt-get install -y --no-install-recommends \ musl-tools \ diff --git a/docker/builder-base/osxcross.Dockerfile b/docker/builder-base/osxcross.Dockerfile index 2f3d0e9792..940d0276c7 100644 --- a/docker/builder-base/osxcross.Dockerfile +++ b/docker/builder-base/osxcross.Dockerfile @@ -3,7 +3,7 @@ # # Build & push: scripts/docker-builder-base/build-push.sh osxcross # syntax=docker/dockerfile:1.10.0 -FROM rust:1.89.0-bookworm +FROM rust:1.91.1-bookworm RUN apt-get update && apt-get install -y \ git-lfs \ diff --git a/docker/builder-base/windows-mingw.Dockerfile b/docker/builder-base/windows-mingw.Dockerfile index f33934c411..4ded238383 100644 --- a/docker/builder-base/windows-mingw.Dockerfile +++ b/docker/builder-base/windows-mingw.Dockerfile @@ -4,7 +4,7 @@ # Pre-bakes MinGW-w64, Rust target, Node.js 22, napi-rs CLI. # # Build & push: scripts/docker-builder-base/build-push.sh windows-mingw -FROM rust:1.89.0-bookworm +FROM rust:1.91.1-bookworm RUN apt-get update && apt-get install -y --no-install-recommends \ llvm-14-dev \ diff --git a/docker/engine/Dockerfile b/docker/engine/Dockerfile index a3d04fda72..2ebc107fb7 100644 --- a/docker/engine/Dockerfile +++ b/docker/engine/Dockerfile @@ -13,11 +13,15 @@ ARG CARGO_BUILD_MODE=debug ARG VITE_APP_API_URL=__SAME__ ARG VITE_APP_TURNSTILE_SITE_KEY= ARG OVERRIDE_GIT_SHA +ARG RUST_TOOLCHAIN=1.91.1 WORKDIR /app COPY . . +RUN rustup toolchain install "${RUST_TOOLCHAIN}" --profile minimal && \ + rustup default "${RUST_TOOLCHAIN}" + # Build frontend. Use --ignore-scripts because the root postinstall runs # `lefthook install`, which needs a .git directory (excluded by # .dockerignore). 
lefthook is a dev-only git hook manager and has no diff --git a/docs-internal/engine/SQLITE_OPTIMIZATIONS.md b/docs-internal/engine/SQLITE_OPTIMIZATIONS.md index f79fabb8da..145d08e4c7 100644 --- a/docs-internal/engine/SQLITE_OPTIMIZATIONS.md +++ b/docs-internal/engine/SQLITE_OPTIMIZATIONS.md @@ -11,7 +11,10 @@ Range page-read protocol details live in `.agent/specs/sqlite-range-page-read-pr ## Existing Optimizations - Actor startup can preload SQLite VFS pages through `OpenConfig.preload_pgnos`, `OpenConfig.preload_ranges`, and persisted `/PRELOAD_HINTS`; first pages, hint mechanisms, and the preload byte budget are configured through central SQLite optimization flags. -- The VFS keeps an in-memory page cache seeded from `sqlite_startup_data.preloaded_pages`; cache behavior is selected with `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off|target|startup|prefetch|all`, with capacity and protected-cache budget configured separately. +- The VFS keeps a short-lived staging cache for startup preload and read-ahead pages. Direct target pages fetched for `xRead` are not retained in VFS memory. +- Any speculative page consumed by `xRead`, including page 1, is evicted from the VFS staging cache after SQLite receives it. Before the first commit, a lazy page-1 read for a missing database synthesizes the empty SQLite header again instead of retaining page bytes. Staged pages that SQLite never reads expire through `RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS`. +- Commit completion stages dirty pages in a separate TTL cache so SQLite can reread its own writes without turning the VFS into a permanent second pager. +- VFS staging cache behavior is selected with `RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE=off|target|startup|prefetch|all`, with capacity configured separately. The protected-cache budget no longer pins VFS page bytes beyond `xRead`. 
- The VFS has speculative read-ahead selected with `RIVETKIT_SQLITE_OPT_READ_AHEAD_MODE=off|bounded|adaptive`; the default bounded budget is 64 pages, which reduced the cold-read benchmark from 1,249 to 368 VFS `get_pages` calls. - The VFS tracks bounded recent page hints as hot pages plus coalesced scan ranges; `NativeDatabase::snapshot_preload_hints()` exposes the in-memory plan for future flush wiring. - Actor Prometheus metrics expose VFS read counters, fetched bytes, cache hits/misses, and `get_pages` duration at `/gateway//metrics`. diff --git a/engine/packages/depot-client/Cargo.toml b/engine/packages/depot-client/Cargo.toml index d9861bf1c6..221a7164c1 100644 --- a/engine/packages/depot-client/Cargo.toml +++ b/engine/packages/depot-client/Cargo.toml @@ -23,6 +23,7 @@ depot-client-types.workspace = true depot.workspace = true moka = { version = "0.12", default-features = false, features = ["sync"] } parking_lot.workspace = true +scc.workspace = true [dev-dependencies] depot = { workspace = true, features = ["test-faults"] } @@ -31,7 +32,6 @@ gas.workspace = true rivet-config.workspace = true rivet-pools.workspace = true rivet-test-deps.workspace = true -scc.workspace = true sha2.workspace = true tempfile.workspace = true universaldb.workspace = true diff --git a/engine/packages/depot-client/src/database.rs b/engine/packages/depot-client/src/database.rs index b6d621eb39..5310583de7 100644 --- a/engine/packages/depot-client/src/database.rs +++ b/engine/packages/depot-client/src/database.rs @@ -9,7 +9,7 @@ use crate::{ vfs::{ NativeVfsHandle, SqliteTransportHandle, SqliteVfs, SqliteVfsMetrics, SqliteVfsMetricsSnapshot, VfsConfig, VfsPreloadHintSnapshot, - fetch_initial_main_page_for_registration, + fetch_initial_pages_for_registration, }, worker::SqliteWorkerHandle, }; @@ -32,17 +32,18 @@ pub async fn open_database_from_transport( metrics: Option>, ) -> Result { let vfs_name = vfs_name_for_actor_database(&actor_id, generation); - let initial_main_page = 
fetch_initial_main_page_for_registration(transport.clone(), &actor_id) + let config = VfsConfig::default(); + let initial_pages = fetch_initial_pages_for_registration(transport.clone(), &actor_id, &config) .await - .map_err(|e| anyhow!("failed to preload sqlite main page: {e}"))?; + .map_err(|e| anyhow!("failed to preload sqlite pages: {e}"))?; let vfs = Arc::new( - SqliteVfs::register_with_transport_and_initial_page( + SqliteVfs::register_with_transport_and_initial_pages( &vfs_name, transport, actor_id.clone(), rt_handle, - VfsConfig::default(), - initial_main_page, + config, + initial_pages, metrics.clone(), ) .map_err(|e| anyhow!("failed to register sqlite VFS: {e}"))?, diff --git a/engine/packages/depot-client/src/optimization_flags.rs b/engine/packages/depot-client/src/optimization_flags.rs index eb068a17bf..c93398e61f 100644 --- a/engine/packages/depot-client/src/optimization_flags.rs +++ b/engine/packages/depot-client/src/optimization_flags.rs @@ -22,6 +22,7 @@ pub const VFS_PAGE_CACHE_MODE_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MO pub const VFS_PAGE_CACHE_CAPACITY_PAGES_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_CAPACITY_PAGES"; pub const VFS_PROTECTED_CACHE_PAGES_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES"; +pub const VFS_STAGING_CACHE_TTL_MS_ENV: &str = "RIVETKIT_SQLITE_OPT_VFS_STAGING_CACHE_TTL_MS"; pub const DEFAULT_STARTUP_PRELOAD_MAX_BYTES: usize = 1024 * 1024; pub const MAX_STARTUP_PRELOAD_MAX_BYTES: usize = 8 * 1024 * 1024; @@ -31,6 +32,8 @@ pub const DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES: u64 = 50_000; pub const MAX_VFS_PAGE_CACHE_CAPACITY_PAGES: u64 = 500_000; pub const DEFAULT_VFS_PROTECTED_CACHE_PAGES: usize = 512; pub const MAX_VFS_PROTECTED_CACHE_PAGES: usize = 8_192; +pub const DEFAULT_VFS_STAGING_CACHE_TTL_MS: u64 = 30_000; +pub const MAX_VFS_STAGING_CACHE_TTL_MS: u64 = 300_000; #[derive(Debug, Clone, Copy, PartialEq, Eq)] pub enum SqliteReadAheadMode { @@ -102,6 +105,7 @@ pub struct SqliteOptimizationFlags { 
pub vfs_page_cache_mode: SqliteVfsPageCacheMode, pub vfs_page_cache_capacity_pages: u64, pub vfs_protected_cache_pages: usize, + pub vfs_staging_cache_ttl_ms: u64, } impl Default for SqliteOptimizationFlags { @@ -128,6 +132,7 @@ impl Default for SqliteOptimizationFlags { vfs_page_cache_mode: SqliteVfsPageCacheMode::All, vfs_page_cache_capacity_pages: DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES, vfs_protected_cache_pages: DEFAULT_VFS_PROTECTED_CACHE_PAGES, + vfs_staging_cache_ttl_ms: DEFAULT_VFS_STAGING_CACHE_TTL_MS, } } } @@ -196,6 +201,11 @@ impl SqliteOptimizationFlags { DEFAULT_VFS_PROTECTED_CACHE_PAGES, MAX_VFS_PROTECTED_CACHE_PAGES, ), + vfs_staging_cache_ttl_ms: u64_bounded_by_default( + read_env(VFS_STAGING_CACHE_TTL_MS_ENV).as_deref(), + DEFAULT_VFS_STAGING_CACHE_TTL_MS, + MAX_VFS_STAGING_CACHE_TTL_MS, + ), } } } @@ -307,6 +317,7 @@ mod tests { VFS_PAGE_CACHE_MODE_ENV => Some("off".to_string()), VFS_PAGE_CACHE_CAPACITY_PAGES_ENV => Some("0".to_string()), VFS_PROTECTED_CACHE_PAGES_ENV => Some("0".to_string()), + VFS_STAGING_CACHE_TTL_MS_ENV => Some("0".to_string()), _ => None, }); @@ -327,6 +338,7 @@ mod tests { assert_eq!(flags.vfs_page_cache_mode, SqliteVfsPageCacheMode::Off); assert_eq!(flags.vfs_page_cache_capacity_pages, 0); assert_eq!(flags.vfs_protected_cache_pages, 0); + assert_eq!(flags.vfs_staging_cache_ttl_ms, 0); } #[test] @@ -336,6 +348,7 @@ mod tests { STARTUP_PRELOAD_FIRST_PAGE_COUNT_ENV => Some("nope".to_string()), VFS_PAGE_CACHE_CAPACITY_PAGES_ENV => Some("invalid".to_string()), VFS_PROTECTED_CACHE_PAGES_ENV => Some("invalid".to_string()), + VFS_STAGING_CACHE_TTL_MS_ENV => Some("invalid".to_string()), _ => None, }); assert_eq!( @@ -354,6 +367,10 @@ mod tests { invalid.vfs_protected_cache_pages, DEFAULT_VFS_PROTECTED_CACHE_PAGES ); + assert_eq!( + invalid.vfs_staging_cache_ttl_ms, + DEFAULT_VFS_STAGING_CACHE_TTL_MS + ); let clamped = SqliteOptimizationFlags::from_env_reader(|key| match key { STARTUP_PRELOAD_MAX_BYTES_ENV => 
Some((MAX_STARTUP_PRELOAD_MAX_BYTES + 1).to_string()), @@ -364,6 +381,7 @@ mod tests { Some((MAX_VFS_PAGE_CACHE_CAPACITY_PAGES + 1).to_string()) } VFS_PROTECTED_CACHE_PAGES_ENV => Some((MAX_VFS_PROTECTED_CACHE_PAGES + 1).to_string()), + VFS_STAGING_CACHE_TTL_MS_ENV => Some((MAX_VFS_STAGING_CACHE_TTL_MS + 1).to_string()), _ => None, }); assert_eq!( @@ -382,5 +400,9 @@ mod tests { clamped.vfs_protected_cache_pages, MAX_VFS_PROTECTED_CACHE_PAGES ); + assert_eq!( + clamped.vfs_staging_cache_ttl_ms, + MAX_VFS_STAGING_CACHE_TTL_MS + ); } } diff --git a/engine/packages/depot-client/src/vfs.rs b/engine/packages/depot-client/src/vfs.rs index a419739e33..31ae2d051a 100644 --- a/engine/packages/depot-client/src/vfs.rs +++ b/engine/packages/depot-client/src/vfs.rs @@ -16,11 +16,13 @@ use libsqlite3_sys::*; use moka::sync::Cache; use parking_lot::{Mutex, RwLock}; use rivet_envoy_protocol as protocol; +use scc::HashMap as SccHashMap; use tokio::runtime::Handle; -use crate::optimization_flags::{SqliteOptimizationFlags, sqlite_optimization_flags}; +use crate::optimization_flags::{ + SqliteOptimizationFlags, SqliteVfsPageCacheMode, sqlite_optimization_flags, +}; -const DEFAULT_CACHE_CAPACITY_PAGES: u64 = 50_000; const DEFAULT_PREFETCH_DEPTH: usize = 64; const LEGACY_PREFETCH_DEPTH: usize = 16; const DEFAULT_MAX_PREFETCH_BYTES: usize = 256 * 1024; @@ -130,11 +132,19 @@ fn sqlite_now_ms() -> Result { #[derive(Debug, Clone)] pub struct VfsConfig { pub cache_capacity_pages: u64, + pub protected_cache_pages: usize, + pub page_cache_mode: SqliteVfsPageCacheMode, + pub staging_cache_ttl_ms: u64, pub prefetch_depth: usize, pub adaptive_prefetch_depth: usize, pub max_prefetch_bytes: usize, pub adaptive_max_prefetch_bytes: usize, pub max_pages_per_stage: usize, + pub startup_preload_max_bytes: usize, + pub startup_preload_first_pages: bool, + pub startup_preload_first_page_count: u32, + pub preload_hints_on_open: bool, + pub preload_hint_early_pages: bool, pub recent_hint_page_budget: usize, 
pub recent_hint_range_budget: usize, pub cache_hit_predictor_training: bool, @@ -152,8 +162,20 @@ impl Default for VfsConfig { impl VfsConfig { pub fn from_optimization_flags(flags: SqliteOptimizationFlags) -> Self { + let caches_pages = flags.vfs_page_cache_mode.caches_any_pages(); Self { - cache_capacity_pages: DEFAULT_CACHE_CAPACITY_PAGES, + cache_capacity_pages: if caches_pages { + flags.vfs_page_cache_capacity_pages + } else { + 0 + }, + protected_cache_pages: 0, + page_cache_mode: flags.vfs_page_cache_mode, + staging_cache_ttl_ms: if caches_pages { + flags.vfs_staging_cache_ttl_ms + } else { + 0 + }, prefetch_depth: if flags.read_ahead { DEFAULT_PREFETCH_DEPTH } else { @@ -163,12 +185,17 @@ impl VfsConfig { max_prefetch_bytes: DEFAULT_MAX_PREFETCH_BYTES, adaptive_max_prefetch_bytes: DEFAULT_ADAPTIVE_MAX_PREFETCH_BYTES, max_pages_per_stage: DEFAULT_MAX_PAGES_PER_STAGE, - recent_hint_page_budget: if flags.recent_page_hints { + startup_preload_max_bytes: flags.startup_preload_max_bytes, + startup_preload_first_pages: flags.startup_preload_first_pages, + startup_preload_first_page_count: flags.startup_preload_first_page_count, + preload_hints_on_open: flags.preload_hints_on_open, + preload_hint_early_pages: flags.preload_hint_early_pages, + recent_hint_page_budget: if flags.recent_page_hints && flags.preload_hint_hot_pages { DEFAULT_RECENT_HINT_PAGE_BUDGET } else { 0 }, - recent_hint_range_budget: if flags.recent_page_hints { + recent_hint_range_budget: if flags.recent_page_hints && flags.preload_hint_scan_ranges { DEFAULT_RECENT_HINT_RANGE_BUDGET } else { 0 @@ -320,6 +347,8 @@ struct VfsState { db_size_pages: u32, page_size: usize, page_cache: Cache>, + committed_page_cache: Cache>, + protected_page_cache: Arc>>, write_buffer: WriteBuffer, predictor: PrefetchPredictor, read_ahead: AdaptiveReadAhead, @@ -405,6 +434,13 @@ struct AuxFileHandle { delete_on_close: bool, } +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +enum PageCacheInsertKind { + Target, + Prefetch, 
+ Startup, +} + unsafe impl Send for VfsContext {} unsafe impl Sync for VfsContext {} @@ -758,15 +794,14 @@ fn push_coalesced_range(ranges: &mut VecDeque, range: VfsPr impl VfsState { fn new(config: &VfsConfig) -> Self { - let page_cache = Cache::builder() - .max_capacity(config.cache_capacity_pages) - .build(); - page_cache.insert(1, empty_db_page()); - - Self { + let page_cache = build_page_cache(config); + let committed_page_cache = build_page_cache(config); + let mut state = Self { db_size_pages: 1, page_size: DEFAULT_PAGE_SIZE, page_cache, + committed_page_cache, + protected_page_cache: Arc::new(SccHashMap::new()), write_buffer: WriteBuffer::default(), predictor: PrefetchPredictor::default(), read_ahead: AdaptiveReadAhead::default(), @@ -775,20 +810,128 @@ impl VfsState { config.recent_hint_range_budget, ), dead: false, + }; + state.cache_page(config, PageCacheInsertKind::Startup, 1, empty_db_page()); + state + } + + fn cache_page( + &mut self, + config: &VfsConfig, + kind: PageCacheInsertKind, + pgno: u32, + bytes: Vec, + ) { + if !should_cache_page(config, kind, pgno) { + return; + } + cache_page( + config, + &self.page_cache, + &self.protected_page_cache, + kind, + pgno, + bytes, + ); + } + + fn cached_page(&self, config: &VfsConfig, pgno: u32) -> Option> { + if !can_read_cached_page(config, pgno) { + return None; + } + self + .committed_page_cache + .get(&pgno) + .or_else(|| self.protected_page_cache.read_sync(&pgno, |_, bytes| bytes.clone())) + .or_else(|| self.page_cache.get(&pgno)) + } + + fn cache_committed_page(&mut self, config: &VfsConfig, pgno: u32, bytes: Vec) { + if config.staging_cache_ttl_ms == 0 || !config.page_cache_mode.caches_any_pages() { + return; } + self.committed_page_cache.insert(pgno, bytes); } - fn seed_main_page(&mut self, page: Vec) { + fn evict_target_read_pages(&self, pgnos: &[u32]) { + for pgno in pgnos.iter().copied() { + self.page_cache.invalidate(&pgno); + self.protected_page_cache.remove_sync(&pgno); + } + } + + fn 
seed_page( + &mut self, + config: &VfsConfig, + kind: PageCacheInsertKind, + pgno: u32, + page: Vec, + ) { + if pgno == 1 { + self.seed_main_page(config, kind, page); + } else { + self.cache_page(config, kind, pgno, page); + } + } + + fn seed_main_page(&mut self, config: &VfsConfig, kind: PageCacheInsertKind, page: Vec) { if let Some(page_size) = sqlite_header_page_size(&page) { self.page_size = page_size; } if let Some(db_size_pages) = sqlite_header_db_size_pages(&page) { self.db_size_pages = db_size_pages; } - self.page_cache.insert(1, page); + self.cache_page(config, kind, 1, page); + } + + fn invalidate_page_cache(&mut self) { + self.page_cache.invalidate_all(); + self.committed_page_cache.invalidate_all(); + self.protected_page_cache.clear_sync(); + } +} + +fn build_page_cache(config: &VfsConfig) -> Cache> { + let mut page_cache_builder = Cache::builder().max_capacity(config.cache_capacity_pages); + if config.staging_cache_ttl_ms > 0 { + page_cache_builder = + page_cache_builder.time_to_live(Duration::from_millis(config.staging_cache_ttl_ms)); + } + page_cache_builder.build() +} + +fn cache_page( + config: &VfsConfig, + page_cache: &Cache>, + _protected_page_cache: &SccHashMap>, + kind: PageCacheInsertKind, + pgno: u32, + bytes: Vec, +) { + if !should_cache_page(config, kind, pgno) { + return; + } + page_cache.insert(pgno, bytes); +} + +fn should_cache_page(config: &VfsConfig, kind: PageCacheInsertKind, pgno: u32) -> bool { + match kind { + PageCacheInsertKind::Target => false, + PageCacheInsertKind::Prefetch => { + config.staging_cache_ttl_ms > 0 && config.page_cache_mode.caches_prefetched_pages() + } + PageCacheInsertKind::Startup => { + pgno == 1 + || (config.staging_cache_ttl_ms > 0 + && config.page_cache_mode.caches_startup_preloaded_pages()) + } } } +fn can_read_cached_page(config: &VfsConfig, pgno: u32) -> bool { + pgno == 1 || config.page_cache_mode.caches_any_pages() +} + impl VfsContext { fn new( actor_id: String, @@ -796,12 +939,12 @@ impl 
VfsContext { transport: SqliteTransportHandle, config: VfsConfig, io_methods: sqlite3_io_methods, - initial_main_page: Option>, + initial_pages: Vec<(u32, Vec)>, metrics: Option>, ) -> std::result::Result { let mut state = VfsState::new(&config); - if let Some(page) = initial_main_page { - state.seed_main_page(page); + for (pgno, page) in initial_pages { + state.seed_page(&config, PageCacheInsertKind::Startup, pgno, page); } Ok(Self { @@ -887,8 +1030,15 @@ impl VfsContext { state_update_ns: self.commit_state_update_ns.load(Ordering::Relaxed), total_ns: self.commit_duration_ns_total.load(Ordering::Relaxed), commit_count: self.commit_total.load(Ordering::Relaxed), - page_cache_entries: state.page_cache.entry_count(), - page_cache_weighted_size: state.page_cache.weighted_size(), + page_cache_entries: state + .page_cache + .entry_count() + .saturating_add(state.committed_page_cache.entry_count()) + .saturating_add(state.protected_page_cache.len() as u64), + page_cache_weighted_size: state + .page_cache + .weighted_size() + .saturating_add(state.protected_page_cache.len() as u64), page_cache_capacity_pages: self.config.cache_capacity_pages, write_buffer_dirty_pages: state.write_buffer.dirty.len() as u64, db_size_pages: state.db_size_pages as u64, @@ -972,7 +1122,24 @@ impl VfsContext { if !self.config.recent_page_hints { return VfsPreloadHintSnapshot::default(); } - self.state.read().recent_pages.snapshot() + let state = self.state.read(); + let mut snapshot = state.recent_pages.snapshot(); + if self.config.preload_hint_early_pages { + let mut existing_pgnos = snapshot.pgnos.iter().copied().collect::>(); + let early_page_count = self + .config + .startup_preload_first_page_count + .min(state.db_size_pages); + for pgno in 1..=early_page_count { + if !snapshot.ranges.iter().any(|range| range.contains(pgno)) + && existing_pgnos.insert(pgno) + { + snapshot.pgnos.push(pgno); + } + } + snapshot.pgnos.sort_unstable(); + } + snapshot } fn resolve_pages( @@ -1006,7 +1173,7 @@ 
impl VfsContext { resolved.insert(pgno, Some(bytes.clone())); continue; } - if let Some(bytes) = state.page_cache.get(&pgno) { + if let Some(bytes) = state.cached_page(&self.config, pgno) { resolved.insert(pgno, Some(bytes)); continue; } @@ -1145,7 +1312,11 @@ impl VfsContext { match response { protocol::SqliteGetPagesResponse::SqliteGetPagesOk(ok) => { - let page_cache = { self.state.read().page_cache.clone() }; + let missing_pages = missing.iter().copied().collect::>(); + let (page_cache, protected_page_cache) = { + let state = self.state.read(); + (state.page_cache.clone(), state.protected_page_cache.clone()) + }; #[cfg(debug_assertions)] let mut returned_pgnos = HashSet::new(); #[cfg(debug_assertions)] @@ -1163,10 +1334,31 @@ impl VfsContext { } } } - if let Some(bytes) = &fetched.bytes { - page_cache.insert(fetched.pgno, bytes.clone()); + let bytes = if fetched.bytes.is_none() + && self.commit_total.load(Relaxed) == 0 + && missing_pages.contains(&fetched.pgno) + && fetched.pgno == 1 + { + Some(empty_db_page()) + } else { + fetched.bytes + }; + if let Some(bytes) = &bytes { + let kind = if missing_pages.contains(&fetched.pgno) { + PageCacheInsertKind::Target + } else { + PageCacheInsertKind::Prefetch + }; + cache_page( + &self.config, + &page_cache, + &protected_page_cache, + kind, + fetched.pgno, + bytes.clone(), + ); } - resolved.insert(fetched.pgno, fetched.bytes); + resolved.insert(fetched.pgno, bytes); } #[cfg(debug_assertions)] { @@ -1203,6 +1395,20 @@ impl VfsContext { Ok(resolved) } protocol::SqliteGetPagesResponse::SqliteErrorResponse(error) => { + if self.commit_total.load(Relaxed) == 0 + && missing.contains(&1) + && is_initial_main_page_missing(&error.message) + { + for pgno in missing { + let bytes = if pgno == 1 { + Some(empty_db_page()) + } else { + None + }; + resolved.entry(pgno).or_insert(bytes); + } + return Ok(resolved); + } Err(GetPagesError::Other(error.message)) } } @@ -1303,9 +1509,7 @@ impl VfsContext { let mut state = 
self.state.write();
 		state.db_size_pages = request.new_db_size_pages;
 		for dirty_page in &request.dirty_pages {
-			state
-				.page_cache
-				.insert(dirty_page.pgno, dirty_page.bytes.clone());
+			state.cache_committed_page(&self.config, dirty_page.pgno, dirty_page.bytes.clone());
 		}
 		state.write_buffer.dirty.clear();
 		let state_update_ns = state_update_start.elapsed().as_nanos() as u64;
@@ -1413,9 +1617,7 @@ impl VfsContext {
 		let mut state = self.state.write();
 		state.db_size_pages = request.new_db_size_pages;
 		for dirty_page in &request.dirty_pages {
-			state
-				.page_cache
-				.insert(dirty_page.pgno, dirty_page.bytes.clone());
+			state.cache_committed_page(&self.config, dirty_page.pgno, dirty_page.bytes.clone());
 		}
 		state.write_buffer.dirty.clear();
 		state.write_buffer.in_atomic_write = false;
@@ -1438,7 +1640,7 @@ impl VfsContext {
 				.write_buffer
 				.dirty
 				.retain(|pgno, _| *pgno <= truncated_pages);
-			state.page_cache.invalidate_all();
+			state.invalidate_page_cache();
 		}
 	}
 
@@ -1519,15 +1721,52 @@ pub(crate) async fn fetch_initial_main_page_for_registration(
 	fetch_initial_main_page(transport, actor_id.to_string()).await
 }
 
+pub(crate) async fn fetch_initial_pages_for_registration(
+	transport: SqliteTransportHandle,
+	actor_id: &str,
+	config: &VfsConfig,
+) -> std::result::Result)>, String> {
+	if !config.startup_preload_first_pages
+		|| !config.page_cache_mode.caches_startup_preloaded_pages()
+		|| config.startup_preload_max_bytes < DEFAULT_PAGE_SIZE
+	{
+		return fetch_initial_main_page_for_registration(transport, actor_id)
+			.await
+			.map(|page| page.into_iter().map(|page| (1, page)).collect());
+	}
+
+	let page_count_from_bytes = config.startup_preload_max_bytes / DEFAULT_PAGE_SIZE;
+	let page_count = config
+		.startup_preload_first_page_count
+		.min(page_count_from_bytes as u32)
+		.max(1);
+	fetch_initial_pages(transport, actor_id.to_string(), page_count).await
+}
+
 async fn fetch_initial_main_page(
 	transport: SqliteTransportHandle,
 	actor_id: String,
 ) -> std::result::Result>, 
String> {
+	fetch_initial_pages(transport, actor_id, 1)
+		.await
+		.map(|pages| {
+			pages
+				.into_iter()
+				.find(|(pgno, _)| *pgno == 1)
+				.map(|(_, bytes)| bytes)
+		})
+}
+
+async fn fetch_initial_pages(
+	transport: SqliteTransportHandle,
+	actor_id: String,
+	page_count: u32,
+) -> std::result::Result)>, String> {
 	let request_actor_id = actor_id.clone();
 	let response = transport
 		.get_pages(protocol::SqliteGetPagesRequest {
 			actor_id: request_actor_id,
-			pgnos: vec![1],
+			pgnos: (1..=page_count).collect(),
 			expected_generation: None,
 			expected_head_txid: None,
 		})
@@ -1537,8 +1776,8 @@ async fn fetch_initial_main_page(
 		Ok(protocol::SqliteGetPagesResponse::SqliteGetPagesOk(ok)) => Ok(ok
 			.pages
 			.into_iter()
-			.find(|page| page.pgno == 1)
-			.and_then(|page| page.bytes)),
+			.filter_map(|page| page.bytes.map(|bytes| (page.pgno, bytes)))
+			.collect()),
 		Ok(protocol::SqliteGetPagesResponse::SqliteErrorResponse(error)) => {
 			if !is_initial_main_page_missing(&error.message) {
 				return Err(format!(
@@ -1551,7 +1790,7 @@ async fn fetch_initial_main_page(
 				error = %error.message,
 				"sqlite initial page fetch did not find persisted data"
 			);
-			Ok(None)
+			Ok(Vec::new())
 		}
 		Err(err) => Err(format!("sqlite initial page fetch failed: {err}")),
 	}
@@ -1936,6 +2175,12 @@ unsafe extern "C" fn io_read(
 	let resolved = match ctx.resolve_pages(&requested_pages, true) {
 		Ok(pages) => pages,
 		Err(GetPagesError::Other(message)) => {
+			tracing::error!(
+				actor_id = %ctx.actor_id,
+				requested_pages = ?requested_pages,
+				error = %message,
+				"sqlite xRead failed to resolve pages"
+			);
 			ctx.mark_dead(message);
 			return SQLITE_IOERR_READ;
 		}
@@ -1966,7 +2211,7 @@ unsafe extern "C" fn io_read(
 	}
 
 	buf.fill(0);
-	for pgno in requested_pages {
+	for pgno in requested_pages.iter().copied() {
 		let Some(Some(bytes)) = resolved.get(&pgno) else {
 			continue;
 		};
@@ -1982,6 +2227,7 @@ unsafe extern "C" fn io_read(
 		buf[dest_offset..dest_offset + copy_len]
 			.copy_from_slice(&bytes[page_offset..page_offset + copy_len]);
 	}
+	
ctx.state.read().evict_target_read_pages(&requested_pages);
 
 	if i_offset as usize + i_amt as usize > file_size {
 		return SQLITE_IOERR_SHORT_READ;
@@ -2510,11 +2756,18 @@ impl SqliteVfs {
 		config: VfsConfig,
 		metrics: Option>,
 	) -> std::result::Result {
-		Self::register_with_transport_and_initial_page(
-			name, transport, actor_id, runtime, config, None, metrics,
+		Self::register_with_transport_and_initial_pages(
+			name,
+			transport,
+			actor_id,
+			runtime,
+			config,
+			Vec::new(),
+			metrics,
 		)
 	}
 
+	#[cfg(test)]
 	pub(crate) fn register_with_transport_and_initial_page(
 		name: &str,
 		transport: SqliteTransportHandle,
@@ -2523,6 +2776,30 @@ impl SqliteVfs {
 		config: VfsConfig,
 		initial_main_page: Option>,
 		metrics: Option>,
+	) -> std::result::Result {
+		let initial_pages = initial_main_page
+			.into_iter()
+			.map(|page| (1, page))
+			.collect();
+		Self::register_with_transport_and_initial_pages(
+			name,
+			transport,
+			actor_id,
+			runtime,
+			config,
+			initial_pages,
+			metrics,
+		)
+	}
+
+	pub(crate) fn register_with_transport_and_initial_pages(
+		name: &str,
+		transport: SqliteTransportHandle,
+		actor_id: String,
+		runtime: Handle,
+		config: VfsConfig,
+		initial_pages: Vec<(u32, Vec)>,
+		metrics: Option>,
 	) -> std::result::Result {
 		let mut io_methods: sqlite3_io_methods = unsafe { std::mem::zeroed() };
 		io_methods.iVersion = 1;
@@ -2545,7 +2822,7 @@ impl SqliteVfs {
 			transport,
 			config,
 			io_methods,
-			initial_main_page,
+			initial_pages,
 			metrics,
 		)?);
 		let ctx_ptr = (&mut *ctx) as *mut VfsContext;
diff --git a/engine/packages/depot-client/tests/inline/fault/scenario.rs b/engine/packages/depot-client/tests/inline/fault/scenario.rs
index 251b134147..08b470eeb5 100644
--- a/engine/packages/depot-client/tests/inline/fault/scenario.rs
+++ b/engine/packages/depot-client/tests/inline/fault/scenario.rs
@@ -642,15 +642,16 @@ impl FaultScenarioCtx {
 
 	pub(crate) async fn seed_page_as_cold_ref_for_harness_test(&self, pgno: u32) -> Result<()> {
 		let dirty_pages = self.with_database_blocking(|db| {
-			let 
state = db._vfs.ctx().state.read();
+			let ctx = db._vfs.ctx();
+			let state = ctx.state.read();
 			(1..=state.db_size_pages)
 				.filter(|candidate_pgno| {
 					*candidate_pgno / depot::keys::SHARD_SIZE == pgno / depot::keys::SHARD_SIZE
 				})
 				.map(|candidate_pgno| {
-					let bytes = state.page_cache.get(&candidate_pgno).with_context(|| {
+					let bytes = state.cached_page(&ctx.config, candidate_pgno).with_context(|| {
 						format!(
-							"page {candidate_pgno} should be present in strict VFS cache before cold-ref seed"
+							"page {candidate_pgno} should be present in VFS cache before cold-ref seed"
 						)
 					})?;
 					Ok(DirtyPage {
diff --git a/engine/packages/depot-client/tests/inline/fault/verify.rs b/engine/packages/depot-client/tests/inline/fault/verify.rs
index eef4db0625..7dd791e1d7 100644
--- a/engine/packages/depot-client/tests/inline/fault/verify.rs
+++ b/engine/packages/depot-client/tests/inline/fault/verify.rs
@@ -151,7 +151,10 @@ impl<'a> InvariantScan<'a> {
 		}
 
 		let Some(current) = resolved else {
-			self.violate(format!("database pointer for {} is missing", self.database_id));
+			self.violate(format!(
+				"database pointer for {} is missing",
+				self.database_id
+			));
 			return Ok(None);
 		};
 		if let Some(scanned_current) = scanned_current
diff --git a/engine/packages/depot-client/tests/inline/vfs.rs b/engine/packages/depot-client/tests/inline/vfs.rs
index 679f114d4c..29c3409eb6 100644
--- a/engine/packages/depot-client/tests/inline/vfs.rs
+++ b/engine/packages/depot-client/tests/inline/vfs.rs
@@ -11,12 +11,19 @@ use std::sync::{Arc, Barrier, mpsc};
 use std::thread;
 use std::time::Duration;
 
+use async_trait::async_trait;
 use depot::cold_tier::FilesystemColdTier;
 use parking_lot::Mutex as SyncMutex;
+use rivet_envoy_protocol as protocol;
 use tempfile::TempDir;
 use tokio::runtime::Builder;
 use tokio::sync::OnceCell;
 
+use crate::optimization_flags::{
+	DEFAULT_STARTUP_PRELOAD_MAX_BYTES, DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES,
+	DEFAULT_VFS_PROTECTED_CACHE_PAGES, DEFAULT_VFS_STAGING_CACHE_TTL_MS, 
SqliteOptimizationFlags, + SqliteReadAheadMode, SqliteVfsPageCacheMode, +}; use crate::query::{BindParam, ColumnValue}; use crate::vfs::SqliteVfsMetrics; @@ -24,6 +31,242 @@ use super::*; static TEST_ID: AtomicU64 = AtomicU64::new(1); +#[test] +fn vfs_config_wires_optimization_flags() { + let flags = SqliteOptimizationFlags { + read_ahead_mode: SqliteReadAheadMode::Off, + read_ahead: false, + adaptive_read_ahead: false, + recent_page_hints: true, + cache_hit_predictor_training: true, + preload_hint_flush: true, + startup_preload_max_bytes: DEFAULT_STARTUP_PRELOAD_MAX_BYTES / 2, + startup_preload_first_pages: false, + startup_preload_first_page_count: 7, + preload_hints_on_open: false, + preload_hint_hot_pages: false, + preload_hint_early_pages: false, + preload_hint_scan_ranges: true, + dedup_get_pages_meta: true, + cache_get_pages_validation: true, + range_reads: true, + batch_chunk_reads: true, + decoded_ltx_cache: true, + vfs_page_cache_mode: SqliteVfsPageCacheMode::Startup, + vfs_page_cache_capacity_pages: DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES / 2, + vfs_protected_cache_pages: DEFAULT_VFS_PROTECTED_CACHE_PAGES / 2, + vfs_staging_cache_ttl_ms: DEFAULT_VFS_STAGING_CACHE_TTL_MS / 2, + }; + + let config = VfsConfig::from_optimization_flags(flags); + assert_eq!(config.page_cache_mode, SqliteVfsPageCacheMode::Startup); + assert_eq!( + config.cache_capacity_pages, + DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES / 2 + ); + assert_eq!( + config.protected_cache_pages, + 0 + ); + assert_eq!( + config.staging_cache_ttl_ms, + DEFAULT_VFS_STAGING_CACHE_TTL_MS / 2 + ); + assert_eq!(config.prefetch_depth, 16); + assert!(!config.adaptive_read_ahead); + assert_eq!( + config.startup_preload_max_bytes, + DEFAULT_STARTUP_PRELOAD_MAX_BYTES / 2 + ); + assert!(!config.startup_preload_first_pages); + assert_eq!(config.startup_preload_first_page_count, 7); + assert!(!config.preload_hints_on_open); + assert!(!config.preload_hint_early_pages); + assert_eq!(config.recent_hint_page_budget, 0); + 
assert!(config.recent_hint_range_budget > 0); +} + +#[derive(Default)] +struct RecordingInitialPagesTransport { + requested_pgnos: SyncMutex>, +} + +#[async_trait] +impl SqliteTransport for RecordingInitialPagesTransport { + async fn get_pages( + &self, + request: protocol::SqliteGetPagesRequest, + ) -> anyhow::Result { + *self.requested_pgnos.lock() = request.pgnos.clone(); + Ok(protocol::SqliteGetPagesResponse::SqliteGetPagesOk( + protocol::SqliteGetPagesOk { + pages: request + .pgnos + .into_iter() + .map(|pgno| protocol::SqliteFetchedPage { + pgno, + bytes: Some(vec![pgno as u8; DEFAULT_PAGE_SIZE]), + }) + .collect(), + }, + )) + } + + async fn commit( + &self, + _request: protocol::SqliteCommitRequest, + ) -> anyhow::Result { + anyhow::bail!("initial-page preload test does not commit") + } +} + +struct MissingDbTransport; + +#[async_trait] +impl SqliteTransport for MissingDbTransport { + async fn get_pages( + &self, + _request: protocol::SqliteGetPagesRequest, + ) -> anyhow::Result { + Ok(protocol::SqliteGetPagesResponse::SqliteErrorResponse( + protocol::SqliteErrorResponse { + message: "sqlite database was not found in this bucket branch".to_string(), + }, + )) + } + + async fn commit( + &self, + _request: protocol::SqliteCommitRequest, + ) -> anyhow::Result { + anyhow::bail!("missing-db transport test does not commit") + } +} + +#[test] +fn startup_initial_pages_do_not_require_preload_hints_on_open() { + let runtime = direct_runtime(); + let transport = Arc::new(RecordingInitialPagesTransport::default()); + let config = VfsConfig { + startup_preload_first_pages: true, + startup_preload_first_page_count: 4, + startup_preload_max_bytes: DEFAULT_PAGE_SIZE * 4, + preload_hints_on_open: false, + page_cache_mode: SqliteVfsPageCacheMode::Startup, + ..VfsConfig::default() + }; + + let pages = runtime + .block_on(fetch_initial_pages_for_registration( + transport.clone(), + "startup-preload-actor", + &config, + )) + .expect("initial pages should load"); + + let 
loaded_pgnos = pages.iter().map(|(pgno, _)| *pgno).collect::>(); + assert_eq!(*transport.requested_pgnos.lock(), vec![1, 2, 3, 4]); + assert_eq!(loaded_pgnos, vec![1, 2, 3, 4]); +} + +#[test] +fn vfs_staging_cache_retains_only_speculative_pages() { + let config = VfsConfig { + page_cache_mode: SqliteVfsPageCacheMode::All, + staging_cache_ttl_ms: DEFAULT_VFS_STAGING_CACHE_TTL_MS, + ..VfsConfig::default() + }; + let mut state = VfsState::new(&config); + + state.cache_page( + &config, + PageCacheInsertKind::Target, + 2, + vec![2; DEFAULT_PAGE_SIZE], + ); + assert!(state.cached_page(&config, 2).is_none()); + + state.cache_page( + &config, + PageCacheInsertKind::Prefetch, + 3, + vec![3; DEFAULT_PAGE_SIZE], + ); + state.cache_page( + &config, + PageCacheInsertKind::Startup, + 4, + vec![4; DEFAULT_PAGE_SIZE], + ); + assert!(state.cached_page(&config, 3).is_some()); + assert!(state.cached_page(&config, 4).is_some()); + assert!( + state + .protected_page_cache + .read_sync(&3, |_, _| ()) + .is_none() + ); + + state.evict_target_read_pages(&[1, 3, 4]); + assert!(state.cached_page(&config, 1).is_none()); + assert!(state.cached_page(&config, 3).is_none()); + assert!(state.cached_page(&config, 4).is_none()); +} + +#[test] +fn vfs_staging_cache_ttl_zero_disables_speculative_retention() { + let config = VfsConfig { + page_cache_mode: SqliteVfsPageCacheMode::All, + staging_cache_ttl_ms: 0, + ..VfsConfig::default() + }; + let mut state = VfsState::new(&config); + + state.cache_page( + &config, + PageCacheInsertKind::Prefetch, + 2, + vec![2; DEFAULT_PAGE_SIZE], + ); + state.cache_page( + &config, + PageCacheInsertKind::Startup, + 3, + vec![3; DEFAULT_PAGE_SIZE], + ); + assert!(state.cached_page(&config, 1).is_some()); + assert!(state.cached_page(&config, 2).is_none()); + assert!(state.cached_page(&config, 3).is_none()); + + state.evict_target_read_pages(&[1]); + assert!(state.cached_page(&config, 1).is_none()); +} + +#[test] +fn 
evicted_empty_page_one_can_be_synthesized_before_first_commit() { + let runtime = direct_runtime(); + let config = VfsConfig::default(); + let ctx = VfsContext::new( + next_test_name("missing-db-actor"), + runtime.handle().clone(), + Arc::new(MissingDbTransport), + config.clone(), + unsafe { std::mem::zeroed() }, + Vec::new(), + None, + ) + .expect("vfs context should build"); + + ctx.state.read().evict_target_read_pages(&[1]); + assert!(ctx.state.read().cached_page(&config, 1).is_none()); + + let resolved = ctx + .resolve_pages(&[1], true) + .expect("missing empty database should synthesize page 1"); + assert_eq!(resolved.get(&1), Some(&Some(empty_db_page()))); + assert!(ctx.state.read().cached_page(&config, 1).is_none()); +} + fn next_test_name(prefix: &str) -> String { let id = TEST_ID.fetch_add(1, Ordering::Relaxed); format!("{prefix}-{id}") @@ -121,7 +364,7 @@ impl DirectEngineHarness { Arc::new(DirectDepotTransport::new(engine)), VfsConfig::default(), unsafe { std::mem::zeroed() }, - None, + Vec::new(), None, ) .expect("vfs context should build") @@ -2302,7 +2545,7 @@ fn concurrent_reader_during_commit_atomic_observes_consistent_snapshot() { transport, VfsConfig::default(), unsafe { std::mem::zeroed() }, - None, + Vec::new(), None, ) .expect("vfs context should build"); @@ -3233,7 +3476,7 @@ fn resolve_pages_surfaces_read_path_error_response() { transport, VfsConfig::default(), unsafe { std::mem::zeroed() }, - None, + Vec::new(), None, ) .expect("vfs context should build"); diff --git a/engine/packages/epoxy/src/workflows/replica/setup.rs b/engine/packages/epoxy/src/workflows/replica/setup.rs index 2b78965a50..5c55178740 100644 --- a/engine/packages/epoxy/src/workflows/replica/setup.rs +++ b/engine/packages/epoxy/src/workflows/replica/setup.rs @@ -1,5 +1,5 @@ -use anyhow::{Context, Result}; -use epoxy_protocol::protocol::{self, ReplicaId}; +use anyhow::Result; +use epoxy_protocol::protocol; use futures_util::FutureExt; use gas::prelude::*; use 
rivet_api_builder::ApiCtx;
@@ -100,84 +100,12 @@ struct CatchUpReplicaOutput {
 
 #[activity(CatchUpReplica)]
 async fn catch_up_replica(
-	ctx: &ActivityCtx,
-	input: &CatchUpReplicaInput,
+	_ctx: &ActivityCtx,
+	_input: &CatchUpReplicaInput,
 ) -> Result {
-	// TODO: No-op for now
-	return Ok(CatchUpReplicaOutput {
+	Ok(CatchUpReplicaOutput {
 		last_versionstamp: None,
 		applied_entries: 0,
-	});
-
-	let replica_id = ctx.config().epoxy_replica_id();
-	let config: protocol::ClusterConfig = input.config.clone().into();
-	let api_ctx = ApiCtx::new_from_activity(ctx)?;
-	let source_replica_id = config
-		.replicas
-		.iter()
-		.find(|replica| {
-			replica.replica_id != replica_id
-				&& matches!(replica.status, protocol::ReplicaStatus::Active)
-		})
-		.map(|replica| replica.replica_id);
-
-	if source_replica_id.is_none() {
-		tracing::info!(
-			%replica_id,
-			"skipping changelog catch-up because the cluster has no active source replica yet"
-		);
-		return Ok(CatchUpReplicaOutput {
-			last_versionstamp: None,
-			applied_entries: 0,
-		});
-	}
-	let source_replica_id = source_replica_id.unwrap();
-
-	// Pre-cutover committed values are readable via local dual-read fallback immediately. They only
-	// become available to future learners after the background backfill populates the v2 changelog.
-	let response = read_changelog_page(
-		&api_ctx,
-		&config,
-		replica_id,
-		source_replica_id,
-		input.after_versionstamp.clone(),
-	)
-	.await?;
-
-	if response.entries.is_empty() {
-		return Ok(CatchUpReplicaOutput {
-			last_versionstamp: None,
-			applied_entries: 0,
-		});
-	}
-
-	let applied_entries = response.entries.len();
-	let last_versionstamp = response.last_versionstamp.clone();
-	for entry in response.entries {
-		ctx.udb()?
-			.run(|tx| {
-				let entry = entry.clone();
-				async move {
-					crate::replica::changelog::apply_entry(
-						&*tx, replica_id, entry, true, false, false,
-					)
-					.await
-				}
-			})
-			.custom_instrument(tracing::info_span!("apply_changelog_entry_tx"))
-			.await?;
-	}
-
-	tracing::info!(
-		%replica_id,
-		%source_replica_id,
-		applied_entries,
-		"applied changelog catch-up page"
-	);
-
-	Ok(CatchUpReplicaOutput {
-		last_versionstamp: Some(last_versionstamp),
-		applied_entries,
 	})
 }
 
@@ -213,23 +141,3 @@ async fn notify_coordinator_replica_status(
 
 	Ok(())
 }
-
-#[tracing::instrument(skip_all, fields(%from_replica_id, %source_replica_id))]
-async fn read_changelog_page(
-	api_ctx: &ApiCtx,
-	config: &protocol::ClusterConfig,
-	from_replica_id: ReplicaId,
-	source_replica_id: ReplicaId,
-	after_versionstamp: Option>,
-) -> Result {
-	crate::http_client::read_changelog(
-		api_ctx,
-		config,
-		from_replica_id,
-		source_replica_id,
-		after_versionstamp,
-		crate::consts::CHANGELOG_READ_COUNT,
-	)
-	.await
-	.with_context(|| format!("failed reading changelog page from replica {source_replica_id}"))
-}
diff --git a/engine/packages/guard-core/src/utils.rs b/engine/packages/guard-core/src/utils.rs
index 5e1090d3df..348da297e5 100644
--- a/engine/packages/guard-core/src/utils.rs
+++ b/engine/packages/guard-core/src/utils.rs
@@ -226,7 +226,9 @@ pub(crate) fn should_retry_request(res: &Result>) -> bool
 	}
 }
 
-// Determine if a response should trigger a retry: 503 + x-rivet-error
+// Determine if a response should trigger a retry. Guard-specific actor startup
+// failures, including guard.actor_ready_timeout, are signaled as 503 with
+// x-rivet-error and should be retried against a freshly resolved target.
 pub(crate) fn should_retry_request_inner(status: StatusCode, headers: &hyper::HeaderMap) -> bool {
 	status == StatusCode::SERVICE_UNAVAILABLE && headers.contains_key(X_RIVET_ERROR)
 }
@@ -294,3 +296,7 @@ pub(crate) fn to_hyper_close(frame: Option) -> hyper_tungstenite::tu
 		))
 	}
 }
+
+#[cfg(test)]
+#[path = "utils/tests.rs"]
+mod tests;
diff --git a/engine/packages/guard-core/src/utils/tests.rs b/engine/packages/guard-core/src/utils/tests.rs
new file mode 100644
index 0000000000..17addf736a
--- /dev/null
+++ b/engine/packages/guard-core/src/utils/tests.rs
@@ -0,0 +1,35 @@
+use hyper::header::HeaderValue;
+
+use super::*;
+
+#[test]
+fn retries_guard_actor_ready_timeout_response() {
+	let mut headers = hyper::HeaderMap::new();
+	headers.insert(
+		X_RIVET_ERROR,
+		HeaderValue::from_static("guard.actor_ready_timeout"),
+	);
+
+	assert!(should_retry_request_inner(
+		StatusCode::SERVICE_UNAVAILABLE,
+		&headers,
+	));
+}
+
+#[test]
+fn skips_service_unavailable_without_rivet_error_header() {
+	let headers = hyper::HeaderMap::new();
+
+	assert!(!should_retry_request_inner(
+		StatusCode::SERVICE_UNAVAILABLE,
+		&headers,
+	));
+}
+
+#[test]
+fn skips_non_service_unavailable_with_rivet_error_header() {
+	let mut headers = hyper::HeaderMap::new();
+	headers.insert(X_RIVET_ERROR, HeaderValue::from_static("guard.no_route"));
+
+	assert!(!should_retry_request_inner(StatusCode::NOT_FOUND, &headers));
+}
diff --git a/engine/packages/guard/src/routing/actor_path.rs b/engine/packages/guard/src/routing/actor_path.rs
index 9d9c8f6c66..d7b3aea443 100644
--- a/engine/packages/guard/src/routing/actor_path.rs
+++ b/engine/packages/guard/src/routing/actor_path.rs
@@ -229,10 +229,13 @@ fn extract_rvt_params(rvt_params: &[(String, String)]) -> Result {
 			}
 			.build());
 		}
-		map.insert(
-			stripped.to_string(),
-			serde_json::Value::String(value.clone()),
-		);
+		let value = match stripped {
+			"bypass_connectable" => parse_query_bool(value)
+				.map(serde_json::Value::Bool)
+				.unwrap_or_else(|| 
serde_json::Value::String(value.clone())),
+			_ => serde_json::Value::String(value.clone()),
+		};
+		map.insert(stripped.to_string(), value);
 	}
 
 	serde_json::from_value(serde_json::Value::Object(map)).map_err(|e| {
@@ -243,6 +246,14 @@ fn extract_rvt_params(rvt_params: &[(String, String)]) -> Result {
 	})
 }
 
+fn parse_query_bool(value: &str) -> Option {
+	match value {
+		"true" | "1" => Some(true),
+		"false" | "0" => Some(false),
+		_ => None,
+	}
+}
+
 /// Split a comma-separated key string into components.
 /// Missing or empty key yields an empty vec.
 fn split_key(raw: Option<&str>) -> Vec {
diff --git a/engine/packages/universaldb/tests/rocksdb.rs b/engine/packages/universaldb/tests/rocksdb.rs
index 46cdcc77e6..5a9e8401e6 100644
--- a/engine/packages/universaldb/tests/rocksdb.rs
+++ b/engine/packages/universaldb/tests/rocksdb.rs
@@ -159,13 +159,11 @@ async fn rocksdb_udb() {
 			let mut chunk = Vec::with_capacity(100);
 
 			loop {
-				let empty = if chunk.len() >= 100 {
-					false
-				} else if let Some(entry) = stream.try_next().await? {
-					chunk.push(entry);
-					continue;
-				} else {
-					true
+				if chunk.len() < 100 {
+					if let Some(entry) = stream.try_next().await? 
{
+						chunk.push(entry);
+						continue;
+					}
 				};
 
 				let entry = match chunk.choose_weighted(&mut rand::thread_rng(), |_| 1)
@@ -184,12 +182,6 @@ async fn rocksdb_udb() {
 				tx.clear(entry.key());
 
 				return Ok(Some(entry.key().to_vec()));
-
-				if empty {
-					break;
-				} else {
-					chunk.clear();
-				}
 			}
 
 			Ok(None)
diff --git a/engine/sdks/rust/envoy-client/tests/command_dedup.rs b/engine/sdks/rust/envoy-client/tests/command_dedup.rs
index 8f3650331d..53ece8a5d7 100644
--- a/engine/sdks/rust/envoy-client/tests/command_dedup.rs
+++ b/engine/sdks/rust/envoy-client/tests/command_dedup.rs
@@ -91,6 +91,7 @@ fn new_envoy_context() -> EnvoyContext {
 		envoy_key: "test-envoy".to_string(),
 		envoy_tx,
 		actors: Arc::new(std::sync::Mutex::new(HashMap::new())),
+		actors_notify: Arc::new(tokio::sync::Notify::new()),
 		live_tunnel_requests: Arc::new(std::sync::Mutex::new(HashMap::new())),
 		pending_hibernation_restores: Arc::new(std::sync::Mutex::new(HashMap::new())),
 		ws_tx: Arc::new(tokio::sync::Mutex::new(
diff --git a/examples/kitchen-sink/frontend/App.tsx b/examples/kitchen-sink/frontend/App.tsx
index f21cf21401..9bc910a3bc 100644
--- a/examples/kitchen-sink/frontend/App.tsx
+++ b/examples/kitchen-sink/frontend/App.tsx
@@ -1,4 +1,5 @@
 import { createRivetKit } from "@rivetkit/react";
+import { createClient } from "rivetkit/client";
 import mermaid from "mermaid";
 import { Highlight, themes } from "prism-react-renderer";
 import {
@@ -79,7 +80,8 @@ function MermaidDiagram({ chart }: { chart: string }) {
 }
 
 const rivetEndpoint =
-	import.meta.env.VITE_RIVET_ENDPOINT ?? "http://localhost:6420";
+	import.meta.env.VITE_RIVET_ENDPOINT ??
+ `${globalThis.location.origin}/api/rivet`; const { useActor } = createRivetKit(rivetEndpoint); @@ -242,6 +244,9 @@ function DemoPanel({ page }: { page: PageConfig }) { if (page.demo === "diagram") { return ; } + if (page.demo === "mock-agentic-loop") { + return ; + } if (page.actors.length === 0) { return ; } @@ -766,6 +771,1106 @@ function ActionRunner({ ); } +type AgenticEntry = { + request_id: string; + idx: number; + created_at: number; +}; + +type AgenticVerification = { + requestId: string; + expectedSeconds: number; + count: number; + contiguous?: boolean; + missing?: number[]; + indexes: number[]; + ok?: boolean; +}; + +type AgenticHistory = { + type: "history"; + totalRows: number; + entries: AgenticEntry[]; + timestamp: number; +}; + +type AgenticDebugEvent = { + type: "debugEvent"; + eventId: string; + name: string; + actorId: string; + connectionId: string | null; + requestId: string | null; + details: Record; + createdAt: number; + replayed: boolean; +}; + +type AgenticServerMessage = + | { type: "hello"; connectionId: string; timestamp: number } + | AgenticHistory + | AgenticDebugEvent + | { + type: "pong"; + probeId: string; + sleepStarted: boolean; + sleepStartedAt: number | null; + timestamp: number; + } + | { type: "started"; requestId: string; seconds: number; timestamp: number } + | { + type: "progress"; + requestId: string; + idx: number; + seconds: number; + createdAt: number; + } + | { + type: "done"; + requestId: string; + seconds: number; + timestamp: number; + verification: AgenticVerification; + } + | (AgenticVerification & { type: "verified" }) + | { type: "error"; message: string; timestamp: number }; + +type AgenticRequest = { + requestId: string; + seconds: number; +}; + +type AgenticHandle = { + resolve: () => Promise; + webSocket: ( + path?: string, + protocols?: string | string[], + options?: { + gateway?: { bypassConnectable?: boolean }; + }, + ) => Promise; + fetch: ( + input: string, + init?: RequestInit & { + gateway?: { 
bypassConnectable?: boolean }; + }, + ) => Promise; + verify: ( + requestId: string, + expectedSeconds: number, + ) => Promise; + verifyAll: (expectedRequests: AgenticRequest[]) => Promise<{ + type: "verifiedAll"; + expectedRequests: number; + expectedTotalRows: number; + totalRows: number; + unexpectedRequestIds: string[]; + requests: AgenticVerification[]; + ok: boolean; + }>; +}; + +type ActiveAgenticRequest = { + requestId: string; + seconds: number; + expectedIdx: number; + received: number[]; + lastProgressAt: number; + startedAt: number; +}; + +type AgenticLogEntry = { + id: string; + level: "ok" | "warn" | "error" | "info"; + message: string; + time: string; +}; + +function randomAgenticKey() { + return `manual-agentic-${new Date().toISOString()}-${crypto.randomUUID()}`; +} + +function nowTime() { + return new Date().toLocaleTimeString(); +} + +function appendEndpointPath(endpoint: string, path: string): URL { + const url = new URL(endpoint); + const prefix = url.pathname.replace(/\/$/, ""); + url.pathname = `${prefix}${path}`; + url.search = ""; + url.hash = ""; + return url; +} + +function waitForSocketOpen(socket: WebSocket, timeoutMs = 10_000) { + if (socket.readyState === WebSocket.OPEN) return Promise.resolve(); + + return new Promise((resolve, reject) => { + const timeout = setTimeout(() => { + cleanup(); + reject(new Error(`websocket open timed out after ${timeoutMs}ms`)); + }, timeoutMs); + const cleanup = () => { + clearTimeout(timeout); + socket.removeEventListener("open", onOpen); + socket.removeEventListener("close", onClose); + socket.removeEventListener("error", onError); + }; + const onOpen = () => { + cleanup(); + resolve(); + }; + const onClose = (event: CloseEvent) => { + cleanup(); + reject( + new Error( + `websocket closed before open code=${event.code} reason=${event.reason}`, + ), + ); + }; + const onError = () => { + cleanup(); + reject(new Error("websocket open error")); + }; + socket.addEventListener("open", onOpen, { once: true 
}); + socket.addEventListener("close", onClose, { once: true }); + socket.addEventListener("error", onError, { once: true }); + }); +} + +function validateAgenticRows( + entries: AgenticEntry[], + expectedRequests: AgenticRequest[], + activeRequest?: ActiveAgenticRequest | null, +) { + const expectedByRequest = new Map( + expectedRequests.map((request) => [request.requestId, request.seconds]), + ); + const rowsByRequest = new Map(); + + for (const entry of entries) { + const rows = rowsByRequest.get(entry.request_id) ?? []; + rows.push(entry); + rowsByRequest.set(entry.request_id, rows); + } + + const problems: string[] = []; + for (const request of expectedRequests) { + const rows = rowsByRequest.get(request.requestId) ?? []; + const indexes = rows.map((row) => row.idx); + const contiguous = + rows.length === request.seconds && + indexes.every((idx, offset) => idx === offset + 1); + if (!contiguous) { + problems.push( + `${request.requestId.slice(0, 8)} expected ${request.seconds}, got [${indexes.join(", ")}]`, + ); + } + } + + if (activeRequest) { + const rows = rowsByRequest.get(activeRequest.requestId) ?? []; + const indexes = rows.map((row) => row.idx); + const contiguousPrefix = indexes.every( + (idx, offset) => idx === offset + 1, + ); + if (rows.length > activeRequest.seconds || !contiguousPrefix) { + problems.push( + `${activeRequest.requestId.slice(0, 8)} active request expected partial 1-${activeRequest.seconds}, got [${indexes.join(", ")}]`, + ); + } + } + + for (const requestId of rowsByRequest.keys()) { + if ( + !expectedByRequest.has(requestId) && + requestId !== activeRequest?.requestId + ) { + problems.push(`${requestId.slice(0, 8)} was not expected`); + } + } + + return { + ok: problems.length === 0, + problems, + expectedRows: expectedRequests.reduce( + (total, request) => total + request.seconds, + 0, + ) + (activeRequest?.received.length ?? 
0), + }; + } + +function sleepStatusFromPayload( + source: string, + payload: { sleepStarted?: unknown; sleepStartedAt?: unknown }, +) { + if (typeof payload.sleepStarted !== "boolean") { + throw new Error(`${source} missing boolean sleepStarted`); + } + if (payload.sleepStarted && typeof payload.sleepStartedAt !== "number") { + throw new Error(`${source} missing numeric sleepStartedAt`); + } + if (!payload.sleepStarted && payload.sleepStartedAt !== null) { + throw new Error(`${source} expected null sleepStartedAt before sleep`); + } + return { + sleepStarted: payload.sleepStarted, + sleepStartedAt: payload.sleepStartedAt, + }; +} + +function formatDebugDetails(details: Record) { + const entries = Object.entries(details).filter( + ([, value]) => value !== undefined && value !== null, + ); + if (entries.length === 0) return ""; + + return ` ${entries + .map(([key, value]) => `${key}=${String(value)}`) + .join(" ")}`; +} + +function formatAgenticDebugEvent(event: AgenticDebugEvent) { + const actorTime = new Date(event.createdAt).toLocaleTimeString(); + const lagMs = Date.now() - event.createdAt; + const connection = event.connectionId + ? ` conn=${event.connectionId.slice(0, 8)}` + : ""; + const request = event.requestId + ? ` req=${event.requestId.slice(0, 8)}` + : ""; + const replay = event.replayed ? 
" replay" : ""; + + return `actor${replay} ${event.name} at ${actorTime} lagMs=${lagMs}${connection}${request}${formatDebugDetails(event.details)}`; +} + +function MockAgenticLoopPanel({ page }: { page: PageConfig }) { + const [endpoint, setEndpoint] = usePersistedState( + "kitchen-sink:mock-agentic-loop:endpoint", + rivetEndpoint, + ); + const [namespace, setNamespace] = usePersistedState( + "kitchen-sink:mock-agentic-loop:namespace", + "default", + ); + const [token, setToken] = usePersistedState( + "kitchen-sink:mock-agentic-loop:token", + "dev", + ); + const [key, setKey] = useState(randomAgenticKey); + const [actorId, setActorId] = useState(""); + const [connectionStatus, setConnectionStatus] = useState("idle"); + const [seconds, setSeconds] = useState(16); + const [progressMarginMs, setProgressMarginMs] = useState(8_000); + const [currentRequest, setCurrentRequest] = useState<{ + requestId: string; + seconds: number; + received: number[]; + } | null>(null); + const [expectedRequests, setExpectedRequests] = useState<AgenticRequest[]>([]); + const [lastVerification, setLastVerification] = useState("No requests yet."); + const [lastHistory, setLastHistory] = useState("No history loaded yet."); + const [lastBypass, setLastBypass] = useState("No bypass requests yet."); + const [isConnecting, setIsConnecting] = useState(false); + const [isRunningInference, setIsRunningInference] = useState(false); + const [stats, setStats] = useState({ + requests: 0, + expectedRows: 0, + actualRows: 0, + reconnects: 0, + maxReconnectMs: 0, + sleepPosts: 0, + sleepErrors: 0, + bypassHttpOk: 0, + bypassWsOk: 0, + actorStopping: 0, + sleepProofHttp: 0, + sleepProofWs: 0, + validationErrors: 0, + }); + const [logs, setLogs] = useState<AgenticLogEntry[]>([]); + + const handleRef = useRef<AgenticHandle | null>(null); + const socketRef = useRef<WebSocket | null>(null); + const expectedRequestsRef = useRef<AgenticRequest[]>([]); + const activeRequestRef = useRef<ActiveAgenticRequest | null>(null); + const reconnectTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null); + const progressTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null); + const 
reconnectStartedAtRef = useRef(null); + const mainSocketCleanupRef = useRef<(() => void) | null>(null); + const closedByUserRef = useRef(false); + + const addLog = useCallback( + (level: AgenticLogEntry["level"], message: string) => { + setLogs((prev) => [ + { + id: crypto.randomUUID(), + level, + message, + time: nowTime(), + }, + ...prev.slice(0, 159), + ]); + }, + [], + ); + + const clearProgressTimer = useCallback(() => { + if (progressTimerRef.current) { + clearTimeout(progressTimerRef.current); + progressTimerRef.current = null; + } + }, []); + + const markValidationError = useCallback((message: string) => { + setStats((prev) => ({ + ...prev, + validationErrors: prev.validationErrors + 1, + })); + setLastVerification(message); + addLog("error", message); + }, [addLog]); + + const scheduleProgressTimeout = useCallback(() => { + clearProgressTimer(); + const active = activeRequestRef.current; + if (!active) return; + const timeoutMs = 1_000 + progressMarginMs; + progressTimerRef.current = setTimeout(() => { + const latest = activeRequestRef.current; + if (!latest) return; + markValidationError( + `progress timeout for ${latest.requestId.slice(0, 8)} at idx=${latest.expectedIdx}`, + ); + }, timeoutMs); + }, [clearProgressTimer, markValidationError, progressMarginMs]); + + const resetSession = useCallback(() => { + closedByUserRef.current = true; + if (reconnectTimerRef.current) clearTimeout(reconnectTimerRef.current); + clearProgressTimer(); + mainSocketCleanupRef.current?.(); + mainSocketCleanupRef.current = null; + socketRef.current?.close(1000, "new actor"); + socketRef.current = null; + handleRef.current = null; + expectedRequestsRef.current = []; + activeRequestRef.current = null; + setKey(randomAgenticKey()); + setActorId(""); + setConnectionStatus("idle"); + setCurrentRequest(null); + setExpectedRequests([]); + setIsRunningInference(false); + setLastVerification("No requests yet."); + setLastHistory("No history loaded yet."); + setLastBypass("No bypass 
requests yet."); + setStats({ + requests: 0, + expectedRows: 0, + actualRows: 0, + reconnects: 0, + maxReconnectMs: 0, + sleepPosts: 0, + sleepErrors: 0, + bypassHttpOk: 0, + bypassWsOk: 0, + actorStopping: 0, + sleepProofHttp: 0, + sleepProofWs: 0, + validationErrors: 0, + }); + setLogs([]); + }, [clearProgressTimer]); + + const requestHistory = useCallback(() => { + if (socketRef.current?.readyState !== WebSocket.OPEN) return; + socketRef.current.send(JSON.stringify({ type: "history" })); + addLog("info", "history requested"); + }, [addLog]); + + const verifyAll = useCallback(async () => { + const handle = handleRef.current; + if (!handle) return; + const result = await handle.verifyAll(expectedRequestsRef.current); + if (!result.ok) { + markValidationError(`aggregate verification failed: ${formatJson(result)}`); + return; + } + setStats((prev) => ({ + ...prev, + actualRows: result.totalRows, + expectedRows: result.expectedTotalRows, + })); + addLog( + "ok", + `verified all requests=${result.expectedRequests} rows=${result.totalRows}`, + ); + }, [addLog, markValidationError]); + + const handleHistory = useCallback((message: AgenticHistory) => { + const validation = validateAgenticRows( + message.entries, + expectedRequestsRef.current, + activeRequestRef.current, + ); + setStats((prev) => ({ + ...prev, + actualRows: message.totalRows, + expectedRows: validation.expectedRows, + validationErrors: validation.ok + ? 
prev.validationErrors + : prev.validationErrors + 1, + })); + if (validation.ok) { + setLastHistory( + `history ok: rows=${message.totalRows}, expected=${validation.expectedRows}`, + ); + addLog("ok", `history rows=${message.totalRows}`); + } else { + const text = `history mismatch: ${validation.problems.join("; ")}`; + setLastHistory(text); + addLog("error", text); + } + }, [addLog]); + + const handleProgress = useCallback((message: Extract<AgenticServerMessage, { type: "progress" }>) => { + const active = activeRequestRef.current; + if (!active || active.requestId !== message.requestId) { + markValidationError(`unexpected progress for ${message.requestId.slice(0, 8)}`); + return; + } + const now = performance.now(); + const gapMs = now - active.lastProgressAt; + if (message.idx !== active.expectedIdx) { + markValidationError( + `expected idx=${active.expectedIdx}, got idx=${message.idx}`, + ); + } + active.received.push(message.idx); + active.expectedIdx += 1; + active.lastProgressAt = now; + setCurrentRequest({ + requestId: active.requestId, + seconds: active.seconds, + received: [...active.received], + }); + addLog( + "info", + `progress ${message.idx}/${message.seconds} gapMs=${gapMs.toFixed(0)}`, + ); + scheduleProgressTimeout(); + }, [addLog, markValidationError, scheduleProgressTimeout]); + + const handleDone = useCallback(async (message: Extract<AgenticServerMessage, { type: "done" }>) => { + const active = activeRequestRef.current; + clearProgressTimer(); + setIsRunningInference(false); + activeRequestRef.current = null; + + if (!active || active.requestId !== message.requestId) { + markValidationError(`unexpected done for ${message.requestId.slice(0, 8)}`); + return; + } + + const contiguous = + active.received.length === active.seconds && + active.received.every((idx, offset) => idx === offset + 1); + if (!contiguous || !message.verification.ok) { + markValidationError( + `done verification failed: stream=[${active.received.join(", ")}], actor=${formatJson(message.verification)}`, + ); + return; + } + + const handle = 
handleRef.current; + if (handle) { + const explicit = await handle.verify(active.requestId, active.seconds); + const explicitOk = + explicit.count === active.seconds && + explicit.indexes.every((idx, offset) => idx === offset + 1); + if (!explicitOk) { + markValidationError( + `action verification failed: ${formatJson(explicit)}`, + ); + return; + } + } + + const completed = { + requestId: active.requestId, + seconds: active.seconds, + }; + expectedRequestsRef.current = [...expectedRequestsRef.current, completed]; + setExpectedRequests(expectedRequestsRef.current); + setStats((prev) => ({ + ...prev, + requests: prev.requests + 1, + expectedRows: prev.expectedRows + active.seconds, + })); + setLastVerification( + `request ${active.requestId.slice(0, 8)} ok: ${active.seconds}/${active.seconds} rows`, + ); + addLog( + "ok", + `done ${active.requestId.slice(0, 8)} rows=${active.seconds}`, + ); + await verifyAll(); + requestHistory(); + }, [addLog, clearProgressTimer, markValidationError, requestHistory, verifyAll]); + + const onSocketMessage = useCallback((event: MessageEvent) => { + if (typeof event.data !== "string") return; + const message = JSON.parse(event.data) as AgenticServerMessage; + if (message.type === "hello") { + addLog("ok", `main ws hello ${message.connectionId.slice(0, 8)}`); + return; + } + if (message.type === "history") { + handleHistory(message); + return; + } + if (message.type === "debugEvent") { + const level = + message.name === "onSleepStart" || message.name === "webSocketClose" + ? 
"warn" + : "info"; + addLog(level, formatAgenticDebugEvent(message)); + return; + } + if (message.type === "started") { + addLog("ok", `started ${message.requestId.slice(0, 8)} seconds=${message.seconds}`); + return; + } + if (message.type === "progress") { + handleProgress(message); + return; + } + if (message.type === "done") { + void handleDone(message); + return; + } + if (message.type === "error") { + markValidationError(`actor error: ${message.message}`); + } + }, [addLog, handleDone, handleHistory, handleProgress, markValidationError]); + + const connect = useCallback(async (countReconnect = false) => { + if (isConnecting) return; + setIsConnecting(true); + setConnectionStatus("connecting"); + closedByUserRef.current = false; + const startedAt = performance.now(); + + try { + const client = createClient({ + endpoint, + namespace, + token, + encoding: "json", + }); + const handle = client.mockAgenticLoop.getOrCreate([key]) as AgenticHandle; + handleRef.current = handle; + const resolvedActorId = await handle.resolve(); + setActorId(resolvedActorId); + + const socket = await handle.webSocket(); + await waitForSocketOpen(socket); + socketRef.current = socket; + const onClose = (event: CloseEvent) => { + if (socketRef.current === socket) socketRef.current = null; + setConnectionStatus("closed"); + const closedLocally = closedByUserRef.current; + addLog( + closedLocally ? "info" : "warn", + `${closedLocally ? 
"local" : "remote"} main ws close code=${event.code} reason=${event.reason}`, + ); + if (!closedLocally) { + reconnectStartedAtRef.current = performance.now(); + reconnectTimerRef.current = setTimeout(() => { + void connect(true); + }, 500); + } + }; + const onError = () => { + addLog("error", "main ws error"); + }; + socket.addEventListener("message", onSocketMessage); + socket.addEventListener("close", onClose); + socket.addEventListener("error", onError); + mainSocketCleanupRef.current = () => { + socket.removeEventListener("message", onSocketMessage); + socket.removeEventListener("close", onClose); + socket.removeEventListener("error", onError); + }; + + const elapsedMs = performance.now() - startedAt; + setConnectionStatus("connected"); + if (countReconnect || reconnectStartedAtRef.current !== null) { + const reconnectMs = reconnectStartedAtRef.current === null + ? elapsedMs + : performance.now() - reconnectStartedAtRef.current; + reconnectStartedAtRef.current = null; + setStats((prev) => ({ + ...prev, + reconnects: prev.reconnects + 1, + maxReconnectMs: Math.max(prev.maxReconnectMs, reconnectMs), + })); + addLog("ok", `reconnected in ${reconnectMs.toFixed(0)}ms`); + } else { + addLog("ok", `connected actor=${resolvedActorId}`); + } + requestHistory(); + } catch (error) { + const message = error instanceof Error ? 
error.message : String(error); + setConnectionStatus("error"); + addLog("error", `connect failed: ${message}`); + } finally { + setIsConnecting(false); + } + }, [addLog, endpoint, isConnecting, key, namespace, onSocketMessage, requestHistory, token]); + + const disconnect = useCallback(() => { + closedByUserRef.current = true; + if (reconnectTimerRef.current) clearTimeout(reconnectTimerRef.current); + clearProgressTimer(); + mainSocketCleanupRef.current?.(); + mainSocketCleanupRef.current = null; + socketRef.current?.close(1000, "manual disconnect"); + socketRef.current = null; + setConnectionStatus("closed"); + addLog("warn", "main ws disconnected by client"); + }, [addLog, clearProgressTimer]); + + const runInference = useCallback(() => { + const socket = socketRef.current; + if (!socket || socket.readyState !== WebSocket.OPEN) { + addLog("error", "main websocket is not connected"); + return; + } + if (activeRequestRef.current) { + addLog("warn", "inference already active"); + return; + } + const safeSeconds = Math.max(1, Math.floor(seconds)); + const requestId = crypto.randomUUID(); + activeRequestRef.current = { + requestId, + seconds: safeSeconds, + expectedIdx: 1, + received: [], + lastProgressAt: performance.now(), + startedAt: performance.now(), + }; + setCurrentRequest({ requestId, seconds: safeSeconds, received: [] }); + setIsRunningInference(true); + socket.send(JSON.stringify({ type: "infer", requestId, seconds: safeSeconds })); + addLog("info", `infer ${requestId.slice(0, 8)} seconds=${safeSeconds}`); + scheduleProgressTimeout(); + }, [addLog, scheduleProgressTimeout, seconds]); + + const forceSleep = useCallback(async () => { + if (!actorId) { + addLog("error", "resolve an actor before forcing sleep"); + return; + } + const url = appendEndpointPath( + endpoint, + `/actors/${encodeURIComponent(actorId)}/sleep`, + ); + url.searchParams.set("namespace", namespace); + setStats((prev) => ({ ...prev, sleepPosts: prev.sleepPosts + 1 })); + addLog("warn", 
"sleep post sent"); + try { + const response = await fetch(url, { + method: "POST", + headers: { + Authorization: token ? `Bearer ${token}` : "", + "content-type": "application/json", + }, + body: "{}", + }); + const text = await response.text(); + if (!response.ok) { + setStats((prev) => ({ ...prev, sleepErrors: prev.sleepErrors + 1 })); + addLog("error", `sleep ${response.status}: ${text}`); + return; + } + addLog("ok", `sleep ${response.status}`); + } catch (error) { + setStats((prev) => ({ ...prev, sleepErrors: prev.sleepErrors + 1 })); + addLog("error", `sleep failed: ${error instanceof Error ? error.message : String(error)}`); + } + }, [actorId, addLog, endpoint, namespace, token]); + + const noteActorStopping = useCallback((label: string, status: number, text: string) => { + setStats((prev) => ({ ...prev, actorStopping: prev.actorStopping + 1 })); + setLastBypass(`${label}: actor.stopping (${status})`); + addLog("warn", `${label} actor.stopping ${text}`); + }, [addLog]); + + const testHttpBypass = useCallback(async () => { + const handle = handleRef.current; + if (!handle) { + addLog("error", "connect before testing bypass"); + return; + } + try { + const response = await handle.fetch("/bypass", { + gateway: { bypassConnectable: true }, + }); + const text = await response.text(); + if (!response.ok) { + if (text.includes('"code":"stopping"')) { + noteActorStopping("http bypass", response.status, text); + return; + } + setLastBypass(`http bypass failed ${response.status}: ${text}`); + addLog("error", `http bypass ${response.status}: ${text}`); + return; + } + const payload = JSON.parse(text) as { + type?: string; + transport?: string; + sleepStarted?: unknown; + sleepStartedAt?: unknown; + }; + const sleepStatus = sleepStatusFromPayload("http bypass", payload); + if (payload.type !== "bypass" || payload.transport !== "http") { + throw new Error(`unexpected body ${text}`); + } + setStats((prev) => ({ + ...prev, + bypassHttpOk: prev.bypassHttpOk + 1, + 
sleepProofHttp: + prev.sleepProofHttp + (sleepStatus.sleepStarted ? 1 : 0), + })); + setLastBypass( + `http bypass ok: sleepStarted=${sleepStatus.sleepStarted}`, + ); + addLog("ok", `http bypass sleepStarted=${sleepStatus.sleepStarted}`); + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + setLastBypass(`http bypass error: ${message}`); + addLog("error", `http bypass error: ${message}`); + } + }, [addLog, noteActorStopping]); + + const testWebSocketBypass = useCallback(async () => { + const handle = handleRef.current; + if (!handle) { + addLog("error", "connect before testing bypass"); + return; + } + const probeId = crypto.randomUUID(); + let socket: WebSocket | null = null; + try { + socket = await handle.webSocket("/bypass", undefined, { + gateway: { bypassConnectable: true }, + }); + await waitForSocketOpen(socket); + const result = await new Promise<Extract<AgenticServerMessage, { type: "pong" }>>( + (resolve, reject) => { + const timeout = setTimeout(() => { + cleanup(); + reject(new Error("timed out waiting for bypass pong")); + }, 10_000); + const cleanup = () => { + clearTimeout(timeout); + socket?.removeEventListener("message", onMessage); + socket?.removeEventListener("close", onClose); + socket?.removeEventListener("error", onError); + }; + const onMessage = (event: MessageEvent) => { + if (typeof event.data !== "string") return; + const message = JSON.parse(event.data) as AgenticServerMessage; + if (message.type !== "pong" || message.probeId !== probeId) return; + cleanup(); + resolve(message); + }; + const onClose = (event: CloseEvent) => { + cleanup(); + reject( + new Error(`closed code=${event.code} reason=${event.reason}`), + ); + }; + const onError = () => { + cleanup(); + reject(new Error("websocket error")); + }; + socket?.addEventListener("message", onMessage); + socket?.addEventListener("close", onClose, { once: true }); + socket?.addEventListener("error", onError, { once: true }); + socket?.send(JSON.stringify({ type: "ping", probeId })); + }, + 
); + const sleepStatus = sleepStatusFromPayload("ws bypass", result); + setStats((prev) => ({ + ...prev, + bypassWsOk: prev.bypassWsOk + 1, + sleepProofWs: prev.sleepProofWs + (sleepStatus.sleepStarted ? 1 : 0), + })); + setLastBypass(`ws bypass ok: sleepStarted=${sleepStatus.sleepStarted}`); + addLog("ok", `ws bypass sleepStarted=${sleepStatus.sleepStarted}`); + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + if (message.includes("actor.stopping") || message.includes("Server Error")) { + setStats((prev) => ({ ...prev, actorStopping: prev.actorStopping + 1 })); + setLastBypass(`ws bypass transient close: ${message}`); + addLog("warn", `ws bypass transient close: ${message}`); + } else { + setLastBypass(`ws bypass error: ${message}`); + addLog("error", `ws bypass error: ${message}`); + } + } finally { + if ( + socket && + (socket.readyState === WebSocket.OPEN || + socket.readyState === WebSocket.CONNECTING) + ) { + socket.close(1000, "bypass probe complete"); + } + } + }, [addLog]); + + useEffect(() => { + return () => { + if (reconnectTimerRef.current) clearTimeout(reconnectTimerRef.current); + clearProgressTimer(); + mainSocketCleanupRef.current?.(); + mainSocketCleanupRef.current = null; + }; + }, [clearProgressTimer]); + + const currentIndexes = currentRequest?.received ?? []; + const invariantStatus = + stats.validationErrors === 0 ? "pass" : "fail"; + + return ( +
+
+
+
+

Mock Agentic Loop

+

+ Use one raw WebSocket stream, explicit actions, manual sleep, and + gateway bypass calls against the same actor. +

+
+
+ {connectionStatus} +
+
+ +
+
+ + setEndpoint(event.target.value)} + /> +
+
+ + setNamespace(event.target.value)} + /> +
+
+ + setToken(event.target.value)} + /> +
+
+ +
+
+
Key
+
{key}
+
+
+
Actor ID
+
{actorId || "not resolved"}
+
+
+ +
+ + + +
+
+ +
+
+

Inference

+
+ {stats.validationErrors === 0 ? "valid" : "invalid"} +
+
+
+
+ + setSeconds(Number(event.target.value))} + /> +
+
+ + setProgressMarginMs(Number(event.target.value))} + /> +
+
+
+ + +
+
+ {currentRequest ? ( + <> +
+ {currentRequest.requestId.slice(0, 8)} received{" "} + {currentIndexes.length}/{currentRequest.seconds} +
+
+ {Array.from({ length: currentRequest.seconds }, (_, index) => { + const idx = index + 1; + const received = currentIndexes.includes(idx); + return ( + + {idx} + + ); + })} +
+ + ) : ( +
No active inference.
+ )} +
+
+ +
+
+

Sleep and Bypass

+
+
+ + + +
+
{lastBypass}
+
+ +
+
+

Event Log

+ +
+
+ {logs.length === 0 ? ( +
No activity yet.
+ ) : ( + logs.map((entry) => ( +
+ {entry.time} + {entry.message} +
+ )) + )} +
+
+ +
+
+

Validation

+
+
+ + + + + + + + + + + + +
+
{lastVerification}
+
{lastHistory}
+
+ +
+
+ Source +
+ +
+
+ ); +} + +function AgenticStat({ + label, + value, +}: { + label: string; + value: string | number; +}) { + return ( +
+ {label} + {value} +
+ ); +} + // ── Welcome / Diagram / Config ──────────────────── function WelcomePanel() { diff --git a/examples/kitchen-sink/frontend/page-data.ts b/examples/kitchen-sink/frontend/page-data.ts index a5b9c3fc2f..aafaedb8e6 100644 --- a/examples/kitchen-sink/frontend/page-data.ts +++ b/examples/kitchen-sink/frontend/page-data.ts @@ -10,7 +10,13 @@ export type ActionTemplate = { description?: string; }; -export type DemoType = "actions" | "config" | "diagram" | "raw-http" | "raw-websocket"; +export type DemoType = + | "actions" + | "config" + | "diagram" + | "mock-agentic-loop" + | "raw-http" + | "raw-websocket"; export type PageConfig = { id: string; @@ -182,6 +188,19 @@ await actor.runCycle({ rowBytes: 16384, deleteRows: 64, retainRows: 1024, +});`, + mockAgenticLoop: `const client = createClient({ endpoint, encoding: "json" }); +const actor = client.mockAgenticLoop.getOrCreate([key]); +const ws = await actor.webSocket(); + +ws.send(JSON.stringify({ type: "infer", requestId, seconds })); + +await actor.fetch("/bypass", { + gateway: { bypassConnectable: true }, +}); + +await actor.webSocket("/bypass", undefined, { + gateway: { bypassConnectable: true }, });`, }; @@ -1282,6 +1301,16 @@ export const PAGE_GROUPS: PageGroup[] = [ T -->|action| A A -->|result| T`, }, + { + id: "mock-agentic-loop", + title: "Mock Agentic Loop", + description: + "Manually test streaming, SQLite durability, forced sleep, reconnects, and gateway bypass against one actor.", + docs: [], + actors: ["mockAgenticLoop"], + snippet: SNIPPETS.mockAgenticLoop, + demo: "mock-agentic-loop", + }, { id: "sqlite-memory-pressure", title: "SQLite Memory Pressure", diff --git a/examples/kitchen-sink/index.html b/examples/kitchen-sink/index.html index f96a17491d..f14c3cc97e 100644 --- a/examples/kitchen-sink/index.html +++ b/examples/kitchen-sink/index.html @@ -712,6 +712,214 @@ color: var(--muted); } + /* ── Mock Agentic Loop Lab ───────────── */ + + .agentic-lab { + display: grid; + grid-template-columns: 
repeat(2, minmax(0, 1fr)); + gap: 16px; + } + + .agentic-lab .demo-code-bottom { + grid-column: 1 / -1; + border: 1px solid var(--border-strong); + border-radius: var(--radius); + overflow: hidden; + background: var(--panel); + } + + .agentic-panel { + background: var(--panel); + border: 1px solid var(--border-strong); + border-radius: var(--radius); + padding: 16px; + display: flex; + flex-direction: column; + gap: 14px; + min-width: 0; + } + + .agentic-panel-header { + display: flex; + align-items: flex-start; + justify-content: space-between; + gap: 12px; + } + + .agentic-grid { + display: grid; + grid-template-columns: repeat(3, minmax(0, 1fr)); + gap: 10px; + } + + .agentic-grid.compact { + grid-template-columns: repeat(2, minmax(0, 1fr)); + } + + .agentic-session-row { + display: grid; + grid-template-columns: 1fr 1fr; + gap: 12px; + padding: 12px; + background: var(--panel-3); + border: 1px solid var(--border); + border-radius: 8px; + min-width: 0; + } + + .agentic-kicker { + font-size: 10px; + text-transform: uppercase; + letter-spacing: 0.08em; + color: var(--muted-2); + margin-bottom: 4px; + } + + .agentic-mono, + .agentic-result, + .agentic-empty { + font-family: ui-monospace, SFMono-Regular, "SF Mono", Consolas, monospace; + font-size: 12px; + line-height: 1.5; + overflow-wrap: anywhere; + } + + .agentic-result, + .agentic-empty { + background: var(--panel-3); + border: 1px solid var(--border); + border-radius: 8px; + padding: 10px 12px; + color: var(--muted); + } + + .agentic-status { + flex-shrink: 0; + border: 1px solid var(--border-strong); + border-radius: 999px; + padding: 4px 10px; + font-size: 12px; + color: var(--muted); + background: var(--panel-3); + } + + .agentic-status.connected, + .agentic-status.pass { + color: var(--success); + border-color: rgba(48, 209, 88, 0.45); + } + + .agentic-status.connecting { + color: var(--warning); + border-color: rgba(255, 159, 10, 0.5); + } + + .agentic-status.error, + .agentic-status.fail { + color: 
var(--danger); + border-color: rgba(255, 59, 48, 0.5); + } + + .agentic-stream { + min-height: 112px; + background: var(--panel-3); + border: 1px solid var(--border); + border-radius: 8px; + padding: 12px; + } + + .agentic-indexes { + display: grid; + grid-template-columns: repeat(auto-fill, minmax(34px, 1fr)); + gap: 6px; + margin-top: 10px; + } + + .agentic-index { + height: 28px; + display: inline-flex; + align-items: center; + justify-content: center; + border: 1px solid var(--border-strong); + border-radius: 6px; + color: var(--muted); + font-size: 12px; + font-family: ui-monospace, SFMono-Regular, "SF Mono", Consolas, monospace; + } + + .agentic-index.received { + color: var(--success); + border-color: rgba(48, 209, 88, 0.45); + background: rgba(48, 209, 88, 0.1); + } + + .agentic-stat-grid { + display: grid; + grid-template-columns: repeat(4, minmax(0, 1fr)); + gap: 8px; + } + + .agentic-stat { + background: var(--panel-3); + border: 1px solid var(--border); + border-radius: 8px; + padding: 10px; + min-height: 66px; + display: flex; + flex-direction: column; + justify-content: space-between; + gap: 8px; + } + + .agentic-stat span { + color: var(--muted-2); + font-size: 11px; + } + + .agentic-stat strong { + font-size: 16px; + font-family: ui-monospace, SFMono-Regular, "SF Mono", Consolas, monospace; + overflow-wrap: anywhere; + } + + .agentic-log { + background: var(--panel-3); + border: 1px solid var(--border); + border-radius: 8px; + max-height: 360px; + overflow-y: auto; + font-family: ui-monospace, SFMono-Regular, "SF Mono", Consolas, monospace; + font-size: 12px; + } + + .agentic-log-row { + display: grid; + grid-template-columns: 88px 1fr; + gap: 8px; + padding: 8px 10px; + border-bottom: 1px solid var(--border); + } + + .agentic-log-row:last-child { + border-bottom: none; + } + + .agentic-log-row span:first-child { + color: var(--muted-2); + } + + .agentic-log-row.ok span:last-child { + color: var(--success); + } + + .agentic-log-row.warn 
span:last-child { + color: var(--warning); + } + + .agentic-log-row.error span:last-child { + color: var(--danger); + } + /* ── Mermaid ──────────────────────────── */ .mermaid-diagram { @@ -734,6 +942,15 @@ .actor-columns { grid-template-columns: 1fr; } + .agentic-lab { + grid-template-columns: 1fr; + } + .agentic-grid, + .agentic-grid.compact, + .agentic-session-row, + .agentic-stat-grid { + grid-template-columns: 1fr; + } .actor-controls { border-right: none; border-bottom: 1px solid var(--border-strong); diff --git a/examples/kitchen-sink/scripts/mock-agentic-loop.ts b/examples/kitchen-sink/scripts/mock-agentic-loop.ts index ed43180c4c..4f40455daf 100644 --- a/examples/kitchen-sink/scripts/mock-agentic-loop.ts +++ b/examples/kitchen-sink/scripts/mock-agentic-loop.ts @@ -61,9 +61,14 @@ const MAX_RECONNECT_MS = numberFromEnv( "MOCK_AGENTIC_MAX_RECONNECT_MS", 30_000, ); +const DEFAULT_ON_SLEEP_DELAY_MS = 15_000; +const ON_SLEEP_DELAY_MS = numberFromEnv( + "MOCK_AGENTIC_ON_SLEEP_DELAY_MS", + DEFAULT_ON_SLEEP_DELAY_MS, +); const SLEEP_CLOSE_TIMEOUT_MS = numberFromEnv( "MOCK_AGENTIC_SLEEP_CLOSE_TIMEOUT_MS", - 20_000, + ON_SLEEP_DELAY_MS + 30_000, ); const PROBE_INTERVAL_MS = numberFromEnv( "MOCK_AGENTIC_PROBE_INTERVAL_MS", @@ -97,7 +102,13 @@ type ServerMessage = entries: HistoryEntry[]; timestamp: number; } - | { type: "pong"; probeId: string; timestamp: number } + | { + type: "pong"; + probeId: string; + sleepStarted: boolean; + sleepStartedAt: number | null; + timestamp: number; + } | { type: "started"; requestId: string; seconds: number; timestamp: number } | { type: "progress"; @@ -207,9 +218,13 @@ type BypassStats = { httpSuccesses: number; beforeSleepHttpSuccesses: number; afterSleepHttpSuccesses: number; + beforeSleepHttpUnexpectedSleepStarted: number; + afterSleepHttpSleepStarted: number; webSocketSuccesses: number; beforeSleepWebSocketSuccesses: number; afterSleepWebSocketSuccesses: number; + beforeSleepWebSocketUnexpectedSleepStarted: number; + 
afterSleepWebSocketSleepStarted: number; timeouts: BypassObservation[]; errors: BypassObservation[]; }; @@ -219,7 +234,7 @@ type BypassHandle = { input: string, init?: RequestInit & { gateway?: { - bypassConnectable?: boolean; + skipReadyWait?: boolean; }; }, ) => Promise<Response>; @@ -228,7 +243,7 @@ type BypassHandle = { protocols?: string | string[], options?: { gateway?: { - bypassConnectable?: boolean; + skipReadyWait?: boolean; }; }, ) => Promise<WebSocket>; @@ -520,6 +535,8 @@ async function startLocalKitchenSinkServer() { RIVET_RUN_ENGINE: "1", RIVET_ENGINE_BINARY: resolveEngineBinary(), RIVETKIT_RUNTIME: process.env.RIVETKIT_RUNTIME ?? "native", + RIVETKIT_STORAGE_PATH: + process.env.RIVETKIT_STORAGE_PATH ?? dbRoot, RIVET_SERVERLESS_URL: SERVERLESS_URL, RIVET__FILE_SYSTEM__PATH: process.env.RIVET__FILE_SYSTEM__PATH ?? join(dbRoot, "db"), @@ -1155,6 +1172,30 @@ async function runProbeLoop(webSocketUrl: string, stopAt: number) { return stats; } +function validateBypassSleepStatus( + source: string, + value: { + sleepStarted?: unknown; + sleepStartedAt?: unknown; + }, +) { + if (typeof value.sleepStarted !== "boolean") { + throw new Error(`${source} missing boolean sleepStarted`); + } + if (value.sleepStarted) { + if (typeof value.sleepStartedAt !== "number") { + throw new Error(`${source} missing numeric sleepStartedAt`); + } + } else if (value.sleepStartedAt !== null) { + throw new Error(`${source} expected null sleepStartedAt before sleep`); + } + + return { + sleepStarted: value.sleepStarted, + sleepStartedAt: value.sleepStartedAt, + }; +} + async function runBypassAttempt( handle: BypassHandle, stats: BypassStats, @@ -1180,7 +1221,7 @@ async function runBypassAttempt( method: "GET", signal: controller.signal, gateway: { - bypassConnectable: true, + skipReadyWait: true, }, }), "bypass http", @@ -1194,15 +1235,24 @@ async function runBypassAttempt( const body = (await response.json()) as { type?: string; transport?: string; + sleepStarted?: unknown; + sleepStartedAt?: 
unknown; }; if (body.type !== "bypass" || body.transport !== "http") { throw new Error(`unexpected bypass http body ${JSON.stringify(body)}`); } + const sleepStatus = validateBypassSleepStatus("bypass http", body); stats.httpSuccesses += 1; if (phase === "beforeSleep") { stats.beforeSleepHttpSuccesses += 1; + if (sleepStatus.sleepStarted) { + stats.beforeSleepHttpUnexpectedSleepStarted += 1; + } } else { stats.afterSleepHttpSuccesses += 1; + if (sleepStatus.sleepStarted) { + stats.afterSleepHttpSleepStarted += 1; + } } } finally { clearTimeout(abortTimeout); @@ -1211,7 +1261,7 @@ async function runBypassAttempt( const ws = await withTimeout( handle.webSocket("/bypass", undefined, { gateway: { - bypassConnectable: true, + skipReadyWait: true, }, }), "bypass websocket create", @@ -1223,6 +1273,7 @@ async function runBypassAttempt( "bypass websocket open", BYPASS_TIMEOUT_MS, ); + let webSocketSleepStarted = false; const pong = new Promise((resolve, reject) => { const timeoutHandle = setTimeout(() => { cleanup(); @@ -1240,6 +1291,10 @@ async function runBypassAttempt( if (message.type !== "pong" || message.probeId !== probeId) { return; } + webSocketSleepStarted = validateBypassSleepStatus( + "bypass websocket", + message, + ).sleepStarted; cleanup(); resolve(); }; @@ -1264,8 +1319,14 @@ async function runBypassAttempt( stats.webSocketSuccesses += 1; if (phase === "beforeSleep") { stats.beforeSleepWebSocketSuccesses += 1; + if (webSocketSleepStarted) { + stats.beforeSleepWebSocketUnexpectedSleepStarted += 1; + } } else { stats.afterSleepWebSocketSuccesses += 1; + if (webSocketSleepStarted) { + stats.afterSleepWebSocketSleepStarted += 1; + } } } finally { if ( @@ -1300,9 +1361,13 @@ async function runBypassLoop( httpSuccesses: 0, beforeSleepHttpSuccesses: 0, afterSleepHttpSuccesses: 0, + beforeSleepHttpUnexpectedSleepStarted: 0, + afterSleepHttpSleepStarted: 0, webSocketSuccesses: 0, beforeSleepWebSocketSuccesses: 0, afterSleepWebSocketSuccesses: 0, + 
beforeSleepWebSocketUnexpectedSleepStarted: 0, + afterSleepWebSocketSleepStarted: 0, timeouts: [], errors: [], }; @@ -1430,7 +1495,7 @@ async function runWorkload() { }; console.log( - `[start] endpoint=${ENDPOINT} namespace=${NAMESPACE} pool=${POOL_NAME} actorId=${actorId} ${label} durationMs=${DURATION_MS} sleepIntervalMs=${SLEEP_INTERVAL_MS} inferenceSeconds=${INFERENCE_MIN_SECONDS}-${INFERENCE_MAX_SECONDS} jitterMs=${JITTER_MIN_MS}-${JITTER_MAX_MS} probeIntervalMs=${PROBE_INTERVAL_MS} bypassIntervalMs=${BYPASS_INTERVAL_MS}`, + `[start] endpoint=${ENDPOINT} namespace=${NAMESPACE} pool=${POOL_NAME} actorId=${actorId} ${label} durationMs=${DURATION_MS} sleepIntervalMs=${SLEEP_INTERVAL_MS} onSleepDelayMs=${ON_SLEEP_DELAY_MS} sleepCloseTimeoutMs=${SLEEP_CLOSE_TIMEOUT_MS} inferenceSeconds=${INFERENCE_MIN_SECONDS}-${INFERENCE_MAX_SECONDS} jitterMs=${JITTER_MIN_MS}-${JITTER_MAX_MS} probeIntervalMs=${PROBE_INTERVAL_MS} bypassIntervalMs=${BYPASS_INTERVAL_MS}`, ); const session = new RawSession(webSocketUrl, label); @@ -1557,7 +1622,7 @@ async function runWorkload() { await verifyAll(verifier, expectedRequests); console.log( - `[done] actorId=${actorId} key=${key} requests=${requestCount} sleepPosts=${sleepResult.posts} sleepErrors=${sleepResult.errors} reconnects=${reconnectCount} maxReconnectMs=${maxReconnectMs} probeAttempts=${probeResult.attempts} probeSuccesses=${probeResult.successes} probeExpectedCloses=${probeResult.expectedCloses} bypassAttempts=${bypassResult.attempts} bypassBeforeSleepAttempts=${bypassResult.beforeSleepAttempts} bypassAfterSleepAttempts=${bypassResult.afterSleepAttempts} bypassHttpSuccesses=${bypassResult.httpSuccesses} bypassWebSocketSuccesses=${bypassResult.webSocketSuccesses} bypassBeforeSleepHttpSuccesses=${bypassResult.beforeSleepHttpSuccesses} bypassBeforeSleepWebSocketSuccesses=${bypassResult.beforeSleepWebSocketSuccesses} bypassAfterSleepHttpSuccesses=${bypassResult.afterSleepHttpSuccesses} 
bypassAfterSleepWebSocketSuccesses=${bypassResult.afterSleepWebSocketSuccesses} bypassTimeouts=${bypassResult.timeouts.length} bypassErrors=${bypassResult.errors.length}`, + `[done] actorId=${actorId} key=${key} requests=${requestCount} sleepPosts=${sleepResult.posts} sleepErrors=${sleepResult.errors} reconnects=${reconnectCount} maxReconnectMs=${maxReconnectMs} probeAttempts=${probeResult.attempts} probeSuccesses=${probeResult.successes} probeExpectedCloses=${probeResult.expectedCloses} bypassAttempts=${bypassResult.attempts} bypassBeforeSleepAttempts=${bypassResult.beforeSleepAttempts} bypassAfterSleepAttempts=${bypassResult.afterSleepAttempts} bypassHttpSuccesses=${bypassResult.httpSuccesses} bypassWebSocketSuccesses=${bypassResult.webSocketSuccesses} bypassBeforeSleepHttpSuccesses=${bypassResult.beforeSleepHttpSuccesses} bypassBeforeSleepWebSocketSuccesses=${bypassResult.beforeSleepWebSocketSuccesses} bypassAfterSleepHttpSuccesses=${bypassResult.afterSleepHttpSuccesses} bypassAfterSleepWebSocketSuccesses=${bypassResult.afterSleepWebSocketSuccesses} bypassAfterSleepHttpSleepStarted=${bypassResult.afterSleepHttpSleepStarted} bypassAfterSleepWebSocketSleepStarted=${bypassResult.afterSleepWebSocketSleepStarted} bypassTimeouts=${bypassResult.timeouts.length} bypassErrors=${bypassResult.errors.length}`, ); if (DURATION_MS >= SLEEP_INTERVAL_MS && sleepResult.posts === 0) { @@ -1616,9 +1681,36 @@ async function runWorkload() { `bypass loop had pre-sleep failures: ${JSON.stringify(bypassResult)}`, ); } + if ( + bypassResult.beforeSleepHttpUnexpectedSleepStarted > 0 || + bypassResult.beforeSleepWebSocketUnexpectedSleepStarted > 0 + ) { + throw new Error( + `bypass saw sleepStarted before sleep: ${JSON.stringify(bypassResult)}`, + ); + } if (sleepResult.posts > 0 && bypassResult.afterSleepAttempts === 0) { throw new Error("bypass loop did not continue after sleep request"); } + if (sleepResult.posts > 0 && bypassResult.afterSleepHttpSuccesses === 0) { + throw new 
Error("bypass http had no successful after-sleep actor responses"); + } + if (sleepResult.posts > 0 && bypassResult.afterSleepWebSocketSuccesses === 0) { + throw new Error("bypass websocket had no successful after-sleep actor responses"); + } + if (sleepResult.posts > 0 && bypassResult.afterSleepHttpSleepStarted === 0) { + throw new Error( + `bypass http never returned actor sleepStarted proof: ${JSON.stringify(bypassResult)}`, + ); + } + if ( + sleepResult.posts > 0 && + bypassResult.afterSleepWebSocketSleepStarted === 0 + ) { + throw new Error( + `bypass websocket never returned actor sleepStarted proof: ${JSON.stringify(bypassResult)}`, + ); + } } async function main() { diff --git a/examples/kitchen-sink/scripts/sqlite-realworld-bench.ts b/examples/kitchen-sink/scripts/sqlite-realworld-bench.ts index 02ea4bb491..5fe05832c0 100644 --- a/examples/kitchen-sink/scripts/sqlite-realworld-bench.ts +++ b/examples/kitchen-sink/scripts/sqlite-realworld-bench.ts @@ -25,8 +25,6 @@ const DEFAULT_STARTUP_PRELOAD_MAX_BYTES = 1024 * 1024; const DEFAULT_STARTUP_PRELOAD_FIRST_PAGE_COUNT = 1; const DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES = 50_000; const DEFAULT_VFS_PROTECTED_CACHE_PAGES = 512; -const DEFAULT_READ_POOL_MAX_READERS = 4; -const DEFAULT_READ_POOL_IDLE_TTL_MS = 60_000; const DEFAULT_BENCH_VFS_ROUND_TRIP_LATENCY_MS = 10; const BENCH_VFS_ROUND_TRIP_LATENCY_MS_ENV = "RIVETKIT_SQLITE_BENCH_VFS_ROUND_TRIP_LATENCY_MS"; @@ -47,19 +45,12 @@ const SQLITE_OPT_BOOLEAN_ENVS = [ "RIVETKIT_SQLITE_OPT_PRELOAD_HINT_HOT_PAGES", "RIVETKIT_SQLITE_OPT_PRELOAD_HINT_EARLY_PAGES", "RIVETKIT_SQLITE_OPT_PRELOAD_HINT_SCAN_RANGES", - "RIVETKIT_SQLITE_OPT_CACHE_GET_PAGES_VALIDATION", - "RIVETKIT_SQLITE_OPT_RANGE_READS", - "RIVETKIT_SQLITE_OPT_BATCH_CHUNK_READS", - "RIVETKIT_SQLITE_OPT_DECODED_LTX_CACHE", - "RIVETKIT_SQLITE_OPT_READ_POOL_ENABLED", ] as const; const SQLITE_OPT_NUMERIC_ENVS = [ "RIVETKIT_SQLITE_OPT_STARTUP_PRELOAD_MAX_BYTES", "RIVETKIT_SQLITE_OPT_STARTUP_PRELOAD_FIRST_PAGE_COUNT", 
"RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_CAPACITY_PAGES", "RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES", - "RIVETKIT_SQLITE_OPT_READ_POOL_MAX_READERS", - "RIVETKIT_SQLITE_OPT_READ_POOL_IDLE_TTL_MS", ] as const; const WORKLOADS = [ @@ -187,9 +178,6 @@ interface MatrixScenarioReport { fetchedPages: number; cacheHits: number; cacheMisses: number; - routedReads: number; - writeFallbacks: number; - modeTransitions: number; }>; } @@ -204,7 +192,6 @@ interface BenchmarkResult { setup: SetupResult | null; main: MainResult; vfsMetrics: VfsMetricSnapshot; - readPoolMetrics: ReadPoolMetricSnapshot; } interface VfsMetricSnapshot { @@ -221,23 +208,6 @@ interface VfsMetricSnapshot { getPagesDurationSecondsCount: number; } -interface ReadPoolMetricSnapshot { - activeReaders: number; - idleReaders: number; - readWaitDurationSecondsSum: number; - readWaitDurationSecondsCount: number; - writeWaitDurationSecondsSum: number; - writeWaitDurationSecondsCount: number; - routedReadQueriesTotal: number; - writeFallbackQueriesTotal: number; - manualTransactionDurationSecondsSum: number; - manualTransactionDurationSecondsCount: number; - readerOpensTotal: number; - readerClosesTotal: number; - rejectedReaderMutationsTotal: number; - modeTransitionsTotal: number; -} - const WORKLOAD_SPECS: WorkloadSpec[] = [ { // Included to keep tiny actor databases honest while we optimize larger datasets. @@ -320,16 +290,14 @@ const WORKLOAD_SPECS: WorkloadSpec[] = [ description: "Selective tenant/time-range aggregate over events joined to orders.", }, { - // Included to measure future read-mode parallelism where several read-only SQLite connections overlap VFS misses. - // Today this captures the serialized baseline; after the connection manager lands, independent aggregate reads should overlap. + // Included to measure concurrent read-only aggregate pressure over the same VFS and transport. 
name: "parallel-read-aggregates", category: "read", sizeClass: "large", description: "Concurrent read-only aggregates over one actor-local SQLite database.", }, { - // Included to measure the read-mode to write-mode transition. - // Future write mode must wait for active readers, close them, run exactly one writable connection, then allow fresh readers. + // Included to measure read pressure queued alongside a write update. name: "parallel-read-write-transition", category: "write", sizeClass: "medium", @@ -417,7 +385,6 @@ const WORKLOAD_SPECS: WorkloadSpec[] = [ }, { // Included to model tool-like fan-out where an agent asks for recent rows, indexed rows, and aggregates concurrently. - // Parallel read pool routing should make these independent read-only statements overlap once TS serialization is gone. name: "chat-tool-read-fanout", category: "read", sizeClass: "large", @@ -525,11 +492,6 @@ function defaultSqliteOptimizationEnv(): Record { RIVETKIT_SQLITE_OPT_PRELOAD_HINT_HOT_PAGES: "true", RIVETKIT_SQLITE_OPT_PRELOAD_HINT_EARLY_PAGES: "true", RIVETKIT_SQLITE_OPT_PRELOAD_HINT_SCAN_RANGES: "true", - RIVETKIT_SQLITE_OPT_CACHE_GET_PAGES_VALIDATION: "true", - RIVETKIT_SQLITE_OPT_RANGE_READS: "true", - RIVETKIT_SQLITE_OPT_BATCH_CHUNK_READS: "true", - RIVETKIT_SQLITE_OPT_DECODED_LTX_CACHE: "true", - RIVETKIT_SQLITE_OPT_READ_POOL_ENABLED: "true", RIVETKIT_SQLITE_OPT_STARTUP_PRELOAD_MAX_BYTES: DEFAULT_STARTUP_PRELOAD_MAX_BYTES.toString(), RIVETKIT_SQLITE_OPT_STARTUP_PRELOAD_FIRST_PAGE_COUNT: @@ -538,10 +500,6 @@ function defaultSqliteOptimizationEnv(): Record { DEFAULT_VFS_PAGE_CACHE_CAPACITY_PAGES.toString(), RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES: DEFAULT_VFS_PROTECTED_CACHE_PAGES.toString(), - RIVETKIT_SQLITE_OPT_READ_POOL_MAX_READERS: - DEFAULT_READ_POOL_MAX_READERS.toString(), - RIVETKIT_SQLITE_OPT_READ_POOL_IDLE_TTL_MS: - DEFAULT_READ_POOL_IDLE_TTL_MS.toString(), }; } @@ -588,33 +546,14 @@ const SQLITE_OPTIMIZATION_MATRIX_SCENARIOS: MatrixScenario[] = [ }, 
includeInImpact: true, }, - { - id: "transport-batching-only", - label: "Transport batching only", - description: - "Range reads, chunk batching, and decoded LTX cache enabled with VFS cache, preload, read-ahead, and read pool disabled.", - env: scenarioEnv({ - ...preloadDisabledEnv(), - RIVETKIT_SQLITE_OPT_READ_AHEAD_MODE: "off", - RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE: "off", - RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_CAPACITY_PAGES: "0", - RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES: "0", - RIVETKIT_SQLITE_OPT_READ_POOL_ENABLED: "false", - }), - includeInImpact: true, - }, { id: "vfs-cache-only", label: "VFS cache only", description: - "VFS page cache enabled without read-ahead, preload hints, range reads, storage decode cache, or read pool.", + "VFS page cache enabled without read-ahead or preload hints.", env: scenarioEnv({ ...preloadDisabledEnv(), RIVETKIT_SQLITE_OPT_READ_AHEAD_MODE: "off", - RIVETKIT_SQLITE_OPT_RANGE_READS: "false", - RIVETKIT_SQLITE_OPT_BATCH_CHUNK_READS: "false", - RIVETKIT_SQLITE_OPT_DECODED_LTX_CACHE: "false", - RIVETKIT_SQLITE_OPT_READ_POOL_ENABLED: "false", }), includeInImpact: true, }, @@ -622,13 +561,12 @@ const SQLITE_OPTIMIZATION_MATRIX_SCENARIOS: MatrixScenario[] = [ id: "read-ahead-no-cache", label: "Read-ahead without VFS cache", description: - "Adaptive read-ahead and range reads enabled while prefetched pages are not retained in the VFS cache.", + "Adaptive read-ahead enabled while prefetched pages are not retained in the VFS cache.", env: scenarioEnv({ ...preloadDisabledEnv(), RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_MODE: "off", RIVETKIT_SQLITE_OPT_VFS_PAGE_CACHE_CAPACITY_PAGES: "0", RIVETKIT_SQLITE_OPT_VFS_PROTECTED_CACHE_PAGES: "0", - RIVETKIT_SQLITE_OPT_READ_POOL_ENABLED: "false", }), includeInImpact: true, }, @@ -697,37 +635,6 @@ const SQLITE_OPTIMIZATION_MATRIX_SCENARIOS: MatrixScenario[] = [ env: scenarioEnv(preloadDisabledEnv()), includeInImpact: true, }, - { - id: "no-range-reads", - label: "Default minus range reads", - 
description: "Current defaults with contiguous range page reads disabled.", - env: scenarioEnv({ - RIVETKIT_SQLITE_OPT_RANGE_READS: "false", - }), - includeInImpact: true, - }, - { - id: "no-storage-read-cache", - label: "Default minus storage read cache", - description: - "Current defaults with chunk batching and decoded LTX cache disabled.", - env: scenarioEnv({ - RIVETKIT_SQLITE_OPT_BATCH_CHUNK_READS: "false", - RIVETKIT_SQLITE_OPT_DECODED_LTX_CACHE: "false", - }), - includeInImpact: true, - }, - { - id: "no-read-pool", - label: "Default minus read pool", - description: "Current defaults with the SQLite read connection pool disabled.", - env: scenarioEnv({ - RIVETKIT_SQLITE_OPT_READ_POOL_ENABLED: "false", - RIVETKIT_SQLITE_OPT_READ_POOL_MAX_READERS: "0", - RIVETKIT_SQLITE_OPT_READ_POOL_IDLE_TTL_MS: "0", - }), - includeInImpact: true, - }, ]; function usage(exitCode = 1): never { @@ -1293,55 +1200,6 @@ function scrapeVfsMetrics(text: string): VfsMetricSnapshot { }; } -function scrapeReadPoolMetrics(text: string): ReadPoolMetricSnapshot { - return { - activeReaders: metricValue(text, "sqlite_read_pool_active_readers"), - idleReaders: metricValue(text, "sqlite_read_pool_idle_readers"), - readWaitDurationSecondsSum: metricValue( - text, - "sqlite_read_pool_read_wait_duration_seconds_sum", - ), - readWaitDurationSecondsCount: metricValue( - text, - "sqlite_read_pool_read_wait_duration_seconds_count", - ), - writeWaitDurationSecondsSum: metricValue( - text, - "sqlite_read_pool_write_wait_duration_seconds_sum", - ), - writeWaitDurationSecondsCount: metricValue( - text, - "sqlite_read_pool_write_wait_duration_seconds_count", - ), - routedReadQueriesTotal: metricValue( - text, - "sqlite_read_pool_routed_read_queries_total", - ), - writeFallbackQueriesTotal: metricValue( - text, - "sqlite_read_pool_write_fallback_queries_total", - ), - manualTransactionDurationSecondsSum: metricValue( - text, - "sqlite_read_pool_manual_transaction_duration_seconds_sum", - ), - 
manualTransactionDurationSecondsCount: metricValue( - text, - "sqlite_read_pool_manual_transaction_duration_seconds_count", - ), - readerOpensTotal: metricValue(text, "sqlite_read_pool_reader_opens_total"), - readerClosesTotal: metricValue(text, "sqlite_read_pool_reader_closes_total"), - rejectedReaderMutationsTotal: metricValue( - text, - "sqlite_read_pool_rejected_reader_mutations_total", - ), - modeTransitionsTotal: metricValue( - text, - "sqlite_read_pool_mode_transitions_total", - ), - }; -} - function diffMetrics(after: T, before: T): T { return Object.fromEntries( Object.keys(after).map((key) => [ @@ -1367,25 +1225,6 @@ function emptyVfsMetrics(): VfsMetricSnapshot { }; } -function emptyReadPoolMetrics(): ReadPoolMetricSnapshot { - return { - activeReaders: 0, - idleReaders: 0, - readWaitDurationSecondsSum: 0, - readWaitDurationSecondsCount: 0, - writeWaitDurationSecondsSum: 0, - writeWaitDurationSecondsCount: 0, - routedReadQueriesTotal: 0, - writeFallbackQueriesTotal: 0, - manualTransactionDurationSecondsSum: 0, - manualTransactionDurationSecondsCount: 0, - readerOpensTotal: 0, - readerClosesTotal: 0, - rejectedReaderMutationsTotal: 0, - modeTransitionsTotal: 0, - }; -} - function writeResults(outputDir: string, document: unknown): void { mkdirSync(outputDir, { recursive: true }); writeFileSync( @@ -1400,8 +1239,8 @@ function writeSummary(outputDir: string, results: BenchmarkResult[]): void { "", "Server SQLite time only. 
Setup time, sleep delay, wake/cold-start time, and client RTT are not included.", "", - "| workload | category | size | server_ms | routed_reads | write_fallbacks | mode_transitions | reader_opens | reader_closes | get_pages | fetched_pages | cache_hits | cache_misses | rows/ops | pages |", - "| --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |", + "| workload | category | size | server_ms | get_pages | fetched_pages | cache_hits | cache_misses | rows/ops | pages |", + "| --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |", ]; for (const result of results) { const rowsOrOps = @@ -1411,7 +1250,7 @@ function writeSummary(outputDir: string, results: BenchmarkResult[]): void { ? result.main.ops : ""; lines.push( - `| ${result.workload} | ${result.category} | ${fmtBytes(result.targetBytes)} | ${result.main.ms.toFixed(1)} | ${result.readPoolMetrics.routedReadQueriesTotal} | ${result.readPoolMetrics.writeFallbackQueriesTotal} | ${result.readPoolMetrics.modeTransitionsTotal} | ${result.readPoolMetrics.readerOpensTotal} | ${result.readPoolMetrics.readerClosesTotal} | ${result.vfsMetrics.getPagesTotal} | ${result.vfsMetrics.pagesFetchedTotal} | ${result.vfsMetrics.resolvePagesCacheHitsTotal} | ${result.vfsMetrics.resolvePagesCacheMissesTotal} | ${rowsOrOps} | ${result.main.pageCount} |`, + `| ${result.workload} | ${result.category} | ${fmtBytes(result.targetBytes)} | ${result.main.ms.toFixed(1)} | ${result.vfsMetrics.getPagesTotal} | ${result.vfsMetrics.pagesFetchedTotal} | ${result.vfsMetrics.resolvePagesCacheHitsTotal} | ${result.vfsMetrics.resolvePagesCacheMissesTotal} | ${rowsOrOps} | ${result.main.pageCount} |`, ); } writeFileSync(join(outputDir, "summary.md"), `${lines.join("\n")}\n`); @@ -1517,9 +1356,6 @@ function readScenarioDocument(args: Args, scenario: MatrixScenario): MatrixScena fetchedPages: result.vfsMetrics.pagesFetchedTotal, cacheHits: result.vfsMetrics.resolvePagesCacheHitsTotal, 
cacheMisses: result.vfsMetrics.resolvePagesCacheMissesTotal, - routedReads: result.readPoolMetrics.routedReadQueriesTotal, - writeFallbacks: result.readPoolMetrics.writeFallbackQueriesTotal, - modeTransitions: result.readPoolMetrics.modeTransitionsTotal, })), }; } @@ -1551,8 +1387,8 @@ function writeMatrixSummary(outputDir: string, scenarios: MatrixScenarioReport[] "", "Each scenario runs in a fresh process so process-wide SQLite optimization flags are read once per configuration.", "", - "| scenario | workload | server_ms | delta_vs_defaults | get_pages | fetched_pages | cache_hits | cache_misses | routed_reads | write_fallbacks |", - "| --- | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |", + "| scenario | workload | server_ms | delta_vs_defaults | get_pages | fetched_pages | cache_hits | cache_misses |", + "| --- | --- | ---: | ---: | ---: | ---: | ---: | ---: |", ]; for (const scenario of scenarios) { @@ -1564,7 +1400,7 @@ function writeMatrixSummary(outputDir: string, scenarios: MatrixScenarioReport[] ? 
`${(((result.serverMs - base.serverMs) / base.serverMs) * 100).toFixed(1)}%` : ""; lines.push( - `| ${scenario.id} | ${result.workload} | ${result.serverMs.toFixed(1)} | ${delta} | ${result.getPages} | ${result.fetchedPages} | ${result.cacheHits} | ${result.cacheMisses} | ${result.routedReads} | ${result.writeFallbacks} |`, + `| ${scenario.id} | ${result.workload} | ${result.serverMs.toFixed(1)} | ${delta} | ${result.getPages} | ${result.fetchedPages} | ${result.cacheHits} | ${result.cacheMisses} |`, ); } } @@ -1798,12 +1634,8 @@ async function main(): Promise { scrapeVfsMetrics(afterMainMetricsText), emptyVfsMetrics(), ); - const readPoolMetrics = diffMetrics( - scrapeReadPoolMetrics(afterMainMetricsText), - emptyReadPoolMetrics(), - ); console.log( - ` server=${fmtMs(mainResult.ms)} pages=${mainResult.pageCount} routed_reads=${readPoolMetrics.routedReadQueriesTotal} write_fallbacks=${readPoolMetrics.writeFallbackQueriesTotal} mode_transitions=${readPoolMetrics.modeTransitionsTotal} get_pages=${vfsMetrics.getPagesTotal} fetched_pages=${vfsMetrics.pagesFetchedTotal}`, + ` server=${fmtMs(mainResult.ms)} pages=${mainResult.pageCount} get_pages=${vfsMetrics.getPagesTotal} fetched_pages=${vfsMetrics.pagesFetchedTotal}`, ); results.push({ @@ -1817,7 +1649,6 @@ async function main(): Promise { setup, main: mainResult, vfsMetrics, - readPoolMetrics, }); writeResults(outputDir, resultDocument); writeSummary(outputDir, results); @@ -1830,7 +1661,7 @@ async function main(): Promise { console.log("\nResults"); for (const result of results) { console.log( - ` ${result.workload}: server=${fmtMs(result.main.ms)} size=${fmtBytes(result.targetBytes)} routed_reads=${result.readPoolMetrics.routedReadQueriesTotal} write_fallbacks=${result.readPoolMetrics.writeFallbackQueriesTotal} mode_transitions=${result.readPoolMetrics.modeTransitionsTotal} get_pages=${result.vfsMetrics.getPagesTotal} fetched_pages=${result.vfsMetrics.pagesFetchedTotal}`, + ` ${result.workload}: 
server=${fmtMs(result.main.ms)} size=${fmtBytes(result.targetBytes)} get_pages=${result.vfsMetrics.getPagesTotal} fetched_pages=${result.vfsMetrics.pagesFetchedTotal}`, ); } console.log(`\nwrote ${join(outputDir, "results.json")}`); diff --git a/examples/kitchen-sink/src/actors/testing/mock-agentic-loop.ts b/examples/kitchen-sink/src/actors/testing/mock-agentic-loop.ts index 2802fd38fa..3e0a654cd0 100644 --- a/examples/kitchen-sink/src/actors/testing/mock-agentic-loop.ts +++ b/examples/kitchen-sink/src/actors/testing/mock-agentic-loop.ts @@ -18,11 +18,45 @@ type CountRow = { count: number; }; +type SleepStateRow = { + sleep_started_at: number; +}; + +type DebugEventRow = { + event_id: string; + name: string; + actor_id: string; + connection_id: string | null; + request_id: string | null; + details_json: string; + created_at: number; +}; + type ExpectedRequest = { requestId: string; seconds: number; }; +type DebugEventInput = { + name: string; + connectionId?: string; + requestId?: string; + details?: Record; + createdAt?: number; +}; + +type DebugContext = { + actorId: string; + db: { + execute: (query: string, ...params: unknown[]) => Promise; + }; + log: { + warn: (payload: unknown) => void; + }; +}; + +const debugSocketsByActorId = new Map>(); + function sleep(ms: number): Promise { return new Promise((resolve) => setTimeout(resolve, ms)); } @@ -64,6 +98,96 @@ function send(websocket: UniversalWebSocket, payload: unknown) { websocket.send(JSON.stringify(payload)); } +function debugPayload(row: DebugEventRow, replayed: boolean) { + return { + type: "debugEvent", + eventId: row.event_id, + name: row.name, + actorId: row.actor_id, + connectionId: row.connection_id, + requestId: row.request_id, + details: JSON.parse(row.details_json) as Record, + createdAt: row.created_at, + replayed, + }; +} + +function publishDebugEvent(row: DebugEventRow) { + const sockets = debugSocketsByActorId.get(row.actor_id); + if (!sockets) return; + + for (const socket of sockets) { + 
send(socket, debugPayload(row, false)); + } +} + +function addDebugSocket(actorId: string, websocket: UniversalWebSocket) { + const sockets = debugSocketsByActorId.get(actorId) ?? new Set(); + sockets.add(websocket); + debugSocketsByActorId.set(actorId, sockets); + + return () => { + sockets.delete(websocket); + if (sockets.size === 0) { + debugSocketsByActorId.delete(actorId); + } + }; +} + +async function recordDebugEvent(c: DebugContext, input: DebugEventInput) { + const row: DebugEventRow = { + event_id: crypto.randomUUID(), + name: input.name, + actor_id: c.actorId, + connection_id: input.connectionId ?? null, + request_id: input.requestId ?? null, + details_json: JSON.stringify(input.details ?? {}), + created_at: input.createdAt ?? Date.now(), + }; + + try { + await c.db.execute( + "INSERT INTO mock_agentic_debug_events (event_id, name, actor_id, connection_id, request_id, details_json, created_at) VALUES (?, ?, ?, ?, ?, ?, ?)", + row.event_id, + row.name, + row.actor_id, + row.connection_id, + row.request_id, + row.details_json, + row.created_at, + ); + publishDebugEvent(row); + } catch (error) { + c.log.warn({ + msg: "mock agentic debug event failed", + name: input.name, + err: error instanceof Error ? 
error.message : String(error), + }); + } +} + +async function replayDebugEvents( + database: DebugContext["db"], + websocket: UniversalWebSocket, +) { + const rows = typedRows( + await database.execute(` + SELECT event_id, name, actor_id, connection_id, request_id, details_json, created_at + FROM ( + SELECT event_id, name, actor_id, connection_id, request_id, details_json, created_at + FROM mock_agentic_debug_events + ORDER BY created_at DESC + LIMIT 200 + ) + ORDER BY created_at ASC + `), + ); + + for (const row of rows) { + send(websocket, debugPayload(row, true)); + } +} + function verifyEntryRows(rows: EntryRow[], expectedSeconds: number) { const seen = new Set(); const indexes = rows.map((row) => row.idx).sort((a, b) => a - b); @@ -153,25 +277,82 @@ export const mockAgenticLoop = actor({ await database.execute( "CREATE INDEX IF NOT EXISTS idx_mock_agentic_entries_created_at ON mock_agentic_entries(created_at)", ); + await database.execute(` + CREATE TABLE IF NOT EXISTS mock_agentic_sleep_state ( + id INTEGER PRIMARY KEY CHECK (id = 1), + sleep_started_at INTEGER NOT NULL + ) + `); + await database.execute(` + CREATE TABLE IF NOT EXISTS mock_agentic_debug_events ( + event_id TEXT PRIMARY KEY, + name TEXT NOT NULL, + actor_id TEXT NOT NULL, + connection_id TEXT, + request_id TEXT, + details_json TEXT NOT NULL, + created_at INTEGER NOT NULL + ) + `); + await database.execute( + "CREATE INDEX IF NOT EXISTS idx_mock_agentic_debug_events_created_at ON mock_agentic_debug_events(created_at)", + ); }, }), + async onWake(c) { + await recordDebugEvent(c, { + name: "onWake", + details: { + key: c.key, + name: c.name, + }, + }); + }, async onSleep(c) { const delayMs = numberFromEnv( "MOCK_AGENTIC_ON_SLEEP_DELAY_MS", DEFAULT_ON_SLEEP_DELAY_MS, ); + const sleepStartedAt = Date.now(); + await recordDebugEvent(c, { + name: "onSleepStart", + createdAt: sleepStartedAt, + details: { + delayMs, + }, + }); + await c.db.execute( + "INSERT OR REPLACE INTO mock_agentic_sleep_state 
(id, sleep_started_at) VALUES (1, ?)", + sleepStartedAt, + ); c.log.info({ msg: "mock agentic loop onSleep delay", delayMs, + sleepStartedAt, }); await sleep(delayMs); + await recordDebugEvent(c, { + name: "onSleepEnd", + details: { + delayMs, + sleepStartedAt, + elapsedMs: Date.now() - sleepStartedAt, + }, + }); }, - onRequest(_c, request) { + async onRequest(c, request) { const url = new URL(request.url); if (url.pathname === "/bypass" || url.pathname === "/request/bypass") { + const [sleepState] = typedRows( + await c.db.execute( + "SELECT sleep_started_at FROM mock_agentic_sleep_state WHERE id = 1", + ), + ); return new Response(JSON.stringify({ type: "bypass", transport: "http", + sleepStarted: sleepState !== undefined, + sleepStartedAt: sleepState?.sleep_started_at ?? null, timestamp: Date.now(), }), { headers: { @@ -185,12 +366,27 @@ export const mockAgenticLoop = actor({ onWebSocket(c, websocket: UniversalWebSocket) { const connectionId = crypto.randomUUID(); let activeInference: Promise | undefined; + const removeDebugSocket = addDebugSocket(c.actorId, websocket); send(websocket, { type: "hello", connectionId, timestamp: Date.now(), }); + void (async () => { + try { + await replayDebugEvents(c.db, websocket); + } catch (error) { + c.log.warn({ + msg: "mock agentic debug replay failed", + err: error instanceof Error ? error.message : String(error), + }); + } + await recordDebugEvent(c, { + name: "webSocketOpen", + connectionId, + }); + })(); const verify = async (requestId: string, expectedSeconds: number) => { const rows = typedRows( @@ -206,6 +402,18 @@ export const mockAgenticLoop = actor({ }; }; + const sleepStatus = async () => { + const [sleepState] = typedRows( + await c.db.execute( + "SELECT sleep_started_at FROM mock_agentic_sleep_state WHERE id = 1", + ), + ); + return { + sleepStarted: sleepState !== undefined, + sleepStartedAt: sleepState?.sleep_started_at ?? 
null, + }; + }; + const runInference = async (requestId: string, seconds: number) => { send(websocket, { type: "started", @@ -284,6 +492,7 @@ export const mockAgenticLoop = actor({ send(websocket, { type: "pong", probeId: stringValue(message.probeId, "probeId"), + ...(await sleepStatus()), timestamp: Date.now(), }); return; @@ -318,6 +527,14 @@ export const mockAgenticLoop = actor({ message.seconds, "seconds", ); + await recordDebugEvent(c, { + name: "inferenceRequested", + connectionId, + requestId, + details: { + seconds, + }, + }); const inference = runInference( requestId, seconds, @@ -342,6 +559,14 @@ export const mockAgenticLoop = actor({ } })(); }); + + websocket.addEventListener("close", () => { + removeDebugSocket(); + void recordDebugEvent(c, { + name: "webSocketClose", + connectionId, + }); + }); }, actions: { verify: async (c, requestId: string, expectedSeconds: number) => { diff --git a/examples/kitchen-sink/src/index.ts b/examples/kitchen-sink/src/index.ts index 2378b55215..c3268e9035 100644 --- a/examples/kitchen-sink/src/index.ts +++ b/examples/kitchen-sink/src/index.ts @@ -172,6 +172,8 @@ function serverlessPoolConfig() { export const registry = setup({ configurePool: serverlessPoolConfig(), serverless: { + publicToken: + process.env.RIVET_PUBLIC_TOKEN ?? process.env.RIVET_TOKEN ?? 
"dev", maxStartPayloadBytes: numberFromEnv( "RIVET_SERVERLESS_MAX_START_PAYLOAD_BYTES", 16 * 1024 * 1024, diff --git a/examples/kitchen-sink/tests/sqlite-realworld-bench.test.ts b/examples/kitchen-sink/tests/sqlite-realworld-bench.test.ts index c85505345f..0424d3e921 100644 --- a/examples/kitchen-sink/tests/sqlite-realworld-bench.test.ts +++ b/examples/kitchen-sink/tests/sqlite-realworld-bench.test.ts @@ -42,16 +42,9 @@ test("SQLite real-world benchmark includes read-mode/write-mode scenarios", () = ); assert.match(actor, /Promise\.all\(\[/); assert.match(actor, /UPDATE rw_orders SET total_cents = total_cents \+ 1/); - for (const metric of [ - "sqlite_read_pool_routed_read_queries_total", - "sqlite_read_pool_write_fallback_queries_total", - "sqlite_read_pool_mode_transitions_total", - ]) { - assert.match(runner, new RegExp(metric)); - } assert.match( runner, - /\| workload \| category \| size \| server_ms \| routed_reads \| write_fallbacks \| mode_transitions \|/, + /\| workload \| category \| size \| server_ms \| get_pages \| fetched_pages \|/, ); }); @@ -66,16 +59,12 @@ test("SQLite real-world benchmark defines an optimization impact matrix", () => for (const scenario of [ "defaults", "all-off", - "transport-batching-only", "vfs-cache-only", "read-ahead-no-cache", "cache-read-ahead-no-preload", "no-read-ahead", "no-vfs-cache", "no-preload", - "no-range-reads", - "no-storage-read-cache", - "no-read-pool", ]) { assert.match(runner, new RegExp(`id: "${scenario}"`)); } diff --git a/examples/kitchen-sink/vite.config.ts b/examples/kitchen-sink/vite.config.ts index e02ede3a0b..9a992b4039 100644 --- a/examples/kitchen-sink/vite.config.ts +++ b/examples/kitchen-sink/vite.config.ts @@ -16,4 +16,13 @@ function sqlRawPlugin(): Plugin { export default defineConfig({ plugins: [react(), sqlRawPlugin()], + server: { + proxy: { + "/api/rivet": { + target: "http://127.0.0.1:3000", + changeOrigin: true, + ws: true, + }, + }, + }, }); diff --git 
a/rivetkit-rust/packages/rivetkit-core/scripts/check-event-driven-drains.sh b/rivetkit-rust/packages/rivetkit-core/scripts/check-event-driven-drains.sh old mode 100644 new mode 100755 diff --git a/rivetkit-rust/packages/rivetkit-core/src/registry/http.rs b/rivetkit-rust/packages/rivetkit-core/src/registry/http.rs index 08085cca8b..503bd3b00f 100644 --- a/rivetkit-rust/packages/rivetkit-core/src/registry/http.rs +++ b/rivetkit-rust/packages/rivetkit-core/src/registry/http.rs @@ -610,16 +610,6 @@ pub(super) async fn build_http_request(request: HttpRequest) -> Result .with_context(|| format!("build actor request for `{}`", request.path)) } -pub(super) fn is_actor_request_path(path: &str) -> bool { - let Some(stripped) = path.strip_prefix("/request") else { - return false; - }; - if stripped.is_empty() { - return true; - } - matches!(stripped.as_bytes().first(), Some(b'/') | Some(b'?')) -} - pub(super) fn normalize_actor_request_path(path: &str) -> String { let Some(stripped) = path.strip_prefix("/request") else { return path.to_owned(); diff --git a/rivetkit-rust/packages/rivetkit-core/src/serverless.rs b/rivetkit-rust/packages/rivetkit-core/src/serverless.rs index f699a65c19..c837542f97 100644 --- a/rivetkit-rust/packages/rivetkit-core/src/serverless.rs +++ b/rivetkit-rust/packages/rivetkit-core/src/serverless.rs @@ -48,7 +48,6 @@ pub struct CoreServerlessRuntime { struct ServerlessSettings { version: u32, configured_endpoint: String, - configured_token: Option, configured_namespace: String, base_path: String, package_version: String, @@ -177,7 +176,6 @@ impl CoreServerlessRuntime { settings: Arc::new(ServerlessSettings { version: config.version, configured_endpoint: config.endpoint, - configured_token: config.token, configured_namespace: config.namespace, base_path, package_version: config.serverless_package_version, diff --git a/rivetkit-typescript/packages/rivetkit-napi/Cargo.toml b/rivetkit-typescript/packages/rivetkit-napi/Cargo.toml index a91abd06e7..1985e9c9fe 
100644 --- a/rivetkit-typescript/packages/rivetkit-napi/Cargo.toml +++ b/rivetkit-typescript/packages/rivetkit-napi/Cargo.toml @@ -20,6 +20,8 @@ anyhow.workspace = true serde.workspace = true serde_json.workspace = true tracing.workspace = true +tracing-logfmt.workspace = true +tracing-stackdriver.workspace = true tracing-subscriber.workspace = true parking_lot.workspace = true scc.workspace = true diff --git a/rivetkit-typescript/packages/rivetkit-napi/src/lib.rs b/rivetkit-typescript/packages/rivetkit-napi/src/lib.rs index 849639cfda..0973f02b34 100644 --- a/rivetkit-typescript/packages/rivetkit-napi/src/lib.rs +++ b/rivetkit-typescript/packages/rivetkit-napi/src/lib.rs @@ -15,10 +15,30 @@ use std::sync::Once; use rivet_error::RivetError as RivetTransportError; use rivetkit_core::error::public_error_status_code; +use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt}; static INIT_TRACING: Once = Once::new(); pub(crate) const BRIDGE_RIVET_ERROR_PREFIX: &str = "__RIVET_ERROR_JSON__:"; +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +enum LogFormat { + Logfmt, + Gcp, +} + +impl LogFormat { + fn from_env() -> Self { + match std::env::var("RUST_LOG_FORMAT") + .unwrap_or_default() + .to_lowercase() + .as_str() + { + "gcp" => LogFormat::Gcp, + _ => LogFormat::Logfmt, + } + } +} + #[derive(rivet_error::RivetError, serde::Serialize)] #[error( "napi", @@ -105,13 +125,34 @@ pub(crate) fn init_tracing(log_level: Option<&str>) { .or_else(|| std::env::var("RUST_LOG").ok()) .unwrap_or_else(|| "warn".to_string()); - tracing_subscriber::fmt() - .json() - .with_env_filter(tracing_subscriber::EnvFilter::new(&filter)) - .with_target(true) - .with_current_span(true) - .with_span_list(false) - .with_writer(std::io::stdout) + let log_format = LogFormat::from_env(); + + tracing_subscriber::registry() + .with(tracing_subscriber::EnvFilter::new(&filter)) + .with(match log_format { + LogFormat::Logfmt => Some( + tracing_logfmt::builder() + 
.with_span_name(env_flag("RUST_LOG_SPAN_NAME")) + .with_span_path(env_flag("RUST_LOG_SPAN_PATH")) + .with_target(env_flag("RUST_LOG_TARGET") || env_flag("RIVET_LOG_TARGET")) + .with_location(env_flag("RUST_LOG_LOCATION")) + .with_module_path(env_flag("RUST_LOG_MODULE_PATH")) + .with_ansi_color(env_flag("RUST_LOG_ANSI_COLOR")) + .layer(), + ), + LogFormat::Gcp => None, + }) + .with(match log_format { + LogFormat::Logfmt => None, + LogFormat::Gcp => Some( + tracing_stackdriver::layer() + .with_source_location(env_flag("RUST_LOG_LOCATION")), + ), + }) .init(); }); } + +fn env_flag(name: &str) -> bool { + std::env::var(name).map_or(false, |x| x == "1") +} diff --git a/rivetkit-typescript/packages/rivetkit/src/client/actor-common.ts b/rivetkit-typescript/packages/rivetkit/src/client/actor-common.ts index a108d66ef1..193e7acdc2 100644 --- a/rivetkit-typescript/packages/rivetkit/src/client/actor-common.ts +++ b/rivetkit-typescript/packages/rivetkit/src/client/actor-common.ts @@ -33,6 +33,7 @@ export type ActorActionFunction< export interface ActorGatewayOptions { bypassConnectable?: boolean; + skipReadyWait?: boolean; } export type ResolvedActorGatewayOptions = Required<ActorGatewayOptions>; @@ -41,9 +42,16 @@ export function resolveActorGatewayOptions( defaults: ActorGatewayOptions = {}, overrides?: ActorGatewayOptions, ): ResolvedActorGatewayOptions { + const bypassConnectable = + overrides?.bypassConnectable ?? + overrides?.skipReadyWait ?? + defaults.bypassConnectable ?? + defaults.skipReadyWait ?? + false; + return { - bypassConnectable: - overrides?.bypassConnectable ?? defaults.bypassConnectable ?? 
false, + bypassConnectable, + skipReadyWait: bypassConnectable, }; } diff --git a/rivetkit-typescript/packages/rivetkit/src/common/log.ts b/rivetkit-typescript/packages/rivetkit/src/common/log.ts index 25a7853155..88793147a2 100644 --- a/rivetkit-typescript/packages/rivetkit/src/common/log.ts +++ b/rivetkit-typescript/packages/rivetkit/src/common/log.ts @@ -1,4 +1,5 @@ import { + type DestinationStream, type LevelWithSilent, type Logger, pino, @@ -69,19 +70,22 @@ export function configureDefaultLogger(logLevel?: LogLevel) { configuredLogLevel = logLevel; } - baseLogger = pino({ - level: getPinoLevel(logLevel), - messageKey: "msg", - // Do not include pid/hostname in output - base: {}, - // Keep a string level in the output - formatters: { - level(_label: string, number: number) { - return { level: number }; + baseLogger = pino( + { + level: getPinoLevel(logLevel), + messageKey: "msg", + // Do not include pid/hostname in output + base: {}, + // Keep the numeric level so the logfmt sink can match Pino's levels. + formatters: { + level(_label: string, number: number) { + return { level: number }; + }, }, + timestamp: getLogTimestamp() ? stdTimeFunctions.epochTime : false, }, - timestamp: getLogTimestamp() ? 
stdTimeFunctions.epochTime : false, - }); + createLogfmtDestination(), + ); loggerCache.clear(); } @@ -117,3 +121,95 @@ export function getLogger(name = "default"): Logger { return child; } + +const PINO_LEVEL_LABELS: Record<number, string> = { + 10: "trace", + 20: "debug", + 30: "info", + 40: "warn", + 50: "error", + 60: "fatal", +}; + +function createLogfmtDestination(): DestinationStream { + return { + write(msg: string): void { + const line = formatLogfmtLine(msg); + if (typeof process !== "undefined" && process.stdout?.write) { + process.stdout.write(`${line}\n`); + } else { + console.log(line); + } + }, + }; +} + +function formatLogfmtLine(raw: string): string { + let data: Record<string, unknown>; + try { + data = JSON.parse(raw); + } catch { + return raw.trimEnd(); + } + + const parts: string[] = []; + appendLogfmtEntry(parts, "level", formatPinoLevel(data.level)); + + if (data.time !== undefined) { + appendLogfmtEntry(parts, "ts", data.time); + } + + for (const [key, value] of Object.entries(data)) { + if (key === "level" || key === "time") { + continue; + } + appendLogfmtEntry(parts, key, value); + } + + return parts.join(" "); +} + +function formatPinoLevel(level: unknown): string { + if (typeof level === "number") { + return PINO_LEVEL_LABELS[level] ?? 
level.toString(); + } + + if (typeof level === "string") { + return level.toLowerCase(); + } + + return "info"; +} + +function appendLogfmtEntry(parts: string[], key: string, value: unknown): void { + const safeKey = key.replace(/[\s="]/g, ""); + if (safeKey.length === 0) { + return; + } + + parts.push(`${safeKey}=${formatLogfmtValue(value)}`); +} + +function formatLogfmtValue(value: unknown): string { + if (typeof value === "number" || typeof value === "boolean") { + return String(value); + } + + if (value === null || value === undefined) { + return "null"; + } + + if (typeof value === "string") { + return quoteLogfmtString(value); + } + + return quoteLogfmtString(JSON.stringify(value)); +} + +function quoteLogfmtString(value: string): string { + if (!/[\s="]/.test(value)) { + return value; + } + + return `"${value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n")}"`; +} diff --git a/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-http-client.ts b/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-http-client.ts index 4220e25e0c..4783400d22 100644 --- a/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-http-client.ts +++ b/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-http-client.ts @@ -5,7 +5,10 @@ import { HEADER_RIVET_TARGET, HEADER_RIVET_TOKEN, } from "@/common/actor-router-consts"; -import type { GatewayRequestOptions } from "./driver"; +import { + shouldBypassConnectable, + type GatewayRequestOptions, +} from "./driver"; export interface HttpGatewayRequestOptions extends GatewayRequestOptions { directActorId?: string; @@ -79,7 +82,7 @@ function buildGuardHeaders( headers.set(HEADER_RIVET_TARGET, "actor"); headers.set(HEADER_RIVET_ACTOR, options.directActorId); } - if (options.bypassConnectable) { + if (shouldBypassConnectable(options)) { headers.set(HEADER_RIVET_BYPASS_CONNECTABLE, "1"); } return headers; diff --git 
a/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-websocket-client.ts b/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-websocket-client.ts index d4ffe13a0d..9442c988ae 100644 --- a/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-websocket-client.ts +++ b/rivetkit-typescript/packages/rivetkit/src/engine-client/actor-websocket-client.ts @@ -18,7 +18,10 @@ import type { ActorGatewayQuery, CrashPolicy } from "@/client/query"; import type { Encoding, UniversalWebSocket } from "@/mod"; import { encodeCborCompat, uint8ArrayToBase64 } from "@/serde"; import { combineUrlPath } from "@/utils"; -import type { GatewayRequestOptions } from "./driver"; +import { + shouldBypassConnectable, + type GatewayRequestOptions, +} from "./driver"; import { logger } from "./log"; class BufferedRemoteWebSocket implements UniversalWebSocket { @@ -269,7 +272,7 @@ export function buildActorQueryGatewayUrl( if (token !== undefined) { params.append("rvt-token", token); } - if (options.bypassConnectable) { + if (shouldBypassConnectable(options)) { params.append("rvt-bypass_connectable", "true"); } @@ -392,7 +395,7 @@ export function buildWebSocketProtocols( protocols.push(`${WS_PROTOCOL_TARGET}${target.target}`); protocols.push(`${WS_PROTOCOL_ACTOR}${target.actorId}`); } - if (options.bypassConnectable) { + if (shouldBypassConnectable(options)) { protocols.push(WS_PROTOCOL_BYPASS_CONNECTABLE); } if (params) { diff --git a/rivetkit-typescript/packages/rivetkit/src/engine-client/driver.ts b/rivetkit-typescript/packages/rivetkit/src/engine-client/driver.ts index c2c65c6264..83919c570d 100644 --- a/rivetkit-typescript/packages/rivetkit/src/engine-client/driver.ts +++ b/rivetkit-typescript/packages/rivetkit/src/engine-client/driver.ts @@ -8,6 +8,13 @@ export type GatewayTarget = { directId: string } | ActorQuery; export interface GatewayRequestOptions { bypassConnectable?: boolean; + skipReadyWait?: boolean; +} + +export function shouldBypassConnectable( + 
options: GatewayRequestOptions = {}, +): boolean { + return options.bypassConnectable === true || options.skipReadyWait === true; } export interface EngineControlClient { diff --git a/rivetkit-typescript/packages/rivetkit/src/engine-client/mod.ts b/rivetkit-typescript/packages/rivetkit/src/engine-client/mod.ts index 5bfb7dfe5a..ef6eb41aea 100644 --- a/rivetkit-typescript/packages/rivetkit/src/engine-client/mod.ts +++ b/rivetkit-typescript/packages/rivetkit/src/engine-client/mod.ts @@ -9,6 +9,7 @@ import { } from "@/common/actor-router-consts"; import { noopNext } from "@/common/utils"; import type { Actor as ApiActor } from "@/engine-api/actors"; +import { shouldBypassConnectable } from "@/engine-client/driver"; import type { ActorOutput, CreateInput, @@ -264,7 +265,7 @@ export class RemoteEngineControlClient implements EngineControlClient { ); const httpOptions = { ...options, - directActorId: options.bypassConnectable + directActorId: shouldBypassConnectable(options) ? directActorIdFromTarget(target) : undefined, }; @@ -299,7 +300,7 @@ export class RemoteEngineControlClient implements EngineControlClient { params, { ...options, - directActorId: options.bypassConnectable + directActorId: shouldBypassConnectable(options) ? 
directActorIdFromTarget(target) : undefined, }, @@ -424,7 +425,7 @@ export class RemoteEngineControlClient implements EngineControlClient { const endpoint = getEndpoint(this.#config); if ( - options.bypassConnectable && + shouldBypassConnectable(options) && directActorIdFromTarget(target) && canUseDirectBypassPath(path) ) { diff --git a/rivetkit-typescript/packages/rivetkit/src/registry/index.ts b/rivetkit-typescript/packages/rivetkit/src/registry/index.ts index 20fecb02a9..96e2e73c7c 100644 --- a/rivetkit-typescript/packages/rivetkit/src/registry/index.ts +++ b/rivetkit-typescript/packages/rivetkit/src/registry/index.ts @@ -51,6 +51,22 @@ export class Registry { this.#config = config; } + #ensureServerlessPoolConfigured(config: RegistryConfig): Promise | undefined { + if (!config.configurePool) return undefined; + + if (!this.#configureServerlessPoolPromise) { + this.#configureServerlessPoolPromise = configureServerlessPool(config).catch( + (error) => { + this.#configureServerlessPoolPromise = undefined; + throw error; + }, + ); + this.#configureServerlessPoolPromise.catch(() => {}); + } + + return this.#configureServerlessPoolPromise; + } + /** * Handle an incoming HTTP request for serverless deployments. * @@ -65,17 +81,42 @@ export class Registry { const config = this.parseConfig(); this.#printWelcome(config, "serverless"); - if (config.configurePool && !this.#configureServerlessPoolPromise) { - this.#configureServerlessPoolPromise = - configureServerlessPool(config); - } - if (!this.#runtimeServerlessPromise) { this.#runtimeServerlessPromise = buildConfiguredRegistry(config); } const { runtime, registry, serveConfig } = await this.#runtimeServerlessPromise; + const isStartRequest = isServerlessStartRequest( + request, + serveConfig.serverlessBasePath ?? "/api/rivet", + ); + const isMetadataRequest = isServerlessMetadataRequest( + request, + serveConfig.serverlessBasePath ?? 
"/api/rivet", + ); + const isEngineMetadataRequest = + request.headers.get("user-agent")?.startsWith("RivetEngine/") ?? false; + + if (isStartRequest) { + try { + await this.#ensureServerlessPoolConfigured(config); + } catch (error) { + return new Response( + JSON.stringify({ + group: "guard", + code: "service_unavailable", + message: "Serverless pool is not configured.", + metadata: null, + }), + { + status: 503, + headers: { "content-type": "application/json" }, + }, + ); + } + } + const cancelToken = runtime.createCancellationToken(); const abort = () => runtime.cancelCancellationToken(cancelToken); if (request.signal.aborted) { @@ -86,10 +127,7 @@ export class Registry { const requestBody = await request.arrayBuffer(); if ( - isServerlessStartRequest( - request, - serveConfig.serverlessBasePath ?? "/api/rivet", - ) && + isStartRequest && requestBody.byteLength > serveConfig.serverlessMaxStartPayloadBytes ) { request.signal.removeEventListener("abort", abort); @@ -202,6 +240,25 @@ export class Registry { throw err; } + if (isMetadataRequest && !isEngineMetadataRequest) { + try { + await this.#ensureServerlessPoolConfigured(config); + } catch (error) { + return new Response( + JSON.stringify({ + group: "guard", + code: "service_unavailable", + message: "Serverless pool is not configured.", + metadata: null, + }), + { + status: 503, + headers: { "content-type": "application/json" }, + }, + ); + } + } + return new Response(stream, { status: head.status, headers: head.headers, @@ -459,6 +516,14 @@ function isServerlessStartRequest(request: Request, basePath: string): boolean { return parsed.pathname === `${normalizedBase}/start`; } +function isServerlessMetadataRequest(request: Request, basePath: string): boolean { + if (request.method !== "GET") return false; + const parsed = new URL(request.url); + const normalizedBase = + basePath === "/" ? 
"" : `/${basePath.replace(/^\/+|\/+$/g, "")}`; + return parsed.pathname === `${normalizedBase}/metadata`; +} + export function setup( input: RegistryConfigInput, ): Registry { diff --git a/rivetkit-typescript/packages/rivetkit/src/serverless/configure.ts b/rivetkit-typescript/packages/rivetkit/src/serverless/configure.ts index 63baf10317..57aaa87fb1 100644 --- a/rivetkit-typescript/packages/rivetkit/src/serverless/configure.ts +++ b/rivetkit-typescript/packages/rivetkit/src/serverless/configure.ts @@ -3,66 +3,107 @@ import { getDatacenters, updateRunnerConfig, } from "@/engine-client/api-endpoints"; +import { stringifyError } from "@/common/utils"; import type { RegistryConfig } from "@/registry/config"; import { logger } from "@/registry/log"; +const DEFAULT_CONFIGURE_TIMEOUT_MS = 60_000; +const CONFIGURE_RETRY_DELAY_MS = 1_000; + +function sleep(ms: number): Promise { + return new Promise((resolve) => setTimeout(resolve, ms)); +} + +function configureTimeoutMs() { + const value = process.env.RIVET_SERVERLESS_CONFIGURE_TIMEOUT_MS; + if (value === undefined || value === "") return DEFAULT_CONFIGURE_TIMEOUT_MS; + + const parsed = Number(value); + if (!Number.isFinite(parsed) || parsed < 0) { + throw new Error("RIVET_SERVERLESS_CONFIGURE_TIMEOUT_MS must be a finite non-negative number"); + } + + return parsed; +} + export async function configureServerlessPool( config: RegistryConfig, ): Promise { logger().debug({ msg: "configuring serverless pool" }); - try { - if (!config.namespace) { - throw new Error("namespace is required for serverless configuration"); - } - if (!config.endpoint) { - throw new Error("endpoint is required for serverless configuration"); - } - if (!config.configurePool) { - throw new Error("configurePool is required for serverless configuration"); - } + const startedAt = Date.now(); + const timeoutMs = configureTimeoutMs(); + let attempts = 0; + let lastError: unknown; - const customConfig = config.configurePool; - const clientConfig = 
convertRegistryConfigToClientConfig(config); - const dcsRes = await getDatacenters(clientConfig); - const poolName = customConfig.name ?? "default"; - const headers = { - ...(config.token ? { "x-rivet-token": config.token } : {}), - ...(customConfig.headers ?? {}), - }; - const serverlessConfig = { - serverless: { - url: customConfig.url, - headers, - request_lifespan: customConfig.requestLifespan ?? 15 * 60, - drain_grace_period: customConfig.drainGracePeriod, - metadata_poll_interval: - customConfig.metadataPollInterval ?? 1000, - max_runners: 100_000, - min_runners: 0, - runners_margin: 0, - slots_per_runner: 1, - }, - metadata: customConfig.metadata ?? {}, - drain_on_version_upgrade: - customConfig.drainOnVersionUpgrade ?? true, - }; + while (Date.now() - startedAt <= timeoutMs) { + attempts += 1; + try { + if (!config.namespace) { + throw new Error("namespace is required for serverless configuration"); + } + if (!config.endpoint) { + throw new Error("endpoint is required for serverless configuration"); + } + if (!config.configurePool) { + throw new Error("configurePool is required for serverless configuration"); + } - await updateRunnerConfig(clientConfig, poolName, { - datacenters: Object.fromEntries( - dcsRes.datacenters.map((dc) => [dc.name, serverlessConfig]), - ), - }); + const customConfig = config.configurePool; + const clientConfig = convertRegistryConfigToClientConfig(config); + const dcsRes = await getDatacenters(clientConfig); + const poolName = customConfig.name ?? "default"; + const serverlessToken = config.token ?? config.publicToken; + const headers = { + ...(serverlessToken ? { "x-rivet-token": serverlessToken } : {}), + ...(customConfig.headers ?? {}), + }; + const serverlessConfig = { + serverless: { + url: customConfig.url, + headers, + request_lifespan: customConfig.requestLifespan ?? 15 * 60, + drain_grace_period: customConfig.drainGracePeriod, + metadata_poll_interval: + customConfig.metadataPollInterval ?? 
1000, + max_runners: 100_000, + min_runners: 0, + runners_margin: 0, + slots_per_runner: 1, + }, + metadata: customConfig.metadata ?? {}, + drain_on_version_upgrade: + customConfig.drainOnVersionUpgrade ?? true, + }; - logger().info({ - msg: "serverless pool configured successfully", - poolName, - namespace: config.namespace, - }); - } catch (error) { - logger().error({ - msg: "failed to configure serverless pool, validate endpoint is configured correctly then restart this process", - error, - }); + await updateRunnerConfig(clientConfig, poolName, { + datacenters: Object.fromEntries( + dcsRes.datacenters.map((dc) => [dc.name, serverlessConfig]), + ), + }); + + logger().info({ + msg: "serverless pool configured successfully", + poolName, + namespace: config.namespace, + attempts, + }); + return; + } catch (error) { + lastError = error; + logger().warn({ + msg: "serverless pool configuration attempt failed", + attempts, + error: stringifyError(error), + }); + await sleep(CONFIGURE_RETRY_DELAY_MS); + } } + + logger().error({ + msg: "failed to configure serverless pool, validate endpoint is configured correctly then restart this process", + attempts, + error: stringifyError(lastError), + }); + throw lastError; } diff --git a/rivetkit-typescript/packages/rivetkit/tests/actor-gateway-url.test.ts b/rivetkit-typescript/packages/rivetkit/tests/actor-gateway-url.test.ts index e198e3b3fb..1aec332f61 100644 --- a/rivetkit-typescript/packages/rivetkit/tests/actor-gateway-url.test.ts +++ b/rivetkit-typescript/packages/rivetkit/tests/actor-gateway-url.test.ts @@ -7,6 +7,7 @@ import { import { buildActorGatewayUrl, buildActorQueryGatewayUrl, + buildWebSocketProtocols, } from "@/engine-client/actor-websocket-client"; import { toBase64Url } from "./test-utils"; @@ -56,7 +57,30 @@ describe("gateway URL builders", () => { expect(url).not.toContain("@"); }); - test("serializes gateway bypass for query routing", () => { + test("serializes skipReadyWait for query routing", () => { + 
const url = buildActorQueryGatewayUrl( + "https://api.rivet.dev/manager", + "prod", + { + getForKey: { + name: "room", + key: ["alpha"], + }, + }, + undefined, + "/status", + undefined, + undefined, + undefined, + { skipReadyWait: true }, + ); + + expect(new URL(url).searchParams.get("rvt-bypass_connectable")).toBe( + "true", + ); + }); + + test("serializes bypassConnectable for query routing", () => { const url = buildActorQueryGatewayUrl( "https://api.rivet.dev/manager", "prod", @@ -79,6 +103,19 @@ describe("gateway URL builders", () => { ); }); + test("serializes bypassConnectable for websocket protocols", () => { + const protocols = buildWebSocketProtocols( + ClientConfigSchema.parse({ endpoint: "https://api.rivet.dev" }), + "json", + undefined, + undefined, + { target: "actor", actorId: "actor-1" }, + { bypassConnectable: true }, + ); + + expect(protocols).toContain("rivet_bypass_connectable"); + }); + test("serializes getOrCreate queries with rvt-* params", () => { const input = { hello: "world" }; const url = buildActorQueryGatewayUrl( diff --git a/website/src/content/docs/actors/lifecycle.mdx b/website/src/content/docs/actors/lifecycle.mdx index 8e0e67ba62..e565fe07ee 100644 --- a/website/src/content/docs/actors/lifecycle.mdx +++ b/website/src/content/docs/actors/lifecycle.mdx @@ -761,6 +761,12 @@ curl -X POST \ `/sleep` asks the actor to enter the normal sleep shutdown sequence. `/reschedule` asks the platform to allocate the actor again, which is useful after crashes or when you need to force a fresh placement. Both endpoints require the actor ID and namespace. +### Skip Ready Wait + +The gateway normally holds requests until the actor is ready. The actor is not ready during startup (before `onWake` finishes) or during the sleep grace period (while `onSleep` and `waitUntil` are running). Probes and readiness checks can opt out with `gateway.skipReadyWait` to reach the actor's `onRequest` or `onWebSocket` handler in either window. 
+ +See [Skip Ready Wait](/docs/clients/javascript#skip-ready-wait) on the JavaScript client page for usage. + ### Keeping the Actor Awake RivetKit gives you two primitives for holding the actor awake across background work. Both take a `Promise` and differ in how they interact with idle sleep and the grace period. diff --git a/website/src/content/docs/actors/request-handler.mdx b/website/src/content/docs/actors/request-handler.mdx index 1755f448f4..efb62d293e 100644 --- a/website/src/content/docs/actors/request-handler.mdx +++ b/website/src/content/docs/actors/request-handler.mdx @@ -249,6 +249,12 @@ The `onRequest` handler is WinterTC compliant and will work with existing librar - Does not support streaming responses & server-sent events at the moment. See the [tracking issue](https://github.com/rivet-dev/rivet/issues/3529). - `OPTIONS` requests currently are handled by Rivet and are not passed to `onRequest` +## Advanced + +### Skip Ready Wait + +Requests are normally held at the gateway until the actor is ready. Pass `gateway.skipReadyWait: true` on `handle.fetch()` to deliver immediately, including while the actor is still starting or in the [sleep grace period](/docs/actors/lifecycle#shutdown-sequence). See [Skip Ready Wait](/docs/clients/javascript#skip-ready-wait) for details. + ## API Reference - [`RequestContext`](/typedoc/interfaces/rivetkit.mod.RequestContext.html) - Context for HTTP request handlers diff --git a/website/src/content/docs/actors/websocket-handler.mdx b/website/src/content/docs/actors/websocket-handler.mdx index 5e21c07ed5..a02dce7db7 100644 --- a/website/src/content/docs/actors/websocket-handler.mdx +++ b/website/src/content/docs/actors/websocket-handler.mdx @@ -295,6 +295,10 @@ const myActor = actor({ }); ``` +### Skip Ready Wait + +Connections are normally held at the gateway until the actor is ready. 
Pass `gateway.skipReadyWait: true` on `handle.webSocket()` to connect immediately, including while the actor is still starting or in the [sleep grace period](/docs/actors/lifecycle#shutdown-sequence). See [Skip Ready Wait](/docs/clients/javascript#skip-ready-wait) for details. + ### Async Handlers The `onWebSocket` handler can be async, allowing you to perform async code before setting up event listeners: diff --git a/website/src/content/docs/clients/javascript.mdx b/website/src/content/docs/clients/javascript.mdx index 4371f7a4ef..c7cd99c169 100644 --- a/website/src/content/docs/clients/javascript.mdx +++ b/website/src/content/docs/clients/javascript.mdx @@ -253,6 +253,31 @@ https://namespace:token@api.rivet.dev You can also pass the endpoint without auth and provide `RIVET_NAMESPACE` and `RIVET_TOKEN` separately. For serverless deployments, use your app's `/api/rivet` URL. See [Endpoints](/docs/general/endpoints#url-auth-syntax) for details. +## Advanced + +### Skip Ready Wait + +Requests are normally held at the gateway until the actor is ready to accept traffic. An actor is not ready while it's still starting (before `onWake` finishes) or while it's in the [sleep grace period](/docs/actors/lifecycle#shutdown-sequence) (running `onSleep`, `waitUntil`, and pending disconnects). 
+ +Pass `gateway.skipReadyWait: true` on the [low-level HTTP and WebSocket APIs](#low-level-http--websocket) to deliver immediately and reach the actor's `onRequest` / `onWebSocket` handler in either window: + +```ts @nocheck +import { createClient } from "rivetkit/client"; + +const client = createClient(); +const handle = client.chatRoom.getOrCreate(["general"]); + +const response = await handle.fetch("/healthz", { + gateway: { skipReadyWait: true }, +}); + +const ws = await handle.webSocket("probe", undefined, { + gateway: { skipReadyWait: true }, +}); +``` + +Requests still return a transient `actor.stopping` lifecycle error (`{"group":"actor","code":"stopping","message":"Actor is stopping."}`) if the actor has fully stopped, i.e. the sleep grace period has ended but it has not yet restarted. Retry once the actor is available again. + ## API Reference **Package:** [rivetkit](https://www.npmjs.com/package/rivetkit)
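Review note: a minimal sketch of how the `skipReadyWait` alias introduced in this change appears to resolve alongside `bypassConnectable`. The two helpers are re-declared locally so the block runs on its own; they mirror `resolveActorGatewayOptions` and `shouldBypassConnectable` from the hunks above, and the names `GatewayOptions`, `resolveGatewayOptions`, and `shouldBypass` are illustrative stand-ins rather than rivetkit exports.

```typescript
// Local re-declaration of the option shape from the diff (not imported from rivetkit).
interface GatewayOptions {
  bypassConnectable?: boolean;
  skipReadyWait?: boolean;
}

// Mirrors resolveActorGatewayOptions: overrides are consulted before defaults,
// and within each source bypassConnectable is consulted before skipReadyWait.
function resolveGatewayOptions(
  defaults: GatewayOptions = {},
  overrides?: GatewayOptions,
): Required<GatewayOptions> {
  const bypass =
    overrides?.bypassConnectable ??
    overrides?.skipReadyWait ??
    defaults.bypassConnectable ??
    defaults.skipReadyWait ??
    false;
  // Both fields carry the same resolved value.
  return { bypassConnectable: bypass, skipReadyWait: bypass };
}

// Mirrors shouldBypassConnectable: either flag set to true enables the bypass.
function shouldBypass(options: GatewayOptions = {}): boolean {
  return options.bypassConnectable === true || options.skipReadyWait === true;
}

// Illustrative calls only.
const viaAlias = resolveGatewayOptions({}, { skipReadyWait: true });
const viaDefault = resolveGatewayOptions({ skipReadyWait: true });
const explicitOff = resolveGatewayOptions(
  {},
  { bypassConnectable: false, skipReadyWait: true },
);
console.log(viaAlias, viaDefault, explicitOff, shouldBypass({ skipReadyWait: true }));
```

One consequence of the `??` chain worth checking in review: an explicit `bypassConnectable: false` takes precedence over `skipReadyWait: true` in the same object, because `false` is not nullish. `shouldBypassConnectable`, by contrast, treats either flag being `true` as enabling the bypass, so the two helpers can disagree on that input.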