feat(api): track Durable Object instances with storage usage #1523
Merged
Rename workspace ID from 'tab-manager' to 'epicenter.tab-manager' to match the documented epicenter.<app> naming convention used by other workspaces.
Exposes ctx.storage.deleteAll() as an RPC method for cleaning up orphaned or renamed Durable Object rooms.
- Add durableObjectInstance table with composite PK (userId, doType, resourceName)
- Add unique index on doName, index on userId
- Add durableObjectInstanceRelations and update userRelations
- Generated migration: 0001_striped_silverclaw.sql
- Modify sync() and getDoc() to return { diff/data, storageBytes } via ctx.storage.sql.databaseSize
- Add afterResponse queue pattern in DB middleware to drain upserts before client.end()
- Add upsertDoInstance helper with INSERT ON CONFLICT for fire-and-forget tracking
- Update all 4 workspace/document route handlers to destructure new RPC shapes and push upserts
- WebSocket upgrades track lastAccessedAt only; HTTP paths include storageBytes
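As a rough illustration of the new RPC shape in the list above, the sketch below returns `storageBytes` alongside the payload and destructures it on the caller side. `StorageCtx` and `SketchRoom` are hypothetical stand-ins, and the signature of `sync()` is an assumption; only the `ctx.storage.sql.databaseSize` read and the `{ diff, storageBytes }` shape come from the commit message.

```typescript
// Minimal structural stand-in for the Durable Object context; only the
// `storage.sql.databaseSize` path mirrors what the PR reads.
interface StorageCtx {
  storage: { sql: { databaseSize: number } };
}

// Hypothetical room class; the real sync() computes a diff from document state.
class SketchRoom {
  private readonly ctx: StorageCtx;

  constructor(ctx: StorageCtx) {
    this.ctx = ctx;
  }

  sync(clientState: Uint8Array): { diff: Uint8Array; storageBytes: number } {
    const diff = clientState; // placeholder for the real diff computation
    // Reading the size is a property access inside the DO, so it rides
    // along with an RPC that was already being made.
    return { diff, storageBytes: this.ctx.storage.sql.databaseSize };
  }
}

// Route-handler side: destructure the new shape and pass the size to tracking.
const room = new SketchRoom({ storage: { sql: { databaseSize: 4096 } } });
const { diff, storageBytes } = room.sync(new Uint8Array([1, 2, 3]));
```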
Add review section with summary, deviations, and follow-up work.
# Conflicts: # apps/tab-manager/src/lib/workspace.ts
Update upsert doName constructions to include the type segment,
matching the `user:{userId}:{type}:{name}` convention.
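A minimal sketch of that convention, assuming a small builder function. `buildDoName` is a hypothetical name used only for illustration; the format string and the two `DoType` values are the parts taken from this PR.

```typescript
// Hypothetical helper; only the `user:{userId}:{type}:{name}` format
// and the 'workspace' | 'document' union come from the PR.
type DoType = 'workspace' | 'document';

function buildDoName(userId: string, type: DoType, name: string): string {
  return `user:${userId}:${type}:${name}`;
}

const workspaceDoName = buildDoName('u1', 'workspace', 'epicenter.tab-manager');
const documentDoName = buildDoName('u1', 'document', 'notes');
```

Building the name in one place keeps the upsert key and the stub lookup from drifting apart.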
…quest lifecycle Remove cleanup parameter from drain() — callers chain .then() instead. Drop explicit Promise<unknown> return type from upsertDoInstance (inferred). Add JSDoc explaining the unknown typing contract and step-by-step comments documenting the pg.Client lifetime through waitUntil.
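The `drain()` contract described in this commit could look roughly like the sketch below. The names `createAfterResponseQueue`, `push`, and `drain` come from the PR; the function body, `fakeClient`, and `fakeUpsert` are assumptions made for illustration and are not the actual implementation.

```typescript
// Sketch only: the body is an assumption about how such a queue could be written.
function createAfterResponseQueue() {
  const pending: Promise<unknown>[] = [];
  return {
    /** Track a fire-and-forget promise; its resolved value is never inspected. */
    push(promise: Promise<unknown>): void {
      pending.push(promise);
    },
    /** Wait for every promise queued so far to settle; rejections do not throw. */
    async drain(): Promise<void> {
      await Promise.allSettled(pending);
    },
  };
}

// Stand-ins for the pg client and the upsert helper (both hypothetical).
const log: string[] = [];
const fakeClient = {
  end: async (): Promise<void> => {
    log.push('client.end');
  },
};
const fakeUpsert = async (doName: string): Promise<void> => {
  log.push(`upsert ${doName}`);
};

const queue = createAfterResponseQueue();
// .catch() keeps a tracking failure from surfacing as an unhandled rejection.
queue.push(fakeUpsert('user:u1:workspace:notes').catch(() => {}));
queue.push(Promise.reject(new Error('db down')).catch(() => {}));

// Cleanup is chained by the caller; drain() itself takes no parameters.
const done = queue.drain().then(() => fakeClient.end());
```

In a Worker, the chained promise would presumably be handed to `waitUntil` so the connection stays open until the queued upserts settle.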
Add exported DoType discriminator to schema and apply $type<DoType>() branding on the doType column. Replace uniqueIndex with .unique() on doName (simpler Drizzle idiom). Regenerate migration and update spec to match simplified drain() API.
The composite PK (userId, doType, resourceName) already starts with userId, so Postgres uses it for any userId prefix query. The separate index was dead weight — duplicate B-tree costing writes for zero query benefit.
doName already encodes userId + doType + resourceName, making the composite PK redundant. Single-column PK simplifies the upsert conflict target from 3 columns to 1. userId index added for FK cascade performance and user-scoped queries.
Covers the afterResponse queue pattern, waitUntil lifecycle, and why Promise<unknown> is the right fire-and-forget contract. Uses real code from the Epicenter API.
Why this exists
Every Epicenter user gets their own Durable Objects—workspace rooms for structured metadata, document rooms for content with version history. Until now, the API had no visibility into which DOs exist per user or how much storage they consume. We need this for a user dashboard ("here are your workspaces and their sizes") and eventually for usage-based billing.
The challenge: we can't query Cloudflare externally for a list of DO instances or their storage. The only way to measure storage is from inside the DO via `ctx.storage.sql.databaseSize`. So we piggyback on existing RPC calls that are already happening.
How it works
The Worker already calls `stub.sync()` and `stub.getDoc()` on every request. We added `storageBytes` to the return value (read from `ctx.storage.sql.databaseSize` inside the DO, at zero extra cost) and fire a non-blocking upsert to Postgres via the `afterResponse` queue. The upsert completes before the DB connection closes but doesn't block the HTTP response to the client.
The `afterResponse` pattern
`createAfterResponseQueue()` encapsulates the promise collection with `push()` and `drain()` methods. `drain()` settles all queued promises via `Promise.allSettled`, and cleanup (closing the pg connection) is chained by the caller via `.then()`; `drain()` itself takes no parameters. The `Promise<unknown>` typing is the semantic contract for fire-and-forget: we track promises to completion but never inspect what they resolve to.
DO naming convention alignment
This PR builds on the naming convention change from #1520 (`user:{userId}:{type}:{name}`). The 6 upsert calls in route handlers construct `doName` with the type segment, matching `getWorkspaceStub`/`getDocumentStub`.
Schema
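A plain-TypeScript sketch of the row shape and the upsert semantics, using the column names from this description. This is not the Drizzle schema: the in-memory `Map` stands in for Postgres, `applyUpsert` and `UpsertInput` are hypothetical names, and keeping the previous measurement on WebSocket upgrades is an assumption based on "update `lastAccessedAt` only".

```typescript
type DoType = 'workspace' | 'document';

interface DurableObjectInstanceRow {
  doName: string; // primary key: user:{userId}:{doType}:{resourceName}
  userId: string;
  doType: DoType;
  resourceName: string;
  storageBytes: number | null; // null until an HTTP path measures it
  storageMeasuredAt: Date | null; // when storageBytes was last measured
  lastAccessedAt: Date;
}

interface UpsertInput {
  userId: string;
  doType: DoType;
  resourceName: string;
  storageBytes?: number; // omitted on WebSocket upgrades
}

// In-memory stand-in for INSERT ... ON CONFLICT keyed on doName.
function applyUpsert(
  table: Map<string, DurableObjectInstanceRow>,
  input: UpsertInput,
  now: Date,
): DurableObjectInstanceRow {
  const doName = `user:${input.userId}:${input.doType}:${input.resourceName}`;
  const existing = table.get(doName);
  const bytes = input.storageBytes;
  const row: DurableObjectInstanceRow = {
    doName,
    userId: input.userId,
    doType: input.doType,
    resourceName: input.resourceName,
    // Keep the previous measurement when this access carried none.
    storageBytes: bytes !== undefined ? bytes : (existing?.storageBytes ?? null),
    storageMeasuredAt: bytes !== undefined ? now : (existing?.storageMeasuredAt ?? null),
    lastAccessedAt: now,
  };
  table.set(doName, row);
  return row;
}

const table = new Map<string, DurableObjectInstanceRow>();
const t1 = new Date('2024-01-01T00:00:00Z');
const t2 = new Date('2024-01-02T00:00:00Z');
const target = { userId: 'u1', doType: 'workspace' as const, resourceName: 'epicenter.tab-manager' };

applyUpsert(table, { ...target, storageBytes: 8192 }, t1); // HTTP path
const afterWs = applyUpsert(table, target, t2); // WebSocket upgrade
```

After the second call, `lastAccessedAt` is fresh while `storageMeasuredAt` still points at the earlier measurement, which is the case the separate timestamp exists to expose.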
Design decisions:
- `doName` as primary key: `doName` = `user:{userId}:{doType}:{resourceName}`, so it already encodes the full identity. A composite PK on the decomposed columns was redundant (two indexes for the same logical key). A single-column PK simplifies the upsert conflict target from 3 columns to 1.
- `userId` index: needed for FK cascade delete performance and "list all DOs for user X" queries. The PK on `doName` can't serve prefix queries on `userId`.
- `doType` and `resourceName` as data columns: derivable from `doName`, but kept for query convenience (avoids string parsing).
- `DoType` branded union: `type DoType = 'workspace' | 'document'` with `$type<DoType>()` on the column for compile-time safety.
- `storageBytes` nullable: WebSocket upgrades update `lastAccessedAt` only (no RPC to measure storage).
- `storageMeasuredAt`: distinguishes "never measured" from "measured at time T", since active WebSocket traffic can make `lastAccessedAt` fresh while `storageBytes` is stale.
- `.catch()` on upsert means a DB failure doesn't break sync. This is a resource registry, not a billing authority.
Also in this PR
- Workspace ID standardization: `tab-manager` → `epicenter.tab-manager` (clean break, no migration; local-first clients re-sync to the new DO).
- `deleteStorage()` RPC: New method on `BaseSyncRoom` for cleanup of renamed/orphaned DOs.
- Technical article: `docs/articles/piggyback-storage-tracking-on-existing-rpcs.md` documents the `afterResponse` queue pattern and the piggybacking approach for broader reference.
What this is NOT
This is not a billing system. For billing you'd need time-series storage measurements, request/connection counting, and AI token tracking—all separate append-only tables. This is a v1 resource registry for a user dashboard.