Skip to content

Internalize heap-size tracking into Tsavorite#1712

Open
badrishc wants to merge 25 commits intodevfrom
badrishc/replace-subscribe-eviction
Open

Internalize heap-size tracking into Tsavorite#1712
badrishc wants to merge 25 commits intodevfrom
badrishc/replace-subscribe-eviction

Conversation

@badrishc
Copy link
Copy Markdown
Collaborator

@badrishc badrishc commented Apr 17, 2026

Summary

Replaces Garnet's heavyweight SubscribeEvictions observer pattern with Tsavorite-internal heap-size accounting. Previously, Garnet wired LogSizeTracker as an IObserver<ITsavoriteScanIterator> via SubscribeEvictions, which materialized a full page iterator on every eviction to sum heap sizes. Session functions (ISessionFunctions) were responsible for creation-site, destruction-site, and in-place-mutation heap tracking, requiring every ISF implementation to manually call cacheSizeTracker.AddHeapSize() in multiple callbacks — fragile and error-prone.

Now, Tsavorite owns nearly the complete heap-size lifecycle internally via logSizeTracker:

Accounting Who How
+key at record creation Tsavorite CAS success sites (InternalRMW/Upsert/Delete)
+value at record creation Tsavorite CAS success sites, after Post* callbacks populate the value
−value at delete Tsavorite ObjectAllocatorImpl.OnDispose(Deleted), before tombstone is set
−valueObject at CopyUpdate source clear Tsavorite Internal CopyUpdated path + IHeapObject.Dispose()
−(key+value) at eviction Tsavorite EvictRecordsInRange: full heap for non-tombstoned, key-only for tombstoned
±delta for in-place mutations ISF InPlaceUpdater/InPlaceWriter pre/post measurement (guarded by ValueIsInline bit check)

IRecordTriggers callbacks (OnDispose, OnEvict, OnFlush, OnDiskRead, OnDisposeDiskRecord) remain available for app-level resource cleanup (e.g. IDisposable.Dispose on value objects holding external resources). All gate properties default to false via DIM. GarnetRecordTriggers is now an empty struct.

Key Changes

Tsavorite core (libs/storage/Tsavorite/cs/src/core/)

  • Heap accounting at CAS success sites: +key overflow and +value heap after Post* callbacks in InternalRMW, InternalUpsert, InternalDelete, and ReinitializeExpiredRecord
  • Heap accounting at delete: ObjectAllocatorImpl.OnDispose(Deleted) decrements value heap via logSizeTracker before ClearHeapFields; tombstone set AFTER OnDispose so GetValueHeapMemorySize reads pre-tombstone state
  • Heap accounting at eviction: EvictRecordsInRange decrements CalculateHeapMemorySize (key+value) for non-tombstoned records, key-only for tombstoned; visits sealed records (not just valid ones); runs whenever logSizeTracker is set OR CallOnEvict is true
  • CopyUpdated internalized: value-object heap decremented + IHeapObject.Dispose() called internally in InternalRMW post-CAS block, no trigger involvement
  • WrongType fix: SessionFunctionsWrapper.InPlaceUpdater no longer sets tombstone on WrongType — the record stays intact
  • DiskLogRecord cleanup: removed objectDisposer lambda field; callers use OnDisposeDiskRecord trigger then Dispose()
  • IRecordTriggers simplified: all gate properties (CallOnFlush, CallOnEvict, CallOnDiskRead) default to false via DIM; OnDisposeDiskRecord is a DIM; CallOnEvict simplified to parameterless property
  • Sizing fix: LogRecord.GetValueHeapMemorySize uses HeapMemorySize (includes .NET byte array overhead) instead of TotalSize, consistent with CalculateHeapMemorySize

Garnet server (libs/server/)

  • GarnetRecordTriggers: empty struct — removed cacheSizeTracker field, constructor, all heap-accounting code
  • CacheSizeTracker.Initialize: uses SetLogSizeTracker instead of SubscribeEvictions
  • All ISF Post* callbacks: removed +value tracking (now Tsavorite-internal)
  • MainStore InPlaceUpdater: retains ±delta tracking with ValueIsInline bit-check guard (skips GetValueHeapMemorySize for inline values)
  • ObjectStore/UnifiedStore InPlaceUpdater: removed cacheSizeTracker delta calls (Tsavorite handles via pre/post)
  • Cluster migration/replication: OnDisposeDiskRecord + Dispose() with IsSet guard against double-trigger

Tests (libs/storage/Tsavorite/cs/test/)

  • RecordLifecycleTests.cs: 12 new tests covering OnDispose (per reason), OnDisposeDiskRecord, OnEvict, OnFlush/OnDiskRead gating, read-cache eviction, cross-lifecycle double-dispose regression
  • DeleteDisposeTests.cs: updated for removed OnDisposeValueObject

Issues Fixed

  • Heap tracker going negative on RMW expire paths (ExpireAndStop/ExpireAndResume) for object and string records
  • Heap tracker going negative for VectorSet records (missing creation-site tracking in VectorSessionFunctions)
  • Heap tracker going negative for HLL dense overflow records (TotalSize vs HeapMemorySize mismatch — 24-byte .NET array overhead)
  • Key overflow never tracked for session-function-created records (only TryCopyToTail/ReadCache tracked keys)
  • Sealed CopyUpdate source records skipped by eviction, leaking overflow heap
  • WrongType incorrectly tombstoning records (destroying keys on type-mismatch errors)
  • Cluster migration double-triggering OnDisposeDiskRecord
  • CopyUpdated accounting requiring trigger implementers to know partial-clear nuance

Future TODO

The only residual subtlety worth flagging (and pre-existing to this PR) is that a TryCopyToTail produces two records sharing the same IHeapObject via TryCopyFrom. If a subsequent in-place RMW on the dest mutates that shared object (growing from X to Y), CalculateHeapMemorySize at source-evict time will return the current Y, not the originally-added X — so the source's eviction under-/over-decrement scales with post-copy mutation. The old observer-iterator path reads the same field at the same time, so it has exactly the same behavior.

Copilot AI review requested due to automatic review settings April 17, 2026 03:20
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 7531edc to 4311e07 Compare April 17, 2026 03:26
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates Garnet heap-size accounting on page eviction from the SubscribeEvictions iterator/observer scan path to Tsavorite’s per-record IRecordTriggers.OnEvict callback, while also threading an EvictionSource (main log vs read cache) through the eviction APIs.

Changes:

  • Introduces EvictionSource { MainLog, ReadCache } and threads it through IRecordTriggers.OnEvict, IStoreFunctions.OnEvict, and IAllocator.EvictRecordsInRange.
  • Adds AllocatorSettings.IsReadCache / AllocatorBase.IsReadCache to tag eviction callbacks with the correct source.
  • Updates Garnet CacheSizeTracker wiring to use SetLogSizeTracker (fast-path size tracking) and moves eviction decrements to GarnetRecordTriggers.OnEvict.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
libs/storage/Tsavorite/cs/src/core/Index/Tsavorite/Tsavorite.cs Marks the read-cache allocator via AllocatorSettings.IsReadCache = true.
libs/storage/Tsavorite/cs/src/core/Index/StoreFunctions/StoreFunctions.cs Updates store-functions eviction forwarding to include EvictionSource.
libs/storage/Tsavorite/cs/src/core/Index/StoreFunctions/IStoreFunctions.cs Updates OnEvict signature to include EvictionSource.
libs/storage/Tsavorite/cs/src/core/Index/StoreFunctions/IRecordTriggers.cs Adds EvictionSource enum and updates IRecordTriggers.OnEvict signature/docs.
libs/storage/Tsavorite/cs/src/core/Allocator/TsavoriteLogAllocatorImpl.cs Switches allocator base ctor usage to pass AllocatorSettings directly.
libs/storage/Tsavorite/cs/src/core/Allocator/TsavoriteLogAllocator.cs Updates EvictRecordsInRange signature to include EvictionSource.
libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteAllocatorImpl.cs Switches allocator base ctor usage to pass AllocatorSettings directly.
libs/storage/Tsavorite/cs/src/core/Allocator/SpanByteAllocator.cs Updates EvictRecordsInRange signature to include EvictionSource.
libs/storage/Tsavorite/cs/src/core/Allocator/ObjectAllocatorImpl.cs Threads EvictionSource into per-record eviction callbacks to storeFunctions.OnEvict.
libs/storage/Tsavorite/cs/src/core/Allocator/ObjectAllocator.cs Updates wrapper to pass EvictionSource through EvictRecordsInRange.
libs/storage/Tsavorite/cs/src/core/Allocator/IAllocator.cs Updates EvictRecordsInRange contract to include EvictionSource.
libs/storage/Tsavorite/cs/src/core/Allocator/AllocatorSettings.cs Adds IsReadCache flag for allocator tagging.
libs/storage/Tsavorite/cs/src/core/Allocator/AllocatorBase.cs Stores IsReadCache, passes EvictionSource into eviction calls, and updates recovery eviction path.
libs/server/Storage/SizeTracker/CacheSizeTracker.cs Replaces SubscribeEvictions with SetLogSizeTracker for internal size tracking.
libs/server/Storage/Functions/GarnetRecordTriggers.cs Enables CallOnEvict and implements OnEvict(ref LogRecord, EvictionSource) heap-size decrement routing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread libs/server/Storage/Functions/GarnetRecordTriggers.cs Outdated
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch 3 times, most recently from 7c62bf1 to 695d6ab Compare April 17, 2026 05:25
… tracking

Garnet's `CacheSizeTracker` used `LogAccessor.SubscribeEvictions` to drive a
`LogSizeTracker.OnNext` observer that, on every page eviction, allocated a
scan iterator and walked the page to sum `MemoryUtils.CalculateHeapMemorySize`
over each non-null / non-closed record. This is heavyweight for a hot path
(buffer-pool allocation, iterator bookkeeping, epoch resume/suspend, and a
virtual dispatch per page).

This PR migrates the per-page heap-size decrement onto the per-record
`IRecordTriggers.OnEvict` hook introduced in #1695, which the object
allocator already walks during `EvictRecordsInRange`. That collapses the
"scan iterator + observer + sum" path into a single per-record callback
directly on the record we would have visited anyway.

### Tsavorite changes

- Add `EvictionSource { MainLog, ReadCache }` and thread it through
  `IRecordTriggers.OnEvict`, `IStoreFunctions.OnEvict`, and
  `IAllocator.EvictRecordsInRange`. Garnet uses this to route decrements to
  the correct counter (`AddHeapSize` vs `AddReadCacheHeapSize`).
- Change `IRecordTriggers.CallOnEvict` / `IStoreFunctions.CallOnEvict` from
  a property to `CallOnEvict(EvictionSource)` so the allocator can skip the
  per-record eviction walk entirely when the application has no work on
  that side (e.g. a read-cache-only budget should not force walks of the
  main-log allocator).
- Add `AllocatorSettings.IsReadCache` and `AllocatorBase.IsReadCache`, set
  to `true` by `Tsavorite.cs` when constructing the read-cache allocator.
  This is the cleanest way to distinguish the two allocators at
  `OnPagesClosedWorker` time without relying on the `evictCallback`
  sentinel, and is used to pass the correct `EvictionSource` to `CallOnEvict`.
- `AllocatorBase.EvictPageForRecovery` routes through the per-record
  `EvictRecordsInRange` when `storeFunctions.CallOnEvict(source)` is set.
  The legacy `MemoryPageScan(observer)` path is preserved as a fallback
  for consumers that still use `SubscribeEvictions`.
- Collapse `AllocatorBase`'s constructor to accept `AllocatorSettings`
  directly instead of unpacking individual fields at each concrete
  allocator (`ObjectAllocatorImpl`, `SpanByteAllocatorImpl`,
  `TsavoriteLogAllocatorImpl`).
- Tighten `ObjectAllocatorImpl.EvictRecordsInRange` to match
  `ObjectScanIterator`'s single-page invariant: clip `stopAddress` to the
  start page, bail on `offset == 0`, and document that both callers
  (`OnPagesClosedWorker`, `EvictPageForRecovery`) hand single-page ranges.

### Garnet changes

- `GarnetRecordTriggers.CallOnEvict(EvictionSource)` returns true only for
  the sides that are actually configured (`mainLogTracker` and/or
  `readCacheTracker` non-null), avoiding pointless per-record walks on the
  untracked allocator.
- `GarnetRecordTriggers.OnEvict(ref LogRecord, EvictionSource)` computes
  `MemoryUtils.CalculateHeapMemorySize(in logRecord)` and dispatches to
  `AddHeapSize(-size)` or `AddReadCacheHeapSize(-size)` based on the
  source. This goes through the standard asserting path — the counter must
  never undershoot zero.
- `CacheSizeTracker.Initialize` replaces the two `SubscribeEvictions`
  calls with `SetLogSizeTracker` calls so the fast-path size tracking
  during `TryCopyToTail`, `TryCopyToReadCache`, and object-page growth
  (`UpdateSize`, `IncrementSize`) continues to work unchanged.

### MainStore heap-tracking fix (root cause of negative counter)

Routing the eviction decrement through the asserting `AddHeapSize` path
surfaced a pre-existing gap: `MainStore` record-creation paths never
emitted a positive heap-size bump when a record's key or value spilled to
overflow (large inline field backed by a heap-allocated byte array). The
legacy observer path masked this silently because `OnNext` did a raw
decrement with no assertion — the counter quietly went negative on every
HLL sparse→dense transition (≈-12 KB per key) and accumulated drift
elsewhere.

`MainStore` now emits balanced heap accounting at all create/update/delete
entry points that go through a typed function callback rather than the
shared `SessionFunctionsUtils` writer (which already tracks deltas):

- `RMWMethods.PostInitialUpdater`:
  `+logRecord.CalculateHeapMemorySize()` for the freshly-created record.
- `RMWMethods.PostCopyUpdater`: `+dstLogRecord.CalculateHeapMemorySize()`
  for the new record only. The source is not subtracted — the
  `TSourceLogRecord` may be an in-memory main-log record (whose heap is
  tracked in `mainLogTracker` and will leak upward by a bounded sealed-
  source amount, parity with `ObjectStore`'s `ClearSource=false` branch
  and the legacy observer path), a read-cache record (heap lives in
  `readCacheTracker`, unrelated to this write), or a pending-IO
  `DiskLogRecord` (heap never counted in any tracker). Subtracting
  unconditionally would undercount in the last two cases and drive the
  counter negative.
- `RMWMethods.InPlaceUpdater`: `GetValueHeapMemorySize()` pre/post delta
  on `Succeeded`, so value-heap changes (e.g. APPEND, SETRANGE, or any
  path that triggers `ReallocateValueOverflow` /
  `ConvertInlineToOverflow` / `ConvertOverflowToInline`) are tracked.
- `UpsertMethods.PostInitialWriter` (all three overloads):
  `+logRecord.CalculateHeapMemorySize()` for SET-style inserts that
  create a new record through the Upsert path.
- `DeleteMethods.InPlaceDeleter`:
  `-logRecord.CalculateHeapMemorySize()` before the wrapper sets
  Tombstone. After tombstone is set, `CalculateHeapMemorySize` short-
  circuits to zero and `EvictRecordsInRange` skips the record, so
  without this decrement the creation-side increments would leak for
  every DEL of an overflow record.

The `HyperLogLogPFADD_LTM` suite is the regression that first surfaced
this — HLL sparse→dense goes through CopyUpdater and allocates ≈12 KB of
overflow for the dense representation. With the fix the counter stays
balanced and the assertion `heapSize.Total >= 0` holds throughout.

### Parity

Behavior at every record state encountered during page eviction is
bit-for-bit identical to the prior `SubscribeEvictions` path:

| Record state                                     | Old path (iterator)                | New path (per-record)        | Delta |
| ------------------------------------------------ | ---------------------------------- | ---------------------------- | ----- |
| Valid, !Sealed, !Tombstone, !IsNull              | yielded → -CalculateHeapMemorySize | OnEvict → -Calculate...      | same  |
| `IsNull`                                         | skipped by iterator                | skipped by filter            | 0     |
| `SkipOnScan` (Invalid or Sealed)                 | skipped by iterator                | skipped by filter            | 0     |
| `Tombstone` (post-delete, already ValueIsInline) | yielded → 0 via `!Info.Tombstone`  | skipped by filter → 0        | 0     |

### Hot-path cost

Fast paths (SET / GET / INCR on inline-sized values) pay only a few
aggressive-inlined inline-bit checks; `GetValueHeapMemorySize` on an
inline record early-returns 0 before any tracker call, and a
`heap != 0` guard skips the `AddHeapSize` dispatch entirely. Tracker
work occurs only when a record genuinely created, resized, or freed
an overflow/object allocation — which was already a heavyweight event.

### Testing

All on net10.0 Debug:

- `HyperLogLogTests` (incl. `HyperLogLogPFADD_LTM{32,4096}`,
  `HyperLogLogTestPFMERGE_LTM_*`) ✓
- `CacheSizeTrackerTests` ✓
- `RespListTests` (incl. `ListPushPopStressTest` ×10),
  `RespHashTests`, `RespSetTests`, `RespSortedSetTests`,
  `RespEtagTests`, `RespBitmapTests`, `RespTests.Set*`, `RespTests.Del*`:
  517/517 ✓
- Clean build on both `Garnet.slnx` and `Tsavorite.slnx` (0 warnings,
  0 errors), `dotnet format --verify-no-changes` clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 695d6ab to 9e7454c Compare April 17, 2026 06:47
…yUpdater

For in-memory CopyUpdate, Tsavorite's HeapObjectBase.CacheSerializedObjectData
cloned src.ValueObject into dst.ValueObject and Garnet's PostCopyUpdater then
cloned again, overwriting the first clone. Every ObjectStore CopyUpdate was
paying for two Clone() calls and one immediate GC.

Fix by making CacheSerializedObjectData the single point of cloning for all
sources (memory and pending-IO DiskLogRecord), and stripping the redundant
clone in Garnet's PostCopyUpdater.

Tsavorite changes (IHeapObject / HeapObjectBase / InternalRMW):
* Drop the unused 'ref LogRecord srcLogRecord' parameter from
  CacheSerializedObjectData; its only uses were an identity assert and a
  'srcLogRecord.ValueObject'-as-'this' redirect.
* Add an explicit 'bool srcIsOnMemoryLog' parameter so the method can safely
  run for DiskLogRecord sources (which cannot be expressed as ref LogRecord).
  When false, perform the clone and return immediately - skipping the
  serialization state machine, which is meaningless for ephemeral disk
  sources (the (v) data is already persisted on disk from a prior flush,
  'this' is about to be disposed up the pending chain, and
  ClearSourceValueObject is ignored by InternalRMW for non-memory sources).
* In InternalRMW, drop the 'isMemoryLogRecord' guard around the call, so
  cloning is always delegated to CacheSerializedObjectData regardless of
  source kind.

Garnet changes (ObjectStore RMWMethods):
* PostCopyUpdater now reads 'dstLogRecord.ValueObject' directly - the clone
  is always performed upstream. No conditional clone path remains here.

The single surviving clone site still runs after the successful CAS,
preserving CopyUpdater's deferred-allocation-until-CAS invariant.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 2da959d to 1717f9b Compare April 17, 2026 15:32
badrishc and others added 9 commits April 17, 2026 10:14
…sal sites

Before: ObjectStore PostCopyUpdater conditionally emitted +new-old when
rmwInfo.ClearSourceValueObject was true, otherwise +new. This coupled PCU to
a Tsavorite-internal signal and combined two concerns (creation of (v+1)
and removal of (v)) into one branching call site.

After: mirror MainStore's pattern by splitting into two unconditional sites.
* PostCopyUpdater always emits +value.HeapMemorySize for the (v+1) creation.
  Matches PostInitialUpdater / PostInitialWriter already doing +new.heap.
* OnDisposeValueObject(DisposeReason.CopyUpdated) emits -valueObject.HeapMemorySize
  for the (v) removal. This callback fires from InternalRMW.ClearValueIfHeap
  exactly when the source is cleared eagerly (ClearSourceValueObject=true and
  isMemoryLogRecord), which is the literal 'source freed now' signal.

Checkpoint/disk paths that leave the source alive do not reach the dispose
site; their decrement is emitted later by OnEvict when the sealed source
page evicts (via CalculateHeapMemorySize, which includes ValueObject.HeapMemorySize).
Net tracker state is bit-for-bit identical to the previous conditional form,
but each site now has a single arithmetic direction and no awareness of
Tsavorite's internal ClearSourceValueObject flag.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Simplify the record-lifecycle contract for IHeapObject values:
- Remove IRecordTriggers.OnDisposeValueObject; OnDispose(ref LogRecord, reason)
  is now the single app-facing hook. Garnet's impl merges the Deleted and
  CopyUpdated branches into one heap-size decrement.
- Drop the Action<IHeapObject> disposer lambda threaded through
  ObjectIdMap.Free, LogField.ClearObjectIdAndConvertToInline,
  LogRecord.ClearValueIfHeap/ClearHeapFields/Dispose, DiskLogRecord ctors,
  PendingContext.CopyFrom, ConditionalCopyToTail, scan iterators, and
  ISourceLogRecord.ClearValueIfHeap.
- ObjectIdMap.Free(int) is now strictly slot reclamation; it never disposes.
- DiskLogRecord.Dispose() calls ValueObject?.Dispose() itself, covering all
  its owners (PendingContext, AsyncIOContext, scan iterators, cluster
  migration / replication streaming, deserialized-from-disk).
- InternalRMW CopyUpdated path routes through
  storeFunctions.OnDispose(ref srcMemLogRecord, CopyUpdated) followed by
  srcMemLogRecord.ClearValueIfHeap().
- ObjectAllocatorImpl.OnDispose(ref DiskLogRecord) becomes a no-op.
- Update DeleteDisposeTests to the new single-hook surface.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Eliminates the asymmetry where DiskLogRecord.Dispose() unconditionally called
ValueObject.Dispose() while on-log LogRecord disposal routed through the
IRecordTriggers.OnDispose trigger. All record disposal now goes through
IRecordTriggers, giving Garnet full control over whether to dispose the
value object.

- Add IRecordTriggers.OnDisposeDiskRecord(ref DiskLogRecord, DisposeReason)
  as a required interface member with explicit no-op overrides in
  DefaultRecordTriggers, SpanByteRecordTriggers, and GarnetRecordTriggers
  (avoids DIM/interface-dispatch overhead via JIT monomorphization).
- Forward through IStoreFunctions/StoreFunctions to IAllocator, renamed
  IAllocator.OnDispose(ref DiskLogRecord) -> OnDisposeDiskRecord.
  ObjectAllocatorImpl forwards to storeFunctions (was no-op);
  SpanByteAllocator stays no-op (no IHeapObject); TsavoriteLogAllocator
  throws NotImplementedException.
- Remove ValueObject?.Dispose() from DiskLogRecord.Dispose() — callers
  now fire the trigger before Dispose. Fixes latent bug in scan-iterator
  DiskLogRecord wrappers that shared ValueObject references with the
  still-alive on-log record.
- Fire the trigger at all DiskLogRecord disposal sites:
  AllocatorBase (happy-path + retry/catch), AllocatorScan.GetFromDiskAndPushToReader,
  TsavoriteThread.InternalCompletePendingRequestFromContext, and the four
  cluster migrate/replication command sites.
- Update test IRecordTriggers impls (TrackingRecordTriggers,
  ObjTrackingRecordTriggers) to satisfy the new interface member.

All 102 relevant Tsavorite tests and 472 Garnet object-store tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…pose accounting

Adds 12 tests that precisely assert when and how many times each IRecordTriggers
callback fires across the full record lifecycle. Complements the existing
DeleteDisposeTests with coverage of the new OnDisposeDiskRecord hook and the
refactored DisposeReason / EvictionSource discriminators introduced in this PR.

Trackers:
- TrackedObjectValue (subclass of TestObjectValue) counts per-instance
  IHeapObject.Dispose invocations so tests can catch double-dispose across any
  combination of on-log, scan, pending-read, and eviction paths.
- LifecycleTracker counts OnDispose/OnDisposeDiskRecord per DisposeReason,
  OnEvict per EvictionSource, plus OnFlush / OnDiskRead — with toggleable
  CallOn* flags so tests can verify gating.

Covered behaviours:
- OnDispose(CopyUpdated) fires exactly once on immutable RMW; value dispose
  count is exactly 1 when handler opts in.
- OnDispose(Deleted) fires exactly once on delete; no OnDisposeDiskRecord.
- In-memory scan fires OnDisposeDiskRecord(DeserializedFromDisk) exactly N
  times for N records and does NOT dispose shared value objects.
- Disk scan fires the same hook exactly K times and disposes exactly K values
  when opt-in, with no double-dispose.
- Pending read from disk fires exactly one OnDisposeDiskRecord, no OnDispose,
  no OnEvict.
- Page eviction fires OnEvict(MainLog) exactly (live - tombstoned) times with
  no re-firing of OnDispose(Deleted) for tombstones.
- CallOnFlush / CallOnDiskRead / per-source CallOnEvict gating fully suppress
  their respective callbacks when false.
- Read cache eviction fires OnEvict(ReadCache) and not OnEvict(MainLog).
- No value object is ever disposed more than once across an upsert → delete →
  re-upsert → evict → pending-read cycle.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The earlier refactor in eeb6a3b split ObjectStore PCU heap-size tracking into
two sites (PCU emits +new unconditionally, OnDispose(CopyUpdated) emits -old
when the source is cleared eagerly). The same split was needed in UnifiedStore,
but its PostCopyUpdater was missed.

Result: on an EXPIRE/PERSIST that went through CopyUpdate for an object record
(e.g. SADD followed by EXPIRE), UnifiedStore.PCU still emitted +new - old while
GarnetRecordTriggers.OnDispose(CopyUpdated) also emitted -old, double-decrementing
the tracked heap size. A subsequent OnDispose(Deleted) from ReinitializeExpiredRecord
then drove the counter below zero and tripped the Debug.Assert in
LogSizeTracker.IncrementSize (HeapSize.Total should be >= 0).

Fix: mirror ObjectStore — PCU emits +new.HeapMemorySize unconditionally, and the
matching -old is emitted at the removal site (OnDispose(CopyUpdated) or OnEvict
for checkpoint/disk paths).

Fixes the four CI failures surfaced on the replace-subscribe-eviction branch:
- RespTests.ReAddExpiredKey
- RespTransactionProcTests.TransactionObjectExpiryProcTest
- RespCustomCommandTests.CustomObjectCommandTest2
- RespSortedSetGeoTests.CanUseGeoSearchStoreWithDeleteKeyWhenSourceNotFound

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…cord disposal

Three fixes from independent design reviews:

1. GarnetRecordTriggers.OnDispose(Deleted): use CalculateHeapMemorySize instead
   of only ValueObject.HeapMemorySize. This ensures overflow key/value bytes on
   MainStore string records are correctly decremented on RMW expire paths
   (ExpireAndResume, ExpireAndStop) where InPlaceDeleter does not run.
   CalculateHeapMemorySize returns 0 for tombstoned records, so this is a
   natural no-op on the mutable Delete() path where MainStore.InPlaceDeleter
   already subtracted before tombstone was set.

2. Cluster migration/replication: prevent double-trigger of OnDisposeDiskRecord
   by resetting diskLogRecord to default after the normal-path dispose, and
   guarding the finally/catch block with IsSet. Previously, a non-noop trigger
   implementation could receive a callback on a default/already-disposed record.

3. IRecordTriggers.OnDisposeDiskRecord: make it a default interface method
   (no-op) instead of requiring explicit implementation. This reduces the
   implementation burden on consumers who don't need disk-record lifecycle hooks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…heap leak

EvictRecordsInRange previously skipped all SkipOnScan records (Sealed OR
Invalid). This leaked tracked heap for sealed-but-valid source records from
mutable-region CopyUpdates and immutable-region deletes, whose overflow
key/value bytes were never decremented from the tracker.

Fix: skip only Invalid records (already disposed/elided) and Tombstoned
records (already decremented at delete site). Sealed-but-Valid records are
now visited by OnEvict, which correctly picks up their remaining heap
contribution (overflow bytes, and value objects kept alive during checkpoint).

For records where OnDispose(Deleted) already cleared the value (immutable
delete), CalculateHeapMemorySize returns only key overflow (if any) — no
double-decrement because OnDispose decremented the full amount when the
record was not yet tombstoned, and ClearHeapFields(clearKey=false) zeroed
the value slot afterward. OnEvict sees only what remains (key overflow if
present, typically 0 for inline keys).

Updated PageEvictionFiresOnEvictForEveryLiveRecord test to use bounds
assertions reflecting that sealed source records from immutable-region
deletes are now correctly visited.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e state

Session functions (ISessionFunctions) should not be responsible for heap-size
tracking on destruction paths — that's IRecordTriggers' job. Two places were
setting Tombstone BEFORE OnDispose(Deleted) fired, causing CalculateHeapMemorySize
to return 0 and preventing the trigger from decrementing the tracked heap:

1. SessionFunctionsWrapper.InPlaceDeleter: set Tombstone+Dirty before returning
   to InternalDelete, which then called OnDispose(Deleted). Fixed by moving
   SetTombstone+SetDirtyAndModified to InternalDelete after OnDispose.

2. SessionFunctionsWrapper.InPlaceUpdater (ExpireAndStop): set Tombstone before
   returning to InternalRMW, which then called OnDispose(Deleted). Fixed by
   moving SetTombstone+SetDirtyAndModified to InternalRMW after OnDispose.

This removes the manual cacheSizeTracker decrement from MainStore.InPlaceDeleter
(it was a workaround for the wrong ordering) and makes GarnetRecordTriggers.OnDispose
the single destruction-side decrement path for all record types.

Also tightened PageEvictionFiresOnEvictForEveryLiveRecord test to use precise
counts by observing TailAddress movement to distinguish mutable vs immutable deletes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from c6d6dd9 to 61089a1 Compare April 18, 2026 00:32
The CopyUpdated path is a partial clear (value-object slot only, key stays on
the record). Expecting IRecordTriggers implementers to know this nuance — that
they should subtract only ValueObject.HeapMemorySize and not the full
CalculateHeapMemorySize — is error-prone.

Move the accounting entirely into Tsavorite's InternalRMW:
- Decrement logSizeTracker by ValueObject.HeapMemorySize before clearing
- Call IHeapObject.Dispose() on the freed value object
- Then ClearValueIfHeap() nulls the ObjectIdMap slot

Remove OnDispose(CopyUpdated) from the trigger call site — the trigger is no
longer involved in CopyUpdate accounting. GarnetRecordTriggers.OnDispose now
only handles DisposeReason.Deleted. The IRecordTriggers doc is updated to
reflect that CopyUpdated is handled internally.

The trigger contract is now simple:
- OnDispose(Deleted): subtract CalculateHeapMemorySize (full record heap)
- OnEvict: subtract CalculateHeapMemorySize (whatever remains at eviction)
- Creation sites (PostInitialUpdater/Writer/CopyUpdater, InPlaceUpdater):
  add the new/changed heap

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 957c5c4 to 89cbce5 Compare April 18, 2026 01:18
badrishc and others added 2 commits April 17, 2026 18:31
- EvictRecordsInRange: update XML doc and skip-condition comments to reflect
  that sealed records are now visited (not skipped) and InPlaceDeleter no
  longer does heap tracking
- GarnetRecordTriggers.OnDispose: simplify comment now that tombstone is
  always set AFTER OnDispose in all paths
- Remove legacy SubscribeEvictions reference from EvictRecordsInRange doc

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…w double-subtract

OnDispose(Deleted) now uses GetValueHeapMemorySize() instead of
CalculateHeapMemorySize(). This prevents double-subtracting key overflow
bytes for immutable-region deletes where:
1. OnDispose(Deleted) fires on the source (subtracts full heap including key)
2. ClearHeapFields(clearKey=false) keeps key alive for chain traversal
3. Source is sealed (not tombstoned) — OnEvict visits it later
4. CalculateHeapMemorySize returns key overflow → subtracted again

With the fix:
- OnDispose(Deleted) subtracts value-only heap (value overflow + value object)
- For mutable deletes: tombstoned → OnEvict skips → key overflow is a bounded
  phantom freed by GC when the page is freed (same as pre-IRecordTriggers)
- For immutable deletes: sealed → OnEvict visits → subtracts key overflow once
- Net: no double-decrement, bounded key-overflow phantom on mutable deletes

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 138cced to 53389b6 Compare April 18, 2026 04:07
@badrishc
Copy link
Copy Markdown
Collaborator Author

badrishc commented Apr 18, 2026

Heap-Size Accounting: Record Lifecycle Trace

Heap-size tracking is split between Tsavorite (record creation, deletion, eviction, CopyUpdate source clearing) and session functions (in-place mutation deltas only). IRecordTriggers callbacks are not involved in accounting.

Accounting Model

Tsavorite-internal tracking (via logSizeTracker):

Site What How
CAS success in InternalRMW +key KeyOverflow.HeapMemorySize if key is overflow
CAS success in InternalRMW +value GetValueHeapMemorySize() after PostInitialUpdater or PostCopyUpdater
CAS success in InternalUpsert +key KeyOverflow.HeapMemorySize if key is overflow
CAS success in InternalUpsert +value GetValueHeapMemorySize() after PostInitialWriter
CAS success in InternalDelete +key KeyOverflow.HeapMemorySize if key is overflow (tombstone record)
ReinitializeExpiredRecord (IPU) +value GetValueHeapMemorySize() after PostInitialUpdater
ObjectAllocatorImpl.OnDispose(Deleted) −value GetValueHeapMemorySize() before tombstone is set
CopyUpdate source clear (ClearSourceValueObject) −valueObject ValueObject.HeapMemorySize + Dispose()
EvictRecordsInRange (non-tombstoned) −(key+value) CalculateHeapMemorySize()
EvictRecordsInRange (tombstoned) −key KeyOverflow.HeapMemorySize
TryCopyToTail +(key+value) UpdateSize(add: true)CalculateHeapMemorySize()
TryCopyToReadCache +(key+value) UpdateSize(add: true)CalculateHeapMemorySize()
Recovery page read (objects) +(key+value) UpdateSize(add: true) for deserialized records

ISF tracking (via cacheSizeTracker.AddHeapSize):

Site What How
MainStore InPlaceUpdater ±delta GetValueHeapMemorySize() pre/post, guarded by ValueIsInline bit check
ObjectStore InPlaceUpdater ±delta sizeChange from IGarnetObject.Operate()
UnifiedStore InPlaceUpdater ±delta sizeChange from IGarnetObject.Operate() (object records only)
SessionFunctionsUtils.InPlaceWriter* ±delta GetValueHeapMemorySize() pre/post

Note: cacheSizeTracker.AddHeapSize routes to mainLogTracker.IncrementSize which is the same logSizeTracker instance — all increments and decrements hit the same counter.

Lifecycle Trace

Event Tsavorite internal ISF
Mutable Delete CAS: +key (tombstone). OnDispose(Deleted): −value. Tombstone set after. Eviction: −key.
Immutable Delete CAS: +key (tombstone at tail). OnDispose(Deleted) on source: −value. Source sealed. Eviction of source: −key. Eviction of tombstone: −key.
RMW ExpireAndStop IPU OnDispose(Deleted): −value. Tombstone set after. Eviction: −key.
RMW ExpireAndResume IPU OnDispose(Deleted): −value. PostInitialUpdater runs, then +value. Eviction: −(key+value).
RMW ExpireAndStop NCU (elidable) OnDispose(Deleted): −value. Tombstone set. Elided → freelist clears key. Invalid → eviction skips.
RMW ExpireAndStop NCU (non-elidable) OnDispose(Deleted) on source: −value. CAS: +key (tombstone at tail). Source eviction: −key. Tombstone eviction: −key.
RMW ExpireAndResume NCU/CU OnDispose(Deleted) on source: −value. CAS: +key (new record). PostInitialUpdater runs, then +value. Source eviction: −key. Dest eviction: −(key+value).
RMW ExpireAndStop CU OnDispose(Deleted) on source: −value. CAS: +key (tombstone at tail). Source sealed, eviction: −key. Tombstone eviction: −key.
CopyUpdate (ClearSource=true) CAS: +key. PCU runs, then +value. Internal CopyUpdated: −valueObject + Dispose. Source sealed, eviction: −key. Dest eviction: −(key+value).
CopyUpdate (ClearSource=false, checkpoint) CAS: +key. PCU runs, then +value. Source keeps value object, sealed. Source eviction: −(key+value). Dest eviction: −(key+value).
CopyUpdate (source in readonly) CAS: +key. PCU runs, then +value. Source eviction: −(key+value). Dest eviction: −(key+value).
Upsert (new record) CAS: +key. PostInitialWriter runs, then +value. Eviction: −(key+value).
Upsert CopyUpdate (from readonly) CAS: +key. PostInitialWriter runs, then +value. Source sealed, eviction: −(key+value). Dest eviction: −(key+value).
InPlaceUpdater (MainStore) ±delta (inline-guarded)
InPlaceUpdater (ObjectStore) ±delta (from Operate)
InPlaceUpdater (UnifiedStore) ±delta (from Operate, objects only)
InPlaceWriter ±delta (pre/post)
Elision after delete OnDispose(Deleted): −value. Freelist clears key. Invalid → eviction skips.
Revivification (to freelist) ClearHeapFields(clearKey=true). No tracker change (prior Deleted already did −value).
Revivification (from freelist) CAS: +key. PostInitialUpdater/Writer runs, then +value.
CAS failure ClearHeapFields frees memory. No tracker change (no prior +).
WrongType IPU No tombstone. No OnDispose. Record intact.
Page eviction (non-tombstoned) −CalculateHeapMemorySize() = −(key+value).
Page eviction (tombstoned) −key only (value already subtracted at delete site).
Recovery eviction Same as runtime eviction.
TryCopyToTail +(key+value) via UpdateSize. Eviction: −(key+value).
TryCopyToReadCache +(key+value) via UpdateSize. Eviction: −(key+value).
Disk read (pending IO) No tracker impact (transient record).
Scan (in-memory / disk) No tracker impact.
Checkpoint serialization No tracker impact (temporary clone).

@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 53389b6 to 8e8a852 Compare April 18, 2026 05:23
Tsavorite now handles ALL heap-size decrements internally via logSizeTracker:
- OnDispose(Deleted): decrements value heap before ClearHeapFields
- EvictRecordsInRange: decrements key overflow for ALL records (including
  tombstoned), decrements value heap for non-tombstoned records
- CopyUpdated: decrements value-object heap (already internalized earlier)

GarnetRecordTriggers.OnDispose and OnEvict are now no-ops for accounting.
CallOnEvict returns false. The trigger callbacks remain available for
app-level resource cleanup (e.g. IDisposable.Dispose on value objects
that hold external resources), but Garnet does not use them.

The heap-tracking contract is now cleanly split:
- Session functions: emit +value at creation sites (only the app knows
  the heap size of a newly created value)
- Tsavorite: emit -key and -value at all destruction/eviction sites
  (the record is in hand, Tsavorite can read the sizes directly)

EvictRecordsInRange is now called whenever logSizeTracker is set OR
CallOnEvict is true, ensuring internal accounting runs even when the
application opts out of the OnEvict trigger.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 8e8a852 to 3b96c8c Compare April 18, 2026 05:35
badrishc and others added 5 commits April 17, 2026 22:59
Key overflow was only tracked (+key) for internal TryCopyToTail/TryCopyToReadCache
copies via logSizeTracker.UpdateSize. Records created by session functions (RMW,
Upsert, Delete) never added +key, but EvictRecordsInRange subtracted -key at
eviction — causing the tracker to undercount for overflow-key records.

Fix: add logSizeTracker.IncrementSize(+key) at the CAS success site in
InternalRMW, InternalUpsert, and InternalDelete. This pairs with the -key
emitted by EvictRecordsInRange for all records (including tombstoned).

The heap-tracking responsibility is now cleanly split:
- Tsavorite: all key accounting (+key at CAS, -key at eviction) and all
  destruction-side value accounting (-value at OnDispose/eviction/CopyUpdated)
- Session functions: +value at creation sites only

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously, every ISessionFunctions implementation (MainStore, ObjectStore,
UnifiedStore, VectorStore) had to manually call cacheSizeTracker.AddHeapSize
in PostInitialUpdater, PostInitialWriter, and PostCopyUpdater. This was
error-prone — VectorSessionFunctions was missing these calls entirely,
causing heap tracking to go negative on delete (server crash).

Move all creation-site +value tracking into Tsavorite's CAS success sites
and ReinitializeExpiredRecord:
- InternalRMW: +value after PostInitialUpdater and PostCopyUpdater
- InternalUpsert: +value after PostInitialWriter
- ReinitializeExpiredRecord: +value after PostInitialUpdater (IPU path)

ISF implementations now only handle ±delta for in-place updates
(InPlaceUpdater/InPlaceWriter), where only the app knows the before/after
sizes. All other heap tracking is fully internal to Tsavorite.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GetValueHeapMemorySize() for overflow values was returning
OverflowByteArray.TotalSize (Array.Length) while CalculateHeapMemorySize()
used OverflowByteArray.HeapMemorySize (Array.Length + 24 for .NET byte
array object overhead). This 24-byte-per-record mismatch caused the heap
tracker to go negative for overflow-value records (e.g. HLL dense) because
+value at creation used TotalSize but -value at eviction used HeapMemorySize.

Fix: change GetValueHeapMemorySize() to use HeapMemorySize, matching
CalculateHeapMemorySize(). Verified all other heap sizing paths
(KeyOverflow, ValueObject, CAS sites, eviction, OnDispose, IPU delta)
already use HeapMemorySize consistently.

Fixes HyperLogLogPFADD_LTM, HyperLogLogTestPFMERGE_LTM_DenseToDense,
and HyperLogLogTestPFMERGE_LTM_SparseToDense test failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GarnetRecordTriggers no longer does any heap-size accounting — all tracking
is internal to Tsavorite via logSizeTracker. Remove the cacheSizeTracker
field, constructor parameter, and all heap-accounting comments. The struct
is now a pure no-op placeholder for future app-level resource cleanup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Change CallOnEvict from method with EvictionSource parameter to a simple
  bool property. The per-source gating is no longer needed since Tsavorite
  handles all heap accounting internally. The EvictionSource is still passed
  to OnEvict for informational purposes.
- Update IRecordTriggers doc: remove heap-accounting implementation details
  from OnDispose/OnDisposeDiskRecord/OnEvict docs — these are now Tsavorite-
  internal. The trigger API is purely for app-level resource cleanup.
- Update InternalDelete/InternalRMW/SessionFunctionsWrapper comments to
  reference internal heap accounting instead of trigger-based accounting.
- Simplify RecordLifecycleTests CallOnEvict gating test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 644e0f2 to fcbd67a Compare April 18, 2026 17:40
…ia DIM

All three gate properties on IRecordTriggers now have default implementations
returning false. Implementations that don't need any trigger callbacks (like
GarnetRecordTriggers) no longer need to explicitly declare them — the empty
struct body suffices.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from fcbd67a to 290cfb5 Compare April 18, 2026 17:44
Add a ValueIsInline bit check before and after InPlaceUpdaterWorker to
avoid the expensive GetValueHeapMemorySize call (which chases pointers
through ObjectIdMap) on the common inline-value path. Only measure heap
when the value was or became non-inline (overflow/object transition).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc changed the title Replace SubscribeEvictions with IRecordTriggers Internalize heap-size tracking into Tsavorite via IRecordTriggers Apr 18, 2026
@badrishc badrishc changed the title Internalize heap-size tracking into Tsavorite via IRecordTriggers Internalize heap-size tracking into Tsavorite Apr 18, 2026
badrishc and others added 3 commits April 18, 2026 15:06
…ecovery

Two bugs found by GPT-5.4 review:

1. In-chain revivification (TryRevivifyInChain in InternalRMW and InternalUpsert)
   revived a tombstoned record without re-adding its value heap to the tracker.
   The value was subtracted at OnDispose(Deleted) when the record was deleted,
   but never re-added when the tombstone was reused for a new value. Fix: add
   GetValueHeapMemorySize() after PostInitialUpdater/Writer in both paths.

2. Recovery page-read used CalculateHeapMemorySize (which returns 0 for tombstones)
   but eviction later subtracts key overflow for tombstoned records. Fix: for
   tombstoned records during recovery, explicitly add KeyOverflow.HeapMemorySize
   instead of relying on CalculateHeapMemorySize.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…double-subtract

Two bugs found by Goldeneye review:

1. Elided and RevivificationFreeList dispose paths cleared heap fields
   (ClearHeapFields with clearKey=true) without subtracting from the tracker.
   Eviction skips invalid records, so the heap was never decremented.
   Fix: OnDispose now subtracts CalculateHeapMemorySize for Elided and
   RevivificationFreeList reasons (with tombstone-aware key-only fallback).

2. SubscribeEvictions(LogSizeTracker) sets both onEvictionObserver AND
   logSizeTracker to the same object. OnPagesClosedWorker and
   EvictPageForRecovery then run both MemoryPageScan (observer.OnNext)
   and EvictRecordsInRange (logSizeTracker), double-subtracting.
   Fix: skip MemoryPageScan when onEvictionObserver IS logSizeTracker.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tStore IPU

ObjectStore InPlaceUpdaterWorker's HasRemoveKey path (e.g. SPOP/LPOP/HDEL/ZREM
removing the last element) returned false with ExpireAndStop without applying
the sizeChange from Operate. The mutation had already happened in-place
(HeapMemorySize updated on the object), but the tracker never saw the negative
delta. OnDispose(Deleted) then subtracted only the post-removal empty-collection
size, leaving the pre-removal payload permanently over-counted.

Fix: apply sizeChange to cacheSizeTracker before the early return so the
tracker reflects the in-place mutation before ExpireAndStop disposes the record.

Found by Opus 4.7 review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@badrishc badrishc force-pushed the badrishc/replace-subscribe-eviction branch from 4046140 to 285bf22 Compare April 18, 2026 23:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants