
Add configurable perf_record_delay for profiling during benchmarks #523

Open
charles-typ wants to merge 2 commits into facebookresearch:v2-beta from charles-typ:export-D96231930-to-v2-beta


@charles-typ
Contributor

Summary:
Add support for a configurable delay before starting perf record during
server benchmarks. This allows profiling to capture steady-state behavior
after client warmup completes.

Changes:

  • Add perf_record_delay parameter to benchmark configs (ALLOWED_PARAMS)
  • Auto-compute server's perf_record_delay from client's warmup_seconds
    when DCPERF_PERF_RECORD is enabled (warmup + 60s buffer)
  • Add --perf-record-delay CLI argument to server with 120s default
  • Implement profile_server() function with threading.Timer for delayed
    perf record start (5 second system-wide profile)
  • Update jobs_internal.yml with perf_record_delay in ucache_bench jobs

Differential Revision: D96231930
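
The delayed start described above can be sketched roughly as follows. This is a minimal illustration, not the code in this diff: only the name `profile_server()`, the use of `threading.Timer`, and the 5 second system-wide profile come from the summary. The helper `build_perf_command()`, the `run` parameter, the `-g` flag, and the `perf.data` output path are assumptions made for the example.

```python
import subprocess
import threading
from typing import Callable, List


def build_perf_command(duration_s: int = 5, output: str = "perf.data") -> List[str]:
    """Build a system-wide `perf record` command bounded to duration_s seconds.

    Hypothetical helper: -a records on all CPUs, -g collects call graphs,
    and the trailing `sleep N` workload limits how long perf records.
    """
    return ["perf", "record", "-a", "-g", "-o", output, "--", "sleep", str(duration_s)]


def profile_server(delay_s: float, run: Callable = subprocess.run) -> threading.Timer:
    """Schedule a one-shot perf recording delay_s seconds from now.

    `run` is injectable (an assumption for testability); by default the
    command is executed with subprocess.run when the timer fires.
    """
    cmd = build_perf_command()
    timer = threading.Timer(delay_s, run, args=(cmd,), kwargs={"check": False})
    timer.daemon = True  # do not keep the server process alive just for profiling
    timer.start()
    return timer
```

Returning the timer lets the caller invoke `timer.cancel()` if the benchmark ends before the delay elapses, so no stray `perf record` is launched afterwards.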

Yupeng Tang and others added 2 commits March 11, 2026 15:56
Summary:
The cachelib_num_shards parameter was parsed from gflags and stored in
UcacheBenchConfig but never actually applied to the CacheAllocator::Config.
This meant the config value was silently ignored and CacheLib used its
default of 8192 shards.

Now call setNumShards() when cachelib_num_shards > 0, allowing the
benchmark to match production shard counts for more accurate CPU
utilization profiling.

Differential Revision: D96087814

Summary:
Add support for a configurable delay before starting perf record during
server benchmarks. This allows profiling to capture steady-state behavior
after client warmup completes.

Changes:
- Add perf_record_delay parameter to benchmark configs (ALLOWED_PARAMS)
- Auto-compute server's perf_record_delay from client's warmup_seconds
  when DCPERF_PERF_RECORD is enabled (warmup + 60s buffer)
- Add --perf-record-delay CLI argument to server with 120s default
- Implement profile_server() function with threading.Timer for delayed
  perf record start (5 second system-wide profile)
- Update jobs_internal.yml with perf_record_delay in ucache_bench jobs

Differential Revision: D96231930
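
As a rough sketch of the delay arithmetic and the CLI default listed above: the 60 second buffer, the 120 second default, the `--perf-record-delay` flag, and the `DCPERF_PERF_RECORD` variable come from the commit message. The function names, the rule for what counts as "enabled", and the parser layout are assumptions for illustration only.

```python
import argparse
import os
from typing import Mapping, Optional

PERF_RECORD_BUFFER_S = 60          # buffer added after client warmup (from the summary)
DEFAULT_PERF_RECORD_DELAY_S = 120  # server-side CLI default (from the summary)


def compute_perf_record_delay(
    warmup_seconds: int, env: Optional[Mapping[str, str]] = None
) -> Optional[int]:
    """Return warmup + buffer when profiling is enabled, otherwise None.

    Assumption: any value of DCPERF_PERF_RECORD other than "" or "0"
    enables profiling; the real check may differ.
    """
    env = os.environ if env is None else env
    if env.get("DCPERF_PERF_RECORD", "0") in ("", "0"):
        return None
    return warmup_seconds + PERF_RECORD_BUFFER_S


def build_server_parser() -> argparse.ArgumentParser:
    """Hypothetical server argument parser exposing only the new flag."""
    parser = argparse.ArgumentParser(description="server benchmark (sketch)")
    parser.add_argument(
        "--perf-record-delay",
        type=int,
        default=DEFAULT_PERF_RECORD_DELAY_S,
        help="seconds to wait before starting perf record",
    )
    return parser
```

With a client `warmup_seconds` of 120 and profiling enabled, the computed server delay would be 180 seconds; when no value is passed, the server falls back to the 120 second default.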
meta-cla bot added the CLA Signed label Mar 12, 2026

meta-codesync bot commented Mar 12, 2026

@charles-typ has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96231930.


Labels

CLA Signed, fb-exported, meta-exported