feat: add PINNED_MODELS and PRELOAD_API_KEY for preload on serverless#2048
Merged
Conversation
- PRELOAD_API_KEY: dedicated API key for model preloading so the global API_KEY doesn't need to be set on user-facing deployments (falls back to API_KEY when not set)
- PINNED_MODELS: comma-separated model IDs that are always preloaded at startup (bypassing the LAMBDA/GCP_SERVERLESS gate) and pinned in the LRU cache so they are never evicted by size or memory pressure limits
- Improved preload logging with timing and resolved model IDs
- Call self.model_manager.add_model() directly instead of the model_add route handler (which doesn't exist when GCP_SERVERLESS=True)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
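The fallback and parsing behaviour described above can be sketched as follows. This is a minimal illustration only: `resolve_preload_api_key` and `parse_pinned_models` are hypothetical helper names, not the actual functions in `inference/core/env.py`.

```python
import os


def resolve_preload_api_key() -> "str | None":
    # PRELOAD_API_KEY wins when set; otherwise fall back to the
    # global API_KEY (matching the fallback described in the PR).
    return os.getenv("PRELOAD_API_KEY") or os.getenv("API_KEY")


def parse_pinned_models() -> "list[str]":
    # PINNED_MODELS is a comma-separated list of model IDs;
    # strip whitespace and drop empty segments.
    raw = os.getenv("PINNED_MODELS", "")
    return [m.strip() for m in raw.split(",") if m.strip()]
```

Either helper returns a harmless default when the variable is unset, so non-serverless deployments that never define these variables are unaffected.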
PawelPeczek-Roboflow approved these changes on Feb 27, 2026
Summary
- PRELOAD_API_KEY: Dedicated API key for model preloading. On user-facing deployments, setting API_KEY globally causes unintended side effects (fallback auth on unauthenticated requests, billing attribution, model-access changes). PRELOAD_API_KEY provides the credential needed for model download during startup without affecting per-request behaviour. Falls back to API_KEY when not set.
- PINNED_MODELS: Comma-separated list of model IDs that are always preloaded at startup (bypassing the LAMBDA/GCP_SERVERLESS gate) and pinned in the LRU cache so they are never evicted under size or memory pressure. This replaces the need for a separate FORCE_PRELOAD flag.
- model_manager.add_model() call: The model_add route handler is defined inside if not (LAMBDA or GCP_SERVERLESS), so it doesn't exist on serverless deployments. Preload now calls the underlying method directly.

New env vars
- PRELOAD_API_KEY (falls back to API_KEY when unset)
- PINNED_MODELS

Example usage
```shell
# User-facing deployment on GCP with GCP_SERVERLESS=True:
PINNED_MODELS=sam2/hiera_large,sam3/sam3_final,sam3/sam3_interactive
PRELOAD_API_KEY=rf_your_api_key
```

Files changed
- inference/core/env.py — new PRELOAD_API_KEY and PINNED_MODELS env vars
- inference/core/interfaces/http/http_api.py — preload gate uses PRELOAD_API_KEY and PINNED_MODELS, calls model_manager.add_model() directly, improved logging
- inference/core/managers/decorators/fixed_size_cache.py — pin_model() method + eviction skip logic

Test plan
- Set PINNED_MODELS + PRELOAD_API_KEY with GCP_SERVERLESS=True, verify preloading works
- /model/registry shows preloaded model after startup
- Exceed MAX_ACTIVE_MODELS and verify pinned models are NOT evicted
- PRELOAD_MODELS still works with API_KEY on non-serverless (backward compat)
- Verify preloading works when only PRELOAD_API_KEY is set

🤖 Generated with Claude Code
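As context for the eviction test above, the "pinned models are never evicted" behaviour can be sketched as an LRU cache whose eviction pass skips pinned keys. This is a minimal, self-contained sketch: `PinnedLRU` is a hypothetical class, not the actual decorator in `fixed_size_cache.py`.

```python
from collections import OrderedDict


class PinnedLRU:
    """Toy LRU cache in which pinned entries are never eviction candidates."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self._entries: "OrderedDict[str, object]" = OrderedDict()
        self._pinned: "set[str]" = set()

    def pin(self, key: str) -> None:
        # Mark a model ID as pinned so eviction skips it.
        self._pinned.add(key)

    def add(self, key: str, value: object) -> None:
        # Insert (or refresh) an entry as most-recently-used, then
        # enforce the size budget.
        self._entries[key] = value
        self._entries.move_to_end(key)
        self._evict_if_needed()

    def _evict_if_needed(self) -> None:
        # Evict least-recently-used UNPINNED entries until within budget.
        while len(self._entries) > self.max_size:
            victim = next(
                (k for k in self._entries if k not in self._pinned), None
            )
            if victim is None:
                # Everything remaining is pinned: tolerate the overflow
                # rather than evict a pinned model.
                break
            del self._entries[victim]
```

Under this scheme a pinned model survives even when the model count exceeds `max_size`, which is the property the MAX_ACTIVE_MODELS test checks.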