Merged
112 commits
09ab3e1
feat: add download acceleration infrastructure
deanq Aug 16, 2025
795c9e5
feat: integrate download acceleration with dependency installer
deanq Aug 16, 2025
046eb58
feat: add workspace acceleration support
deanq Aug 16, 2025
45a65fe
test: add download acceleration test coverage
deanq Aug 16, 2025
ce51390
chore: moved test-handler files to src/
deanq Aug 16, 2025
6c04de1
feat: runtime uses aria2 for accelerated parallel downloads
deanq Aug 16, 2025
66eb286
chore: update project structure and dependencies
deanq Aug 16, 2025
1930b4b
chore: updated tetra-rp
deanq Aug 19, 2025
731fd56
build: local-execution-test use make test-handler
deanq Aug 19, 2025
e829140
chore: update CLAUDE.md
deanq Aug 19, 2025
104b2da
chore: move these values to constants.py for maintainability
deanq Aug 19, 2025
f8aa89a
feat: add system package acceleration with nala
deanq Aug 19, 2025
cd56185
refactor: disable Python package download acceleration
deanq Aug 20, 2025
d7c996d
test: uv is no longer part of download accelerator
deanq Aug 20, 2025
2ab93e3
feat: implement accelerate_downloads parameter logic in RemoteExecutor
deanq Aug 21, 2025
b50a7bf
feat: add pip fallback for Python dependencies when acceleration disa…
deanq Aug 21, 2025
440d00d
feat: enhance HF model caching with hf_transfer/hf_xet strategy
deanq Aug 21, 2025
0320e4d
test: add comprehensive coverage for accelerate_downloads parameter
deanq Aug 21, 2025
034f770
test: update integration tests for new acceleration parameter
deanq Aug 21, 2025
9531079
chore: update dependencies and constants for download acceleration
deanq Aug 21, 2025
d75d320
refactor: remove pip installation method from dependency installer
deanq Aug 21, 2025
227b33e
test: update unit tests to expect UV instead of pip
deanq Aug 21, 2025
338a165
test: rename test file from pip to UV naming convention
deanq Aug 21, 2025
f88745d
feat: implement parallel execution for accelerated downloads
deanq Aug 21, 2025
f22e74d
feat: add async wrapper for HuggingFace model download acceleration
deanq Aug 21, 2025
816fc75
test: update tests for parallel execution and async dependencies
deanq Aug 21, 2025
c9ad0d3
test: comprehensive test coverage expansion and cleanup
deanq Aug 21, 2025
e31137a
refactor: optimize HF acceleration to use native Hub features
deanq Aug 21, 2025
e1db417
chore: memory correction
deanq Aug 21, 2025
76ab9c0
feat: implement HuggingFace download acceleration strategies
deanq Aug 21, 2025
c269bcd
feat: implement centralized log streaming system
deanq Aug 21, 2025
04e5b54
chore: these logs are for debug level
deanq Aug 21, 2025
f1db33a
chore: specs for Endpoint Persistence using Network Volume and CDR
deanq Aug 22, 2025
bb27ae3
chore: local setup
deanq Aug 22, 2025
f232a9c
Merge branch 'main' into deanq/ae-962-log-streaming
deanq Aug 29, 2025
83b3293
fix: merge conflicts
deanq Aug 29, 2025
f9a068c
chore: make update
deanq Aug 29, 2025
692d8fc
chore: these are debug lines
deanq Aug 29, 2025
ce2deae
chore: nala logs were too noisy
deanq Aug 29, 2025
90e3b9a
chore: updated tetra-rp
deanq Aug 29, 2025
a2f67d3
Merge branch 'deanq/ae-962-log-streaming' into deanq/ae-1092-tetra-vo…
deanq Aug 29, 2025
e578007
fix: duplicated lines due to bad merge conflict resolution
deanq Aug 29, 2025
7e7201e
Merge branch 'main' into deanq/ae-962-log-streaming
deanq Sep 14, 2025
8091476
chore: update uv.lock
deanq Sep 14, 2025
8c8f902
fix: test-handler was not exactly testing properly
deanq Sep 15, 2025
47699d2
Merge branch 'deanq/ae-962-log-streaming' into deanq/ae-1165-bug-pyto…
deanq Sep 17, 2025
c6ac06d
chore: use GPU build for smoke tests
deanq Sep 17, 2025
cc92323
fix: multi-stage build loses crucial built-in system Python
deanq Sep 17, 2025
83787e7
chore: updated to latest submodule state
deanq Sep 17, 2025
b99af9d
fix: local/macos testing fails due to lack of apt-get or nala
deanq Sep 17, 2025
736c2eb
test: tests to confirm system Python access
deanq Sep 17, 2025
0785310
refactor: install_dependencies relies on uv and pip provisions
deanq Sep 17, 2025
16f7a1b
chore: better debug logs for dependency_installer
deanq Sep 17, 2025
ee6595e
feat: implement universal subprocess utility with automatic logging
deanq Sep 17, 2025
8c259c2
fix: use Docker detection for dependency installation method
deanq Sep 18, 2025
fcd51bd
docs: update CLAUDE.md with universal subprocess utility documentation
deanq Sep 18, 2025
b980e51
chore: vscode config to point to src
deanq Sep 18, 2025
0bda8a1
chore: no need for a NALA_CHECK_CMD constant
deanq Sep 18, 2025
177c3d2
build: update submodule
deanq Sep 18, 2025
d365f2a
chore: logs namespace is now just `tetra`
deanq Sep 18, 2025
8b2cf5f
Merge branch 'deanq/ae-962-log-streaming' into deanq/ae-1165-bug-pyto…
deanq Sep 18, 2025
b95451d
docs: System Python Runtime Architecture
deanq Sep 18, 2025
011890a
docs: Centralized Log Streaming System
deanq Sep 18, 2025
d1a6fa2
Merge branch 'main' into deanq/ae-1092-tetra-volume-warm-cache
deanq Sep 18, 2025
22d80cf
chore: update submodule
deanq Sep 18, 2025
55eac77
Merge branch 'deanq/ae-962-log-streaming' into deanq/ae-1092-tetra-vo…
deanq Sep 18, 2025
a277bc0
Merge branch 'deanq/ae-1165-bug-pytorchs-not-found' into deanq/ae-109…
deanq Sep 18, 2025
f11b4dc
chore: update and cleanup
deanq Sep 22, 2025
e2fe764
fix: docker uses system's conda python; local uses uv
deanq Sep 22, 2025
c1c95c8
refactor: configurable `NAMESPACE` for logs; default: tetra
deanq Sep 22, 2025
df70047
refactor: no longer need to setup python paths in workspace
deanq Sep 22, 2025
d1a832c
Merge branch 'deanq/ae-1165-bug-pytorchs-not-found' into deanq/ae-109…
deanq Sep 22, 2025
9ddb6d8
refactor: reorganize test_*.json files into src/tests/ directory
deanq Sep 23, 2025
d4e1366
Merge branch 'main' into deanq/ae-1092-tetra-volume-warm-cache
deanq Sep 26, 2025
0a61a98
fix: incorrect merge
deanq Sep 26, 2025
90ebc63
chore: update version from release
deanq Sep 26, 2025
91d621b
chore: these tests were moved to src/tests/
deanq Sep 26, 2025
1c4b046
Merge branch 'deanq/ae-1092-tetra-volume-warm-cache' of https://githu…
deanq Sep 27, 2025
19498ff
build: consolidated local-execution-test and docker-pr into docker-test
deanq Sep 28, 2025
b0b3ec7
build: make sure all of src/ is copied
deanq Sep 28, 2025
168e4ef
refactor: removed the use of network volume as runtime workspaces
deanq Sep 30, 2025
54fe320
refactor: remove workspace_manager dependency from all modules
deanq Sep 30, 2025
a01ae7e
refactor: huggingface_accelerator -> huggingface_cache
deanq Oct 1, 2025
e95afb5
refactor: completely removed workspace manager
deanq Oct 1, 2025
7a4093d
docs: updated docs with the latest refactors
deanq Oct 1, 2025
7880b2b
fix: set non-error debug log when cached for the first time
deanq Oct 1, 2025
4132da8
fix: catch CacheNotFound instead
deanq Oct 1, 2025
7d9e82a
refactor: remove log acceleration summary
deanq Oct 1, 2025
2b1642e
fix: make sure uv uses caching for downloaded artifacts
deanq Oct 2, 2025
b28de41
feat: cache sync manager.sync_to_volume_async to tarball cache changes
deanq Oct 3, 2025
be75698
chore: rename sync_to_volume_async to sync_to_volume
deanq Oct 3, 2025
86f6063
feat: cache sync manager. hydrate_from_volume to hydrate container ca…
deanq Oct 3, 2025
ab63b88
fix: huggingface cache was only recognizing main branch
deanq Oct 3, 2025
01e060f
refactor: cleanup, optimization, and simplified after many observations
deanq Oct 4, 2025
13f86d5
refactor: deprecate hf_models_to_cache
deanq Oct 6, 2025
6a5d87d
fix: deleted duplicate code block
deanq Oct 6, 2025
c191b02
chore: simplified happy path return
deanq Oct 6, 2025
9eb3875
Merge branch 'deanq/ae-1092-tetra-volume-warm-cache' into deanq/ae-10…
deanq Oct 6, 2025
7e26412
docs: updated the docstring to reflect function's intent
deanq Oct 6, 2025
a74fa09
fix: result.error not .stdout
deanq Oct 6, 2025
3bfbbdf
Merge branch 'deanq/ae-1092-tetra-volume-warm-cache' into deanq/ae-12…
deanq Oct 6, 2025
8086436
Merge branch 'deanq/ae-1092-volume-cache-sync' into deanq/ae-1268-dep…
deanq Oct 6, 2025
e2d2608
Merge branch 'main' into deanq/ae-1092-volume-cache-sync
deanq Oct 6, 2025
239d96b
fix: bad merge
deanq Oct 6, 2025
580b9f3
refactor: HuggingFace cache location set outside `/root/.cache` to ex…
deanq Oct 8, 2025
a26e228
Merge branch 'deanq/ae-1092-volume-cache-sync' into deanq/ae-1268-dep…
deanq Oct 8, 2025
18e3a53
build: tetra-rp submodule should be pinned
deanq Oct 9, 2025
85b5e4a
Merge branch 'deanq/ae-1092-volume-cache-sync' into deanq/ae-1268-dep…
deanq Oct 9, 2025
f80f8cd
Merge branch 'main' into deanq/ae-1268-deprecate-hf_models_to_cache
deanq Oct 10, 2025
57d73e9
chore: make update to update protocols
deanq Oct 10, 2025
01d4372
Merge branch 'main' into deanq/ae-1268-deprecate-hf_models_to_cache
deanq Oct 10, 2025
14b3455
build: make update
deanq Oct 10, 2025
37 changes: 13 additions & 24 deletions CLAUDE.md
@@ -17,56 +17,49 @@ This is `worker-tetra`, a RunPod Serverless worker template that provides dynami
- **Function Executor** (`src/function_executor.py:12`): Handles individual function execution with full output capture (stdout, stderr, logs)
- **Class Executor** (`src/class_executor.py:14`): Manages class instantiation and method execution with instance persistence and metadata tracking

-### 2. HuggingFace Model Cache-Ahead (`src/huggingface_cache.py`)
-- **Model Pre-Caching**: Downloads HuggingFace models before user code execution
-- **Cache Validation**: Checks if models are already cached to avoid redundant downloads
-- **Authentication**: Supports HF_TOKEN for private/gated model access
-- **Transfer Acceleration**: Uses hf_transfer when HF_HUB_ENABLE_HF_TRANSFER=1 is set
-- **Transparent Caching**: User code references models without knowing they're pre-cached
-
-### 3. Dependency Management System (`src/dependency_installer.py:14`)
+### 2. Dependency Management System (`src/dependency_installer.py:14`)
- **Python Package Installation**: UV-based package management with environment-aware configuration (Docker vs local)
- **System Package Installation**: APT/Nala-based system dependency handling with acceleration support
- **Differential Installation**: Optimized package installation that skips already-installed packages
- **Environment Detection**: Automatic Docker vs local environment detection for appropriate installation methods
- **System Package Filtering**: Intelligent detection of system-available packages to avoid redundant installation
- **Universal Subprocess Integration**: All subprocess operations use centralized logging utility

-### 4. Universal Subprocess Utility (`src/subprocess_utils.py`)
+### 3. Universal Subprocess Utility (`src/subprocess_utils.py`)
- **Centralized Subprocess Operations**: All subprocess calls use `run_logged_subprocess` for consistency
- **Automatic Logging Integration**: All subprocess output flows through log streamer at DEBUG level
- **Environment-Aware Execution**: Handles Docker vs local environment differences automatically
- **Standardized Error Handling**: Consistent FunctionResponse pattern for all subprocess operations
- **Timeout Management**: Configurable timeouts with proper cleanup on timeout/cancellation

-### 5. Serialization & Protocol Management
+### 4. Serialization & Protocol Management
- **Protocol Definitions** (`src/remote_execution.py:13`): Pydantic models for request/response with validation
- **Serialization Utils** (`src/serialization_utils.py`): CloudPickle-based data serialization for function arguments and results
- **Base Executor** (`src/base_executor.py`): Common execution interface and environment setup

-### 6. Tetra SDK Integration (`tetra-rp/` submodule)
+### 5. Tetra SDK Integration (`tetra-rp/` submodule)
- **Client Interface**: `@remote` decorator for marking functions for remote execution
- **Resource Management**: GPU/CPU configuration and provisioning through LiveServerless objects
- **Live Serverless**: Dynamic infrastructure provisioning with auto-scaling
- **Protocol Buffers**: Communication protocol definitions for distributed execution

-### 7. Testing Infrastructure (`tests/`)
+### 6. Testing Infrastructure (`tests/`)
- **Unit Tests** (`tests/unit/`): Component-level testing for individual modules with mocking
- **Integration Tests** (`tests/integration/`): End-to-end workflow testing with real execution
- **Test Fixtures** (`tests/conftest.py:1`): Shared test data, mock objects, and utility functions
- **Handler Testing**: Local execution validation with JSON test files (`src/tests/`)
- **Full Coverage**: All handler tests pass with environment-aware dependency installation
- **Cross-Platform**: Works correctly in both Docker containers and local macOS/Linux environments

-### 8. Build & Deployment Pipeline
+### 7. Build & Deployment Pipeline
- **Docker Containerization**: GPU (`Dockerfile`) and CPU (`Dockerfile-cpu`) image builds
- **CI/CD Pipeline**: Automated testing, linting, and releases (`.github/workflows/`)
- **Quality Gates** (`Makefile:104`): Format checking, type checking, test coverage requirements
- **Release Management**: Automated semantic versioning and Docker Hub deployment

-### 9. Configuration & Constants
+### 8. Configuration & Constants
- **Constants** (`src/constants.py`): System-wide configuration values (NAMESPACE, LARGE_SYSTEM_PACKAGES)
-- **Environment Configuration**: RunPod API integration and HuggingFace cache settings
+- **Environment Configuration**: RunPod API integration

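The Universal Subprocess Utility bullets above describe a single entry point that logs output at DEBUG level, enforces a timeout, and reports failures through a response object rather than exceptions. A minimal sketch of that pattern follows, assuming nothing about the real `run_logged_subprocess` signature: `run_logged` and `Result` are hypothetical names, and the `success`/`stdout`/`error` fields only mirror how `FunctionResponse` is used elsewhere in this diff.

```python
import logging
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional

log = logging.getLogger("sketch")


@dataclass
class Result:
    """Hypothetical stand-in for the FunctionResponse pattern."""
    success: bool
    stdout: str = ""
    error: Optional[str] = None


def run_logged(cmd: List[str], timeout: float = 30.0) -> Result:
    """Run cmd, log its output at DEBUG, and always return a Result."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return Result(success=False, error=f"timed out after {timeout}s")
    except OSError as exc:
        # Raised when the executable is missing or cannot be started.
        return Result(success=False, error=str(exc))

    for line in proc.stdout.splitlines():
        log.debug(line)
    if proc.returncode != 0:
        message = proc.stderr.strip() or f"exit code {proc.returncode}"
        return Result(success=False, stdout=proc.stdout, error=message)
    return Result(success=True, stdout=proc.stdout)


if __name__ == "__main__":
    print(run_logged([sys.executable, "-c", "print('hello')"]))
```

Returning a result object on every path lets callers check one `success` flag instead of wrapping each call site in its own `try`/`except`.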
## Architecture

@@ -90,13 +83,12 @@ This is `worker-tetra`, a RunPod Serverless worker template that provides dynami
### Key Patterns

1. **Remote Function Execution**: Functions decorated with `@remote` are automatically executed on RunPod GPU workers
-2. **Composition Pattern**: RemoteExecutor uses specialized components (DependencyInstaller, HuggingFaceCacheAhead, Executors)
+2. **Composition Pattern**: RemoteExecutor uses specialized components (DependencyInstaller, Executors)
3. **Dynamic Dependency Management**: Dependencies specified in decorators are installed at runtime with differential updates
-4. **HuggingFace Cache-Ahead**: Models specified in `hf_models_to_cache` are pre-downloaded before execution
-5. **Universal Subprocess Operations**: All subprocess calls use centralized `run_logged_subprocess` for consistent logging and error handling
-6. **Environment-Aware Configuration**: Automatic Docker vs local environment detection for appropriate installation methods
-7. **Serialization**: Uses cloudpickle + base64 encoding for function arguments and results
-8. **Resource Configuration**: `LiveServerless` objects define GPU requirements, scaling, and worker configuration
+4. **Universal Subprocess Operations**: All subprocess calls use centralized `run_logged_subprocess` for consistent logging and error handling
+5. **Environment-Aware Configuration**: Automatic Docker vs local environment detection for appropriate installation methods
+6. **Serialization**: Uses cloudpickle + base64 encoding for function arguments and results
+7. **Resource Configuration**: `LiveServerless` objects define GPU requirements, scaling, and worker configuration

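The serialization pattern listed above (pickle the value, then base64-encode the bytes so they travel as text) can be sketched as follows. This is a minimal illustration that uses the standard library `pickle` as a stand-in for `cloudpickle`, whose `dumps` output is likewise read back with `pickle.loads`; `encode` and `decode` are hypothetical helper names, not functions from `src/serialization_utils.py`.

```python
import base64
import pickle
from typing import Any


def encode(value: Any) -> str:
    """Pickle a value and wrap the bytes in base64 so they fit in a JSON string."""
    return base64.b64encode(pickle.dumps(value)).decode("ascii")


def decode(payload: str) -> Any:
    """Reverse encode(): base64 text back to bytes, then unpickle."""
    return pickle.loads(base64.b64decode(payload))


if __name__ == "__main__":
    args = [encode(a) for a in (3, {"scale": 2.5})]
    print(args[0])          # base64 text, safe to embed in a request body
    print(decode(args[1]))  # the original dict
```

Unpickling executes code chosen by whoever produced the payload, so a scheme like this is only appropriate between parties that already trust each other.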
## Development Commands

@@ -148,8 +140,6 @@ git submodule update --remote --rebase # Update tetra-rp to latest
- `HF_TOKEN`: Optional authentication token for private/gated HuggingFace models
- `HF_HOME=/hf-cache`: HuggingFace cache location, set outside `/root/.cache` to exclude from volume sync
- `DEBIAN_FRONTEND=noninteractive`: Set during system package installation
-- `UV_CACHE_DIR`: Package cache configuration
-- `VIRTUAL_ENV`: Virtual environment path configuration

### Resource Configuration
Configure GPU resources using `LiveServerless` objects:
@@ -205,7 +195,6 @@ gpu_config = LiveServerless(
│ ├── function_executor.py # Function execution with output capture
│ ├── class_executor.py # Class execution with persistence
│ ├── dependency_installer.py # Python and system dependency management
-│ ├── huggingface_cache.py # HuggingFace model cache-ahead system
│ ├── serialization_utils.py # CloudPickle serialization utilities
│ ├── base_executor.py # Common execution interface
│ ├── constants.py # System-wide configuration constants
8 changes: 7 additions & 1 deletion Makefile
@@ -24,6 +24,7 @@ update: # Upgrade all dependencies
 	uv sync --upgrade --all-groups
 	uv lock --upgrade
 	git submodule update --remote
+	make protocols

 clean: # Remove build artifacts and cache files
 	rm -rf dist build *.egg-info
@@ -32,7 +33,12 @@ clean: # Remove build artifacts and cache files
 	find . -type f -name "*.pkl" -delete

 setup: dev # Initialize project, sync deps, update submodules
-	git submodule update --init --recursive
+	@if [ ! -f "tetra-rp/.git" ]; then \
+		git submodule update --init --recursive; \
+	fi
+	make protocols
+
+protocols: # Copy remote_execution protocol from submodule
 	cp tetra-rp/src/tetra_rp/protos/remote_execution.py src/

 build: # Build both GPU and CPU Docker images
4 changes: 2 additions & 2 deletions docs/Endpoint Persistence.md
@@ -14,7 +14,7 @@

- First container boots, and checks for volume presence and endpoint workspace. Create if not found.

-1. Container will proceed to download any system, python or HF pre-cache instructed from the remote decorator.
+1. Container will proceed to download any system or python dependencies in parallel as instructed from the remote decorator.

2. Container runs its job.

@@ -45,7 +45,7 @@ graph TD
H -->|No| J[Launch CDR Daemon<br/>Hydrate /app ← Workspace<br/>Then Monitor /app →
Workspace]

-G --> K[Download Dependencies<br/>System + Python + HF]
+G --> K[Download Dependencies<br/>System + Python]
I --> K
J --> L[Skip Downloads<br/>Use Cached Data]

126 changes: 0 additions & 126 deletions src/huggingface_cache.py

This file was deleted.

24 changes: 1 addition & 23 deletions src/remote_executor.py
@@ -1,7 +1,6 @@
 import logging
 import asyncio
 from typing import List, Any
-from huggingface_cache import HuggingFaceCacheAhead
 from remote_execution import FunctionRequest, FunctionResponse, RemoteExecutorStub
 from dependency_installer import DependencyInstaller
 from function_executor import FunctionExecutor
@@ -25,7 +24,6 @@ def __init__(self):
         self.dependency_installer = DependencyInstaller()
         self.function_executor = FunctionExecutor()
         self.class_executor = ClassExecutor()
-        self.hf_cache = HuggingFaceCacheAhead()
         self.cache_sync = CacheSyncManager()

     async def ExecuteFunction(self, request: FunctionRequest) -> FunctionResponse:
async def ExecuteFunction(self, request: FunctionRequest) -> FunctionResponse:
@@ -54,11 +52,7 @@ async def ExecuteFunction(self, request: FunctionRequest) -> FunctionResponse:

         try:
             # Hydrate cache from volume if needed (before any installations)
-            has_installations = (
-                request.dependencies
-                or request.system_dependencies
-                or request.hf_models_to_cache
-            )
+            has_installations = request.dependencies or request.system_dependencies
             if has_installations:
                 await self.cache_sync.hydrate_from_volume()

@@ -148,13 +142,6 @@ async def _install_dependencies_parallel(
             tasks.append(task)
             task_names.append("python_dependencies")

-        # Add HF model cache-ahead tasks
-        if request.hf_models_to_cache:
-            for model_id in request.hf_models_to_cache:
-                task = self.hf_cache.cache_model_download_async(model_id)
-                tasks.append(task)
-                task_names.append(f"hf_model_{model_id}")
-
         if not tasks:
             return FunctionResponse(success=True, stdout="No dependencies to install")

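With the HuggingFace tasks removed, `_install_dependencies_parallel` is left collecting at most two awaitables alongside their names. One way such a task list might be awaited is sketched below, assuming `asyncio.gather` is used (the diff does not show that part of the method). The installer coroutines are hypothetical placeholders and the `system_dependencies` label is an assumed name; only `python_dependencies` appears in the hunk above.

```python
import asyncio
from typing import Awaitable, Dict, List


async def install_system(packages: List[str]) -> str:
    # Hypothetical installer: stands in for the system package step.
    await asyncio.sleep(0.01)
    return f"system ok: {', '.join(packages)}"


async def install_python(packages: List[str]) -> str:
    # Hypothetical installer that fails, to show per-task error reporting.
    await asyncio.sleep(0.01)
    raise RuntimeError(f"could not resolve {packages[0]}")


async def install_parallel(system: List[str], python: List[str]) -> Dict[str, str]:
    tasks: List[Awaitable[str]] = []
    task_names: List[str] = []
    if system:
        tasks.append(install_system(system))
        task_names.append("system_dependencies")  # assumed label
    if python:
        tasks.append(install_python(python))
        task_names.append("python_dependencies")
    if not tasks:
        return {}

    # return_exceptions=True returns failures as values instead of raising
    # the first one; results come back in the same order as `tasks`.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    report: Dict[str, str] = {}
    for name, result in zip(task_names, results):
        if isinstance(result, BaseException):
            report[name] = f"failed: {result}"
        else:
            report[name] = result
    return report


if __name__ == "__main__":
    print(asyncio.run(install_parallel(["ffmpeg"], ["numpy"])))
```

Keeping `task_names` parallel to `tasks` is what allows each outcome to be attributed to a specific install step once the results are gathered.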
@@ -189,15 +176,6 @@ async def _install_dependencies_sequential(
                 return sys_installed
             self.logger.info(sys_installed.stdout)

-        # Cache-ahead HuggingFace models if requested (should not happen when acceleration disabled)
-        if request.accelerate_downloads and request.hf_models_to_cache:
-            for model_id in request.hf_models_to_cache:
-                cache_result = self.hf_cache.cache_model_download(model_id)
-                if cache_result.success:
-                    self.logger.info(cache_result.stdout)
-                else:
-                    self.logger.warning(cache_result.error)
-
         # Install Python dependencies next
         if request.dependencies:
             py_installed = self.dependency_installer.install_dependencies(
11 changes: 0 additions & 11 deletions src/tests/test_hf_accelerated_input.json

This file was deleted.
