fix: release 1.0.3 - security hardening and quality improvements by Romain-Grosos · Pull Request #20 · Rwx-G/Buncker

Romain-Grosos · 2026-03-12T19:09:03Z

Summary

- Security hardening: narrowed exception types (crypto, tar extraction), SHA256 pre-verification, TOCTOU fix on analysis cache, OCI per-IP rate limiting (200 req/min)
- Reliability: graceful shutdown with ThreadPoolExecutor drain, socket timeout (60s) for slowloris mitigation, GC pre-delete audit log
- Code quality: dead code cleanup, deduplicated resolver logic (-52 lines), removed duplicate imports, config validation for gc/transfer_path/unknown keys
- Tests: 12 new tests (OCI rate limit, Content-Length, blob streaming 5 MiB, slowloris, config validation, log limits)
- Docs: CHANGELOG, README install examples, RPM specs updated to 1.0.3

Test plan

- 597 tests pass, 9 skipped (platform-dependent)
- ruff check . clean
- ruff format --check . clean
- All security fixes covered by dedicated tests

_last_gc_report was read/written without synchronization across concurrent request threads. Add _gc_lock to protect gc_report() writes and gc_execute() reads, consistent with _analysis_lock pattern in BunckerServer.
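The locking pattern described above can be sketched as follows. This is a minimal illustration, not the project's code: the `Store` class body, the report shape, and the choice to clear the report after execution are assumptions; only the names `_gc_lock`, `_last_gc_report`, `gc_report()` and `gc_execute()` come from the commit message.

```python
import threading


class Store:
    """Hypothetical stand-in for the real store; only the locking matters."""

    def __init__(self):
        self._gc_lock = threading.Lock()
        self._last_gc_report = None

    def gc_report(self, candidates):
        # Build the report outside the lock, publish it under the lock.
        report = {"candidates": sorted(candidates)}
        with self._gc_lock:
            self._last_gc_report = report
        return report

    def gc_execute(self):
        # Take a consistent snapshot under the lock, then work on the copy.
        # Clearing the report here is an assumption made for this sketch.
        with self._gc_lock:
            report = self._last_gc_report
            self._last_gc_report = None
        if report is None:
            raise RuntimeError("no GC report available; run gc_report() first")
        return list(report["candidates"])
```

Holding the lock only around the read and the write keeps the critical section short, so slow work such as deleting blobs never blocks other request threads.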

Switch from BaseHTTPRequestHandler to a WSGI-based architecture using waitress for production-grade HTTP serving. Waitress provides proper connection management, request parsing, and thread pooling out of the box. For TLS mode, fall back to stdlib WSGIServer since waitress does not support SSL in its async I/O model.

- Convert BunckerHandler to standalone WSGI-compatible class
- Add _WSGIHeaders adapter for environ-based header access
- Add _ResponseWriter for response body buffering
- Add create_wsgi_app() factory function
- Use waitress.create_server() for non-TLS (common case)
- Use ThreadingMixIn + WSGIServer for TLS fallback
- Fix reserved LogRecord attribute conflict (filename -> manifest_filename)
- Fix oversized body test to work with waitress request parsing
- Use case-insensitive HTTPMessage headers in test helpers
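An environ-based header adapter of the kind mentioned above might look roughly like this. The class name `WSGIHeaders` and its `get()` method are illustrative assumptions, not the actual `_WSGIHeaders` implementation; the key mapping itself follows the WSGI convention (PEP 3333), where request headers appear as `HTTP_*` keys and `Content-Type` / `Content-Length` appear without the prefix.

```python
class WSGIHeaders:
    """Case-insensitive, read-only view of request headers in a WSGI environ."""

    # These two headers are stored in environ without the HTTP_ prefix.
    _UNPREFIXED = {"CONTENT_TYPE", "CONTENT_LENGTH"}

    def __init__(self, environ):
        self._environ = environ

    @classmethod
    def _key(cls, name):
        key = name.upper().replace("-", "_")
        return key if key in cls._UNPREFIXED else "HTTP_" + key

    def get(self, name, default=None):
        return self._environ.get(self._key(name), default)
```

With such an adapter, handler code written against `self.headers.get("Content-Length")` can keep the same call shape after the move to WSGI.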

Use OCIPlatform from shared.oci to properly parse platform strings with os/arch/variant format (e.g. linux/arm/v7). Previously only os/arch was parsed, silently ignoring the variant component.
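To illustrate the difference the variant makes, here is a small sketch of three-part platform parsing and matching. The `Platform` dataclass and both functions are hypothetical stand-ins, not the `OCIPlatform` API from `shared.oci`; the `os`, `architecture` and `variant` field names follow the platform object of the OCI image index.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Platform:
    os: str
    architecture: str
    variant: Optional[str] = None


def parse_platform(value):
    """Parse 'os/arch' or 'os/arch/variant' into a Platform."""
    parts = value.split("/")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"invalid platform string: {value!r}")
    return Platform(*parts)


def platform_matches(entry, wanted):
    """Compare an index entry's 'platform' dict against the wanted platform."""
    if entry.get("os") != wanted.os:
        return False
    if entry.get("architecture") != wanted.architecture:
        return False
    # Only constrain the variant when the caller asked for one.
    return wanted.variant is None or entry.get("variant") == wanted.variant
```

Without the variant check, `linux/arm/v7` would match the first `linux/arm` entry in the index, which may be a `v6` image.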

Replace waitress with stdlib WSGIServer for both TLS and non-TLS modes. Waitress cannot serve TLS (async I/O incompatible with SSL), making it unsuitable for the LAN use case that actually needs a production-grade server.

- Stream blob responses via WSGI iterator (no full memory buffering)
- Increase chunk size from 64 KiB to 1 MiB for better throughput
- Enable TCP_NODELAY on accepted connections
- Set TCP backlog to 32 for predictable queuing
- Remove python3-waitress dependency
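Streaming through a WSGI iterator can be sketched as below. The `make_blob_app` factory and `iter_file` helper are hypothetical names for illustration; the real handler presumably resolves the blob path from the request and verifies the digest first, which this sketch leaves out.

```python
import os

CHUNK_SIZE = 1024 * 1024  # 1 MiB, the chunk size named in the commit message


def iter_file(path, chunk_size=CHUNK_SIZE):
    """Yield the file in fixed-size chunks so it is never fully in memory."""
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            yield chunk


def make_blob_app(path, chunk_size=CHUNK_SIZE):
    """Return a WSGI app that serves a single file as a streamed response."""

    def app(environ, start_response):
        headers = [
            ("Content-Type", "application/octet-stream"),
            ("Content-Length", str(os.path.getsize(path))),
        ]
        start_response("200 OK", headers)
        # The server pulls chunks from this generator one at a time.
        return iter_file(path, chunk_size)

    return app
```

Because the response body is a generator, peak memory per request stays near one chunk regardless of blob size.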

- Fix FD leak in Store.import_blob error path
- Validate Content-Length as integer in 3 handler locations
- Bounds-check log limit parameter (0-10000)
- Reject symlinks/hardlinks in tar extraction (Python <3.12)
- Handle HMAC decode errors as TransferError
- SHA256 pre-verification before streaming blob response
- Fix TOCTOU on analysis cache (lock covers id comparison)
- Add per-IP OCI rate limiting (200 req/min sliding window)
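A per-IP sliding-window limiter along the lines of the last item could be sketched like this. The class name, the injectable `clock` parameter, and the internal data structures are assumptions made for illustration; only the 200 requests per minute figure comes from the commit message.

```python
import threading
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within the last `window` seconds."""

    def __init__(self, limit=200, window=60.0, clock=time.monotonic):
        self._limit = limit
        self._window = window
        self._clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)
        self._lock = threading.Lock()

    def allow(self, client_ip):
        now = self._clock()
        with self._lock:
            hits = self._hits[client_ip]
            # Drop timestamps that have slid out of the window.
            while hits and now - hits[0] >= self._window:
                hits.popleft()
            if len(hits) >= self._limit:
                return False
            hits.append(now)
            return True
```

A handler would call `allow(remote_addr)` before serving a manifest or blob and answer 429 when it returns `False`. The sketch never evicts idle clients from `_hits`; a long-running server would want some form of cleanup.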

- Update install examples from 1.0.1 to 1.0.3
- Document OCI rate limiting (200 req/min) in security section
- Bump RPM spec versions from 1.0.1 to 1.0.3

- crypto.py: catch (ValueError, InvalidTag) instead of bare Exception
- transfer.py: catch CryptoError instead of bare Exception on decrypt
- transfer.py: match tarfile exception class names instead of string matching on error messages for Python 3.12+ filter errors

- server: pool.shutdown(wait=True, cancel_futures=True) drains in-flight requests before closing; join timeout raised to 5s
- store: gc_pre_delete audit log emitted before any blob deletion with full digest list for forensic recovery
- store: use contextlib.suppress for OSError in FD cleanup
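The effect of `shutdown(wait=True, cancel_futures=True)` can be seen in a small standalone demo. Everything here except the `ThreadPoolExecutor.shutdown` call itself is illustrative scaffolding (the function name, the single-worker pool, the timer); `cancel_futures` requires Python 3.9 or later.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def shutdown_demo():
    """Show that running work finishes while queued work is cancelled."""
    pool = ThreadPoolExecutor(max_workers=1)
    started = threading.Event()
    release = threading.Event()

    def in_flight_request():
        started.set()
        release.wait(timeout=5)
        return "done"

    running = pool.submit(in_flight_request)
    started.wait(timeout=5)  # the single worker is now busy
    queued = pool.submit(lambda: "never runs")  # stays in the queue

    # Let the in-flight task finish shortly after shutdown has begun.
    threading.Timer(0.2, release.set).start()
    # Cancels queued futures first, then blocks until the worker exits.
    pool.shutdown(wait=True, cancel_futures=True)
    return running.result(), queued.cancelled()
```

The in-flight task completes normally and the queued one is cancelled, which is the "drain what is running, drop what has not started" behaviour the commit describes.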

- Validate gc.inactive_days_threshold >= 1
- Validate transfer_path is a string when set
- Warn on unknown config keys to catch typos
- Add tests for all new validation rules
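The three validation rules could be expressed roughly as follows. The function name, the `KNOWN_KEYS` set, the logger name, and the choice to raise `ValueError` are assumptions for this sketch; the real config schema has more keys than the two named in the commit message.

```python
import logging

logger = logging.getLogger("config_sketch")  # hypothetical logger name

# Hypothetical: only the keys mentioned in the commit message.
KNOWN_KEYS = {"gc", "transfer_path"}


def validate_config(config):
    """Raise ValueError on invalid values; warn about and return unknown keys."""
    gc = config.get("gc", {})
    if not isinstance(gc, dict):
        raise ValueError("gc must be a mapping")
    threshold = gc.get("inactive_days_threshold")
    if threshold is not None:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(threshold, bool) or not isinstance(threshold, int) or threshold < 1:
            raise ValueError("gc.inactive_days_threshold must be an integer >= 1")

    transfer_path = config.get("transfer_path")
    if transfer_path is not None and not isinstance(transfer_path, str):
        raise ValueError("transfer_path must be a string when set")

    unknown = sorted(set(config) - KNOWN_KEYS)
    for key in unknown:
        logger.warning("unknown config key: %s", key)
    return unknown
```

Unknown keys only produce a warning rather than an error, so a typo such as `trasnfer_path` is surfaced without breaking existing deployments.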

- OCI rate limiter returns 429 with TOOMANYREQUESTS on manifests/blobs
- OCI rate limiter does not affect /v2/ root endpoint
- Non-integer Content-Length returns 400 instead of 500
- 4 MiB blob streaming test verifies multi-chunk transfer
- Log limit bounds tests: negative and excessive values return 400
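The input checks that the Content-Length and log-limit tests exercise might be implemented along these lines. `BadRequest`, both function names, and the default limit of 100 are hypothetical; the 0-10000 bounds and the 400 status come from the commit messages.

```python
class BadRequest(Exception):
    """Hypothetical error type that a handler would map to HTTP 400."""

    status = 400


def parse_content_length(environ):
    """Return Content-Length as an int, treating a missing header as 0."""
    raw = environ.get("CONTENT_LENGTH") or "0"
    # Accept plain ASCII digits only; int() alone would also accept "+5" or "1_0".
    if not (raw.isascii() and raw.isdigit()):
        raise BadRequest("Content-Length must be a non-negative integer")
    return int(raw)


def parse_log_limit(raw, default=100, maximum=10000):
    """Return the log limit, rejecting non-integers and out-of-range values."""
    if raw is None:
        return default
    try:
        limit = int(raw)
    except ValueError:
        raise BadRequest("limit must be an integer") from None
    if not 0 <= limit <= maximum:
        raise BadRequest(f"limit must be between 0 and {maximum}")
    return limit
```

Raising a dedicated error instead of letting `int()` fail inside the handler is what turns the previous 500 into a 400.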

- Add graceful shutdown, GC audit log, narrowed exceptions
- Add config validation improvements
- Document all security and quality enhancements

- Log server_stopping with pending worker count before shutdown
- Move socket timeout=60 to _QuietWSGIHandler (was dead code on BunckerHandler after WSGI refactor)
- Fix CryptoError double-wrap in decrypt_env_value (use from None)
- Add slowloris timeout test (2s patched, validates connection close and server recovery)
- Add 5 MiB blob streaming test (multi-chunk SHA256 integrity)

- __main__.py: remove duplicate import contextlib in _cmd_setup
- __main__.py: remove duplicate import shutil in _cmd_api_setup
- resolver.py: resolve_dockerfile now delegates to _resolve_image_blobs instead of duplicating 60 lines of identical blob resolution logic

Romain-Grosos added 19 commits March 12, 2026 14:42

fix(store): add threading lock for GC report state

270af5d

fix(fetch): support multi-arch platform variant in manifest resolution

a9497a4

chore: bump version to 1.0.3

0aaf870

style: fix lint errors (line length, contextlib.suppress)

f058d23

fix(test): add python3-waitress to deb-install test Dockerfile

34d04e1

docs: update README and RPM specs for v1.0.3

93a5eab

fix(config): add gc threshold, transfer_path and unknown key validation

9556336

docs: update changelog with all v1.0.3 hardening changes

4857b5c

docs: update changelog with socket timeout and crypto fixes

95dc949

docs: add resolver refactor and shutil fix to changelog

a3d5d62

chore: apply ruff format to 5 files

cfefa9a

Romain-Grosos added this to the v1.0.3 milestone Mar 12, 2026

Romain-Grosos self-assigned this Mar 12, 2026

Romain-Grosos merged commit f89d608 into main Mar 12, 2026
7 checks passed

Romain-Grosos deleted the fix/release-1.0.3 branch March 12, 2026 19:19

Romain-Grosos merged 19 commits into main from fix/release-1.0.3
