Add double-flush reset-dictionary feature. by BrianPugh · Pull Request #300 · BrianPugh/tamp

BrianPugh · 2026-03-25T16:27:06Z

Summary

- Add dictionary reset via double-FLUSH token sequence, enabling append-to-stream without retaining prior compressor state. Both sides re-initialize the window and continue with a fresh dictionary.
- Uses the previously reserved more_header header bit to signal dictionary-reset-capable streams. Old decompressors (<2.1.0) reject these streams at the header, preventing silent corruption.
- Add conf.append mode: writes a FLUSH instead of a header, allowing a new compressor to resume an existing stream after reboot.
- Implemented across all bindings: C, Python, Cython, MicroPython, and JavaScript/WASM.

Protocol

- more_header (header byte 1, bit 0) now implies dictionary_reset
- Header byte 2 added (all bits reserved, must be zero)
- Two consecutive FLUSH tokens signal dictionary re-initialization
- FLUSH is always emitted (even when byte-aligned) in more_header streams to support append detection
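
The header rules above can be sketched as a small check. This is only an illustration under stated assumptions: `parse_header` and `MORE_HEADER_BIT` are hypothetical names, not part of the tamp API, and only the bits described in this PR are interpreted (the remaining bits of header byte 1 are ignored here).

```python
from typing import Tuple

MORE_HEADER_BIT = 0x01  # header byte 1, bit 0, per the Protocol notes


def parse_header(data: bytes) -> Tuple[bool, int]:
    """Return (dictionary_reset, header_length) for a stream prefix.

    Hypothetical helper: it looks only at more_header and the reserved
    second byte, and leaves every other header field to the real parser.
    """
    if len(data) < 1:
        raise ValueError("need at least 1 header byte")
    more_header = bool(data[0] & MORE_HEADER_BIT)
    if not more_header:
        return False, 1  # classic 1-byte header, no dictionary resets
    if len(data) < 2:
        raise ValueError("more_header set but header byte 2 is missing")
    if data[1] != 0:
        raise ValueError("header byte 2 is reserved and must be zero")
    return True, 2  # more_header implies dictionary_reset
```

Rejecting a non-zero second byte mirrors the "must be zero" rule, so a later extension that assigns meaning to those bits would fail loudly in a parser like this rather than being misread.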

API

- TampConf.dictionary_reset / TampConf.append config fields
- tamp_compressor_reset_dictionary() (C) / Compressor.reset_dictionary() (Python/JS)
- Decompressor automatically handles double-FLUSH when more_header is set
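
To make the double-FLUSH behaviour concrete, here is a toy model in Python. It is not the tamp decompressor: `ToyDecoder` and the `FLUSH` sentinel are hypothetical, tokens are plain integers rather than bit patterns, and the choice that a third FLUSH starts a new pair is an assumption of this sketch.

```python
from typing import Iterable, List

FLUSH = object()  # stand-in for the FLUSH token, not the real bit pattern


class ToyDecoder:
    """Toy model: keeps a window and resets it on two consecutive FLUSH tokens."""

    def __init__(self, dictionary_reset: bool, window_size: int = 8) -> None:
        self.dictionary_reset = dictionary_reset  # mirrors more_header being set
        self.window_size = window_size
        self.window: List[int] = []
        self.resets = 0
        self._prev_was_flush = False

    def feed(self, tokens: Iterable[object]) -> List[int]:
        out: List[int] = []
        for tok in tokens:
            if tok is FLUSH:
                if self.dictionary_reset and self._prev_was_flush:
                    self.window = []  # continue with a fresh dictionary
                    self.resets += 1
                    self._prev_was_flush = False  # assumed: next FLUSH starts a new pair
                else:
                    self._prev_was_flush = True
                continue
            self._prev_was_flush = False
            out.append(tok)
            # Keep only the most recent window_size literals.
            self.window = (self.window + [tok])[-self.window_size:]
        return out
```

Under this model, append mode falls out naturally: the trailing FLUSH that closes the first segment, followed by the FLUSH an appending compressor writes in place of a header, forms the pair that triggers the reset.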

Copilot

Pull request overview

This PR introduces a “double-FLUSH” dictionary reset mechanism (and an append mode built on it) so new compressed segments can be appended to an existing stream without persisting compressor state, with seamless handling by updated decompressors across implementations.

Changes:

- Add dictionary_reset + append configuration flags, plus compressor-side APIs to emit the double-FLUSH reset signal and reinitialize state.
- Update decompressors (C + Python + WASM + Cython) to support 2-byte headers (more_header + reserved byte) and to detect double-FLUSH for dictionary reinit.
- Add extensive cross-implementation test coverage and update the specification + language/library docs.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| `wasm/src/tamp.js` | Adds WASM wrapper support for `dictionary_reset`/`append`, implements `resetDictionary()`, and adjusts JS header parsing to allow 2-byte headers. |
| `wasm/Makefile` | Exports the new WASM symbol `tamp_compressor_reset_dictionary`. |
| `tests/test_compressor_decompressor.py` | Adds Python-level tests covering reset, multiple resets, and append mode roundtrips. |
| `tamp/decompressor.py` | Implements double-FLUSH detection and dictionary reinit in the pure-Python decompressor; supports second header byte (reserved). |
| `tamp/ctamp.pxd` | Extends Cython declarations to include `dictionary_reset`, `append`, and the reset function. |
| `tamp/compressor.py` | Implements pure-Python `reset_dictionary()`, adds `dictionary_reset`/`append`, and ensures trailing FLUSH behavior for append compatibility. |
| `tamp/_c_src/tamp/decompressor.h` | Adds internal state for incremental 2-byte header handling and double-FLUSH detection. |
| `tamp/_c_src/tamp/decompressor.c` | Implements 2-byte header reading, incremental implicit header parsing, and double-FLUSH-triggered dictionary reset. |
| `tamp/_c_src/tamp/compressor.h` | Documents and declares `tamp_compressor_reset_dictionary()` and append-mode init semantics. |
| `tamp/_c_src/tamp/compressor.c` | Implements append-mode init, flush behavior changes for dictionary_reset, and reset_dictionary logic. |
| `tamp/_c_src/tamp/common.h` | Adds `dictionary_reset` and `append` fields to `TampConf`. |
| `tamp/_c_decompressor.pyx` | Updates Cython decompressor to read the header incrementally until complete. |
| `tamp/_c_compressor.pyx` | Adds `dictionary_reset`/`append`, implements `reset_dictionary()`, and ensures close emits trailing FLUSH when needed. |
| `mpy_bindings/bindings_compressor.py` | Exposes `dictionary_reset`/`append` and `reset_dictionary()` in the MicroPython Python-level wrapper. |
| `mpy_bindings/bindings.c` | Extends MicroPython native bindings for `dictionary_reset`/`append` and adds a `reset_dictionary` method. |
| `docs/source/specification.rst` | Defines header byte 2 as reserved and specifies double-FLUSH dictionary reset semantics. |
| `docs/source/javascript.rst` | Documents JS usage of mid-stream dictionary reset. |
| `docs/source/c_library.rst` | Documents the C API for dictionary reset and append mode, and updated FLUSH behavior. |
| `ctests/test_runner.c` | Registers new C tests for dictionary reset behavior. |
| `ctests/test_compressor.c` | Adds C tests for reset roundtrip, required flag behavior, and small-buffer retry scenarios. |

Copilot

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Copilot · 2026-03-25T19:30:07Z

            const pollResult = this.module.ccall(
              'tamp_compressor_poll',
              'number',
              ['number', 'number', 'number', 'number'],
              [this.compressorPtr, outputPtr + chunkOutputWritten, CHUNK_SIZE - chunkOutputWritten, pollOutputSizePtr]
            );

-            if (pollResult !== 0) {
+            if (pollResult !== TAMP_OK) {
              throwOnError(pollResult, 'Compression poll');
            }


tamp_compressor_poll can return positive non-OK statuses like TAMP_OUTPUT_FULL (see C API), but this branch only calls throwOnError, which ignores non-negative values. If the output buffer has <3 bytes remaining, poll may return TAMP_OUTPUT_FULL and this loop will continue without making progress (risking an infinite loop). Handle TAMP_OUTPUT_FULL explicitly (e.g., stop polling when remaining output space is too small, flush the current output chunk, and continue with a fresh buffer).
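
The suggested handling can be sketched in Python with a stand-in for the poll call. Everything here is hypothetical: `drain`, `poll`, and `pending` are illustrative names, `poll(space)` is assumed to return `(status, bytes_written)`, and the value `1` for `TAMP_OUTPUT_FULL` is an assumption (the comment above only says the status is positive).

```python
from typing import Callable, List, Tuple

TAMP_OK = 0
TAMP_OUTPUT_FULL = 1  # assumed value; only known to be positive


def drain(
    poll: Callable[[int], Tuple[int, bytes]],
    pending: Callable[[], bool],
    chunk_size: int,
) -> List[bytes]:
    """Poll until nothing is pending, starting a fresh chunk on TAMP_OUTPUT_FULL."""
    chunks: List[bytes] = []
    current = bytearray()
    while pending():
        status, written = poll(chunk_size - len(current))
        current += written
        if status == TAMP_OUTPUT_FULL:
            if not current:
                # Even an empty chunk is too small: stop instead of spinning.
                raise RuntimeError("chunk_size too small to make progress")
            chunks.append(bytes(current))  # hand off the full chunk
            current = bytearray()          # continue with a fresh buffer
        elif status < 0:
            raise RuntimeError(f"poll failed with status {status}")
    if current:
        chunks.append(bytes(current))
    return chunks
```

The explicit empty-chunk check is what rules out the infinite loop: every TAMP_OUTPUT_FULL either frees space by flushing the current chunk or raises.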

Copilot

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 3 comments.

Copilot

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

wasm/src/tamp.js:14

Only TAMP_OK is defined as a named status code, but later code still uses magic numbers for other non-error statuses (e.g. result === 2 for TAMP_INPUT_EXHAUSTED). Consider defining TAMP_OUTPUT_FULL and TAMP_INPUT_EXHAUSTED alongside TAMP_OK and updating comparisons to use them for consistency and readability.

// Status codes from C library
const TAMP_OK = 0;
const TAMP_ERROR = -1;
const TAMP_EXCESS_BITS = -2;
const TAMP_INVALID_CONF = -3;
const TAMP_OOB = -4;

Copilot · 2026-03-25T23:25:55Z

+    uint8_t conf_dictionary_reset : 1;  // Stream may contain double-FLUSH dictionary resets (from header byte 1 bit [0]
+                                        // / more_header)


The conf_dictionary_reset comment is missing a closing parenthesis, which makes the header harder to read. Please fix the comment formatting (and consider keeping the entire parenthetical on one line if possible).

Suggested change

-    uint8_t conf_dictionary_reset : 1;  // Stream may contain double-FLUSH dictionary resets (from header byte 1 bit [0]
-                                        // / more_header)
+    uint8_t conf_dictionary_reset : 1;  // Stream may contain double-FLUSH dictionary resets (from header byte 1 bit [0] / more_header)

Copilot · 2026-03-25T23:25:56Z

+When ``more_header`` is set, a FLUSH is **always** emitted (even when byte-aligned)
+to support append mode (see `Dictionary Reset (Double-FLUSH)`_).
 In the worst case (1 pending bit), a FLUSH symbol (9 bits) and 6 padding bits are


This note says a FLUSH is always emitted when more_header is set, but the preceding bullet list still states “If the output is already byte-aligned, no action is performed.” Please reconcile/clarify the flush behavior so the spec is unambiguous (e.g., make the “no action” case conditional on more_header being unset).
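
One reading that would reconcile the two statements can be written as a small function. This is a sketch, not the spec: `flush_cost_bits` is a hypothetical helper, and it assumes a flush pads with zero bits up to the next byte boundary and that 0 to 7 bits are pending past the last boundary.

```python
FLUSH_BITS = 9  # FLUSH symbol width, per the spec excerpt


def flush_cost_bits(pending_bits: int, more_header: bool) -> int:
    """Bits written by a flush (FLUSH symbol plus padding) under this reading."""
    if not 0 <= pending_bits <= 7:
        raise ValueError("pending_bits must be in 0..7")
    if pending_bits == 0 and not more_header:
        return 0  # byte-aligned and more_header unset: no action
    # Pad so that pending bits + FLUSH symbol end on a byte boundary.
    padding = -(pending_bits + FLUSH_BITS) % 8
    return FLUSH_BITS + padding
```

With 1 pending bit this gives 9 + 6 = 15 bits, matching the spec's example; with `more_header` set and no pending bits it gives 9 + 7 = 16 bits, which is the case the "no action" bullet does not currently cover.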

Copilot

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated no new comments.

BrianPugh added 14 commits March 13, 2026 20:22

private tamp_compressor_flush_impl

213ad6d

Add TampConf.dictionary_reset

4055376

C decompressor double-flush implementation

54f49c9

python double-flush decompressor

520653d

C compressor double-flush

354453d

Add python Compressor.reset_dictionary

23e6401

cython double-flush

84c6e24

add double-flush to micropython bindings

b8b9f8e

tests

ebcccbd

wasm/javascript resetDictionary

176474d

tamp_compressor_init_append

5900673

use conf.append instead of a separate tamp_compressor_init_append fun…

4b114f6

…ction

Fix buffer-padding when opening compressor in append mode

7cc1cca

update docs

8161c44

BrianPugh requested a review from Copilot March 25, 2026 16:27

Copilot started reviewing on behalf of BrianPugh March 25, 2026 16:27

Copilot AI reviewed Mar 25, 2026

Comment thread tamp/_c_src/tamp/common.h Outdated

Comment thread tamp/_c_src/tamp/decompressor.h Outdated

Comment thread tamp/ctamp.pxd Outdated

Comment thread mpy_bindings/bindings.c

Comment thread wasm/src/tamp.js

Comment thread wasm/src/tamp.js

Comment thread wasm/src/tamp.js

BrianPugh and others added 2 commits March 25, 2026 13:41

Fix header-bits comments.

bf5f631

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

copilot feedback

e155c6f

BrianPugh requested a review from Copilot March 25, 2026 19:18

Copilot started reviewing on behalf of BrianPugh March 25, 2026 19:23

Copilot AI reviewed Mar 25, 2026

WASM: handle TAMP_OUTPUT_FULL from poll; add TAMP_OK constant

bfff40e

BrianPugh requested a review from Copilot March 25, 2026 20:57

Copilot started reviewing on behalf of BrianPugh March 25, 2026 20:57

Copilot AI reviewed Mar 25, 2026

Comment thread wasm/src/tamp.js

Comment thread tamp/_c_src/tamp/decompressor.c

Comment thread docs/source/javascript.rst

misc

5d4c931

BrianPugh requested a review from Copilot March 25, 2026 23:16

Copilot started reviewing on behalf of BrianPugh March 25, 2026 23:18

Copilot AI reviewed Mar 25, 2026

address PR comments

d1c4c44

BrianPugh requested a review from Copilot March 26, 2026 20:03

Copilot started reviewing on behalf of BrianPugh March 26, 2026 20:03

Copilot AI reviewed Mar 26, 2026

BrianPugh merged commit 3ba7755 into main Mar 26, 2026
86 checks passed

BrianPugh deleted the reset-dictionary branch March 26, 2026 22:12
