fix: encoder silent row drop, negative/non-decimal array lengths, trailing delimiter, colon re-search by jamison · Pull Request #302 · toon-format/toon

jamison · 2026-05-13T08:43:06Z

Summary

Five bugs found via code review of the encoder and decoder, fixed with passing tests (466 tests pass).

Bug 1 — Encoder data loss: non-primitive sub-arrays silently dropped (HIGH)

encodeArrayOfArraysAsListItemsLines in encoders.ts only emitted list items for sub-arrays that pass isArrayOfPrimitives. Sub-arrays containing objects or nested arrays were skipped silently. The header declared [N] rows but fewer were emitted, producing structurally invalid TOON that then failed on decode with a confusing count mismatch.

Fix: added an else branch that falls back to encodeMixedArrayAsListItemsLines for non-primitive sub-arrays.

Bug 2 — Negative array lengths accepted (MEDIUM)

parseBracketSegment in parser.ts checked only Number.isNaN(length). Number.parseInt("-5", 10) returns -5, not NaN, so negative lengths passed through. The decoder looped while (rowCount < -5), consuming zero rows, and assertExpectedCount(0, -5, ...) produced a confusing error.

Fix: added length < 0 to the rejection condition.

Bug 3 — Hex and float array lengths silently accepted (MEDIUM)

parseInt("0x10") → 16, parseInt("3.7") → 3. Neither is a valid TOON array length but both passed the isNaN check.

Fix: added a /^-?\d+$/ pre-check before parseInt to require a clean decimal integer string.

Bug 4 — Trailing delimiter produces spurious empty value (MEDIUM)

parseDelimitedValues("a,b,", ",") returned ["a", "b", ""] instead of ["a", "b"]. After the loop, the condition if (valueBuffer || values.length > 0) pushed an empty valueBuffer because values.length > 0 was true. In strict mode this caused a spurious field-count assertion failure; in non-strict mode it silently added an extra empty field.

Fix: changed the post-loop condition to if (valueBuffer.trim() || values.length === 0).

Bug 5 — Colon position re-derived instead of using `parseKeyToken`'s `end` (LOW-MEDIUM)

decodeKeyValueSync and decodeKeyValueAsync both discarded end from parseKeyToken and re-searched for the colon via content.indexOf(COLON, key.length). This works today because all current escape sequences expand length, so key.length never overshoots the separator. It will silently break if a future escape sequence maps a multi-character source to a shorter decoded key.

Fix: destructure end from parseKeyToken and use content.slice(end).trim() in both paths.

Test plan

pnpm --filter @toon-format/toon test — 466 passed, 0 failed
All existing tests pass without modification
Note: cases 1–4 are not covered by existing tests — conformance fixtures for these edge cases would be a worthwhile addition

…iling delimiter, and colon re-search Five bugs fixed across encoder and decoder: 1. encoders.ts - encodeArrayOfArraysAsListItemsLines silently dropped non-primitive sub-arrays. The declared header count said [N] but fewer rows were emitted, producing structurally invalid TOON output that then failed on decode with a confusing count mismatch. Fixed by falling back to encodeMixedArrayAsListItemsLines for non-primitive sub-arrays. 2. parser.ts - parseBracketSegment accepted negative array lengths because Number.parseInt("-5") is not NaN. Added length < 0 guard. 3. parser.ts - parseBracketSegment accepted hex (0x10 → 16) and float (3.7 → 3) lengths via parseInt's permissive behaviour. Added a /^-?\d+$/ pre-check to require a clean decimal integer string. 4. parser.ts - parseDelimitedValues produced a spurious empty final element on trailing delimiters (e.g. "a,b,"). The post-loop condition `values.length > 0` fired even when valueBuffer was empty, appending "". Fixed to only push when valueBuffer.trim() is non-empty or no values have been collected yet (empty-input case). 5. decoders.ts - decodeKeyValueSync and decodeKeyValueAsync discarded the `end` return value from parseKeyToken and re-searched for the colon via indexOf. This currently works by accident (escape sequences only expand length) but is fragile. Fixed to use `end` directly in both paths.

johannschopplich · 2026-05-17T16:38:04Z

This was generated by AI during triage.

Thanks for the bug hunt. Going through each in order:

Bug 1 (encoder silent row drop) – real bug. The encodeArrayOfArraysAsListItemsLines path is reachable for arrays-of-arrays-of-non-primitives (e.g. [[{a:1}]] or [[[1]]]), where the outer isArrayOfArrays gate passes but inner sub-arrays carry objects or further arrays. Please add a fixture in packages/toon/test/decode-errors.test.ts (or the appropriate encode-test file) covering that shape so the fix is locked in.

Bug 2 (negative array length) – real bug, fix is correct. Please add a fixture in packages/toon/test/decode-errors.test.ts matching the existing style.

Bug 3 (hex/float array length) – real bug, regex pre-check is correct. Same ask: one fixture per case (hex, float). The subsequent Number.isNaN(length) check becomes dead but harmless.

Bug 4 (trailing delimiter) – real bug. The fix subtly changes behavior for single-value inputs with trailing whitespace ("a, " no longer yields a trailing empty). Please add a deliberate test for that case so the intended behavior is locked in.

Bug 5 (colon re-search) – provably-equivalent refactor, not a bugfix. Happy to take it as cleanup if you reframe the title/description; calling it a fix is misleading.

Tests for all four real bugs are the gating item.

jamison mentioned this pull request May 13, 2026

fix(decoder): reject negative array lengths in parse_header toon-format/toon-python#59

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: encoder silent row drop, negative/non-decimal array lengths, trailing delimiter, colon re-search#302

fix: encoder silent row drop, negative/non-decimal array lengths, trailing delimiter, colon re-search#302
jamison wants to merge 1 commit into
toon-format:mainfrom
jamison:fix/encoder-parser-bugs

jamison commented May 13, 2026

Uh oh!

johannschopplich commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

jamison commented May 13, 2026

Summary

Bug 1 — Encoder data loss: non-primitive sub-arrays silently dropped (HIGH)

Bug 2 — Negative array lengths accepted (MEDIUM)

Bug 3 — Hex and float array lengths silently accepted (MEDIUM)

Bug 4 — Trailing delimiter produces spurious empty value (MEDIUM)

Bug 5 — Colon position re-derived instead of using parseKeyToken's end (LOW-MEDIUM)

Test plan

Uh oh!

johannschopplich commented May 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Bug 5 — Colon position re-derived instead of using `parseKeyToken`'s `end` (LOW-MEDIUM)