Skip to content

fix: encoder silent row drop, negative/non-decimal array lengths, trailing delimiter, colon re-search#302

Open
jamison wants to merge 1 commit into
toon-format:mainfrom
jamison:fix/encoder-parser-bugs
Open

fix: encoder silent row drop, negative/non-decimal array lengths, trailing delimiter, colon re-search#302
jamison wants to merge 1 commit into
toon-format:mainfrom
jamison:fix/encoder-parser-bugs

Conversation

@jamison

@jamison jamison commented May 13, 2026

Copy link
Copy Markdown

Summary

Five bugs found via code review of the encoder and decoder, fixed with passing tests (466 tests pass).


Bug 1 β€” Encoder data loss: non-primitive sub-arrays silently dropped (HIGH)

encodeArrayOfArraysAsListItemsLines in encoders.ts only emitted list items for sub-arrays that pass isArrayOfPrimitives. Sub-arrays containing objects or nested arrays were skipped silently. The header declared [N] rows but fewer were emitted, producing structurally invalid TOON that then failed on decode with a confusing count mismatch.

Fix: added an else branch that falls back to encodeMixedArrayAsListItemsLines for non-primitive sub-arrays.


Bug 2 β€” Negative array lengths accepted (MEDIUM)

parseBracketSegment in parser.ts checked only Number.isNaN(length). Number.parseInt("-5", 10) returns -5, not NaN, so negative lengths passed through. The decoder looped while (rowCount < -5), consuming zero rows, and assertExpectedCount(0, -5, ...) produced a confusing error.

Fix: added length < 0 to the rejection condition.


Bug 3 β€” Hex and float array lengths silently accepted (MEDIUM)

parseInt("0x10") β†’ 16, parseInt("3.7") β†’ 3. Neither is a valid TOON array length but both passed the isNaN check.

Fix: added a /^-?\d+$/ pre-check before parseInt to require a clean decimal integer string.


Bug 4 β€” Trailing delimiter produces spurious empty value (MEDIUM)

parseDelimitedValues("a,b,", ",") returned ["a", "b", ""] instead of ["a", "b"]. After the loop, the condition if (valueBuffer || values.length > 0) pushed an empty valueBuffer because values.length > 0 was true. In strict mode this caused a spurious field-count assertion failure; in non-strict mode it silently added an extra empty field.

Fix: changed the post-loop condition to if (valueBuffer.trim() || values.length === 0).


Bug 5 β€” Colon position re-derived instead of using parseKeyToken's end (LOW-MEDIUM)

decodeKeyValueSync and decodeKeyValueAsync both discarded end from parseKeyToken and re-searched for the colon via content.indexOf(COLON, key.length). This works today because all current escape sequences expand length, so key.length never overshoots the separator. It will silently break if a future escape sequence maps a multi-character source to a shorter decoded key.

Fix: destructure end from parseKeyToken and use content.slice(end).trim() in both paths.


Test plan

  • pnpm --filter @toon-format/toon test β€” 466 passed, 0 failed
  • All existing tests pass without modification
  • Note: cases 1–4 are not covered by existing tests β€” conformance fixtures for these edge cases would be a worthwhile addition

…iling delimiter, and colon re-search

Five bugs fixed across encoder and decoder:

1. encoders.ts - encodeArrayOfArraysAsListItemsLines silently dropped
   non-primitive sub-arrays. The declared header count said [N] but fewer
   rows were emitted, producing structurally invalid TOON output that then
   failed on decode with a confusing count mismatch. Fixed by falling back
   to encodeMixedArrayAsListItemsLines for non-primitive sub-arrays.

2. parser.ts - parseBracketSegment accepted negative array lengths because
   Number.parseInt("-5") is not NaN. Added length < 0 guard.

3. parser.ts - parseBracketSegment accepted hex (0x10 β†’ 16) and float
   (3.7 β†’ 3) lengths via parseInt's permissive behaviour. Added a
   /^-?\d+$/ pre-check to require a clean decimal integer string.

4. parser.ts - parseDelimitedValues produced a spurious empty final element
   on trailing delimiters (e.g. "a,b,"). The post-loop condition
   `values.length > 0` fired even when valueBuffer was empty, appending "".
   Fixed to only push when valueBuffer.trim() is non-empty or no values
   have been collected yet (empty-input case).

5. decoders.ts - decodeKeyValueSync and decodeKeyValueAsync discarded the
   `end` return value from parseKeyToken and re-searched for the colon via
   indexOf. This currently works by accident (escape sequences only expand
   length) but is fragile. Fixed to use `end` directly in both paths.
@johannschopplich

Copy link
Copy Markdown
Collaborator

This was generated by AI during triage.

Thanks for the bug hunt. Going through each in order:

Bug 1 (encoder silent row drop) – real bug. The encodeArrayOfArraysAsListItemsLines path is reachable for arrays-of-arrays-of-non-primitives (e.g. [[{a:1}]] or [[[1]]]), where the outer isArrayOfArrays gate passes but inner sub-arrays carry objects or further arrays. Please add a fixture in packages/toon/test/decode-errors.test.ts (or the appropriate encode-test file) covering that shape so the fix is locked in.

Bug 2 (negative array length) – real bug, fix is correct. Please add a fixture in packages/toon/test/decode-errors.test.ts matching the existing style.

Bug 3 (hex/float array length) – real bug, regex pre-check is correct. Same ask: one fixture per case (hex, float). The subsequent Number.isNaN(length) check becomes dead but harmless.

Bug 4 (trailing delimiter) – real bug. The fix subtly changes behavior for single-value inputs with trailing whitespace ("a, " no longer yields a trailing empty). Please add a deliberate test for that case so the intended behavior is locked in.

Bug 5 (colon re-search) – provably-equivalent refactor, not a bugfix. Happy to take it as cleanup if you reframe the title/description; calling it a fix is misleading.

Tests for all four real bugs are the gating item.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants