Discrepancy in Dictionary Value in streaming IPC message #65

@manojVivek

Hi, we are migrating an application from apache-arrow to Flechette and ran into a unit test failure that looks like a bug in Flechette's streaming IPC handling. Below is a minimal reproduction:

import { tableFromIPC } from '@uwdata/flechette';

// IPC stream chunks:
const chunksBase64 = [
  '3AAAABAAAAAAAAoADAAKAAkABAAKAAAAEAAAAAABBAAIAAgAAAAEAAgAAAAEAAAAAQAAABQAAAAQABQAEAAOAA8ABAAAAAgAEAAAABgAAAAMAAAAAAABDXAAAAABAAAAGAAAALD///8QABgAFAAOAA8ABAAQAAgAEAAAADwAAAAwAAAAAAABBRAAAAAwAAAACAAKAAAABAAIAAAADAAAAAAABgAIAAQABgAAACAAAAAAAAAABAAEAAQAAAAEAAAAbm9kZQAAAAATAAAAYXR0cmlidXRlc19yZXNvdXJjZQA=',
  'qAAAABAAAAAMABgAFgAVAAQACAAMAAAAHAAAAMAAAAAAAAAAAAAAAAACBAAIAAoAAAAEAAgAAAAQAAAAAAAKABgADAAIAAQACgAAACwAAAAQAAAAAQAAAAAAAAAAAAAAAQAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAgAAAAAAAAAgAAAAAAAAAAzAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAZ2tlLWV1cm9wZS13ZXN0My0wLXByZWVtcHRpYmxlLXQyZC1zdC1lYzI3ZDNkYi1wd3d6AAAAAAAAAAAAAAAAAA==',
  'qAAAABAAAAAMABoAGAAXAAQACAAMAAAAIAAAAMAAAAAAAAAAAAAAAAAAAAMEAAoAGAAMAAgABAAKAAAAPAAAABAAAAABAAAAAAAAAAAAAAACAAAAAQAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAEAAAAAAAAAgAAAAAAAAAAEAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==',
  'qAAAABAAAAAMABgAFgAVAAQACAAMAAAAHAAAAAABAAAAAAAAAAAAAAACBAAIAAoAAAAEAAgAAAAQAAAAAAAKABgADAAIAAQACgAAACwAAAAQAAAAAgAAAAAAAAAAAAAAAQAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAwAAAAAAAAAgAAAAAAAAABmAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAMwAAAGYAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAZ2tlLWV1cm9wZS13ZXN0My0wLXByZWVtcHRpYmxlLXQyZC1zdC1hZGQxOTQzNS13NzR2Z2tlLWV1cm9wZS13ZXN0My0wLXByZWVtcHRpYmxlLXQyZC1zdC03MTc4ODhkYi1ucmZyAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=',
  'qAAAABAAAAAMABoAGAAXAAQACAAMAAAAIAAAAMAAAAAAAAAAAAAAAAAAAAMEAAoAGAAMAAgABAAKAAAAPAAAABAAAAACAAAAAAAAAAAAAAACAAAAAgAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAAAAAAQAAAAAAAABAAAAAAAAAAAEAAAAAAAAAgAAAAAAAAAAIAAAAAAAAAP8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==',
];

const chunks = chunksBase64.map(b64 => {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
});

const table = tableFromIPC(chunks);
const col = table.getChildAt(0);

console.log('Row 0:', col.at(0).node);
// Expected: "...ec27d3db-pwwz" (original dictionary)
// Actual:   "...add19435-w74v"

console.log('Row 1:', col.at(1).node);  // Expected: "...add19435-w74v"
console.log('Row 2:', col.at(2).node);  // Expected: "...717888db-nrfr"

In short, col.at(0).node should be gke-europe-west3-0-preemptible-t2d-st-ec27d3db-pwwz (from the original dictionary), but Flechette returns gke-europe-west3-0-preemptible-t2d-st-add19435-w74v, which is the value expected at row 1.

I did some initial debugging with Claude and it suggests a bug in the dictionary replacement support.
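To make the expected behavior concrete, here is a minimal sketch of how I understand dictionary replacement in an IPC stream: a non-delta dictionary batch replaces the dictionary for later record batches only, while batches already read keep the values they were decoded with. This is not Flechette's code. The `resolveStream` function and the message shapes (`type`, `id`, `isDelta`, `values`, `indices`) are hypothetical, and I am assuming the stream above is laid out as dictionary, record batch, replacement dictionary, record batch, which is what the expected values imply.

```js
// Hypothetical, simplified message shapes; not Flechette's internal types.
function resolveStream(messages) {
  const dictionaries = new Map(); // dictionary id -> values currently in effect
  const rows = [];
  for (const msg of messages) {
    if (msg.type === 'dictionary') {
      const prev = dictionaries.get(msg.id) ?? [];
      // Delta batches append to the current dictionary; non-delta batches replace it.
      // A new array is built either way, so earlier lookups are never affected.
      dictionaries.set(msg.id, msg.isDelta ? [...prev, ...msg.values] : [...msg.values]);
    } else if (msg.type === 'record') {
      // Resolve indices against the dictionary in effect when this batch arrives.
      const dict = dictionaries.get(msg.id);
      for (const index of msg.indices) rows.push(dict[index]);
    }
  }
  return rows;
}

const rows = resolveStream([
  { type: 'dictionary', id: 0, isDelta: false, values: ['...ec27d3db-pwwz'] },
  { type: 'record', id: 0, indices: [0] },
  { type: 'dictionary', id: 0, isDelta: false, values: ['...add19435-w74v', '...717888db-nrfr'] },
  { type: 'record', id: 0, indices: [0, 1] },
]);
console.log(rows);
```

With this model, row 0 keeps the value from the original dictionary. Resolving every batch against only the latest dictionary would give the result I am seeing instead.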

Let me know what you think. I can work on a fix if you can confirm this is a bug.
