Description
`ProtobufDeserializer.deserialize()` crashes with an unhelpful `TypeError` when message index resolution encounters out-of-range values:

```
TypeError: Cannot read properties of undefined (reading 'nestedMessages')
    at ProtobufDeserializer.toNestedMessageDesc (.../serde/protobuf.js:371:48)
    at ProtobufDeserializer.toMessageDescFromIndexes (.../serde/protobuf.js:364:21)
    at ProtobufDeserializer.deserialize (.../serde/protobuf.js:307:34)
```
There are two related problems:

1. **No bounds checking in `toMessageDescFromIndexes`/`toNestedMessageDesc`:** When `fd.messages[index]` returns `undefined` (index out of range), the code passes `undefined` to `toNestedMessageDesc`, which crashes accessing `.nestedMessages` on `undefined`.
2. **`readMessageIndexes` reads protobuf payload bytes as message indexes when indexes are absent:** When a producer does not include message index bytes in the wire format (valid for single-message schemas, where the first message is the implicit default), `readMessageIndexes` reads the first byte of the protobuf payload as the varint count, producing a non-zero garbage value. It then reads more payload bytes as index varints, generating nonsensical out-of-range indexes that crash the deserializer.
This is a separate bug from #455 (file-level enums). Both can occur on the same codebase — #455 affects schemas with file-level enums, while this one affects any protobuf topic where the producer omits message index bytes.
Environment
- `@confluentinc/schemaregistry`: 1.8.2
- `@confluentinc/kafka-javascript`: 1.8.2
- Node: 22.15.0
- OS: macOS (darwin arm64)
Minimal reproduction
```js
// Standard consumer usage — nothing unusual.
import { SchemaRegistryClient } from '@confluentinc/schemaregistry';
import { ProtobufDeserializer } from '@confluentinc/schemaregistry/serde/protobuf';

const client = new SchemaRegistryClient({
  baseURLs: ['https://schema-registry.example.com'],
  basicAuthCredentials: { credentialsSource: 'USER_INFO', userInfo: 'user:pass' },
});

const deserializer = new ProtobufDeserializer(client, 1 /* VALUE */, {});

// Inside Kafka consumer eachMessage handler:
const decoded = await deserializer.deserialize(topic, message.value);
// => TypeError: Cannot read properties of undefined (reading 'nestedMessages')
```

The schema is a single-message proto (one top-level message type). The message is produced by an upstream Confluent producer and read directly from Kafka, with no custom wire format manipulation.
What happens internally: the producer omits the message index array for single-message schemas. `readMessageIndexes` reads the first protobuf payload byte (`0x0a`) as the varint count, zigzag-decodes it to 5, then reads 5 more payload bytes as indexes, producing `[-3, 52, -51, 54, 54]`. These garbage values crash `toMessageDescFromIndexes`.
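The arithmetic is easy to reproduce standalone. The following sketch (not library code; `zigzagDecode` and `payload` are illustrative names) decodes the wire bytes of a `MyMessage { name: "hello" }` payload exactly the way the misread does, and yields the same garbage values as the report. It simplifies varints to single bytes, which is valid here because every byte involved fits in one varint byte:

```javascript
// Standard protobuf zigzag decoding: maps unsigned varints to signed ints.
const zigzagDecode = (n) => (n >>> 1) ^ -(n & 1);

// Wire bytes for MyMessage { name: "hello" }: tag 0x0a (field 1, wire type 2),
// length 0x05, then the ASCII bytes of "hello".
const payload = [0x0a, 0x05, 0x68, 0x65, 0x6c, 0x6c, 0x6f];

// Misread the first payload byte as the index count...
const count = zigzagDecode(payload[0]); // 0x0a = 10 -> zigzag -> 5
// ...then misread the next `count` bytes as index varints.
const indexes = payload.slice(1, 1 + count).map(zigzagDecode);

console.log(count, indexes); // 5 [ -3, 52, -51, 54, 54 ]
```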
Root cause
1. readMessageIndexes reads payload bytes as indexes when indexes are absent
```js
// serde.js — SchemaId.readMessageIndexes
readMessageIndexes(payload) {
  const bw = new BufferWrapper(payload);
  const count = bw.readVarInt(); // ← reads first byte of protobuf payload if indexes are absent
  if (count == 0) {
    return [1, [0]];
  }
  const msgIndexes = [];
  for (let i = 0; i < count; i++) {
    msgIndexes.push(bw.readVarInt()); // ← reads more payload bytes as indexes
  }
  return [bw.pos, msgIndexes];
}
```

When the producer omits the message index array, the slice passed to `readMessageIndexes` starts at the protobuf payload. For example, a `MyMessage { name: "hello" }` payload starts with `0x0a` (field 1, wire type 2). `readVarInt()` zigzag-decodes this as 5, so it reads 5 more varints from the payload, producing garbage indexes.
2. toMessageDescFromIndexes crashes on undefined — no bounds checking
```js
// protobuf.js — ProtobufDeserializer
toMessageDescFromIndexes(fd, msgIndexes) {
  let index = msgIndexes[0];
  if (msgIndexes.length === 1) {
    return fd.messages[index]; // ← undefined if index >= fd.messages.length
  }
  return this.toNestedMessageDesc(fd.messages[index], msgIndexes.slice(1));
}

toNestedMessageDesc(parent, msgIndexes) {
  let index = msgIndexes[0];
  if (msgIndexes.length === 1) {
    return parent.nestedMessages[index]; // ← TypeError if parent is undefined
  }
  return this.toNestedMessageDesc(parent.nestedMessages[index], msgIndexes.slice(1));
}
```

Expected behavior
1. **Bounds checking:** `toMessageDescFromIndexes` and `toNestedMessageDesc` should validate that indexes are within range and throw a descriptive error (e.g., `"message index 15 out of range, schema has 1 top-level message(s)"`).
2. **Graceful fallback for absent indexes:** `readMessageIndexes` should handle the case where message indexes are absent for single-message schemas. For example, if the parsed count exceeds a reasonable threshold or doesn't match the schema's message count, default to `[0]`.
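The bounds-checking half could look like the sketch below. It collapses the two recursive methods into one checked loop; `toMessageDescChecked` is a hypothetical name, the descriptor shapes (`fd.messages`, `nestedMessages`) mirror the snippets above, and the error wording is only a suggestion:

```javascript
// Sketch of bounds-checked index resolution (not the library's actual fix).
function toMessageDescChecked(fd, msgIndexes) {
  // Treat the file descriptor as a virtual root whose children are the
  // top-level messages, so one loop handles both levels.
  let desc = { nestedMessages: fd.messages };
  for (let depth = 0; depth < msgIndexes.length; depth++) {
    const index = msgIndexes[depth];
    const candidates = desc.nestedMessages ?? [];
    if (index < 0 || index >= candidates.length) {
      throw new RangeError(
        `message index ${index} out of range at depth ${depth}, ` +
        `schema has ${candidates.length} message(s) at this level`
      );
    }
    desc = candidates[index];
  }
  return desc;
}
```

A `RangeError` naming the bad index and the schema's actual message count would have made the failure in this report diagnosable from the log line alone.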
Impact
This crashes deserialization for any protobuf topic where the producer omits message index bytes for single-message schemas. The `TypeError: Cannot read properties of undefined` stack trace gives no indication of the actual problem, making it extremely difficult to debug without reading the library source.
We hit this in QA on a billing pipeline consuming from a topic written by an upstream Confluent producer. The workaround was replacing ProtobufDeserializer entirely with a custom deserializer that parses the Confluent wire format manually.
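For anyone needing the same workaround, the core of it is splitting the Confluent frame by hand. The framing is one magic byte (`0x00`) followed by a 4-byte big-endian schema ID and then the serialized payload; for a single-message schema the message indexes can simply be treated as `[0]`. A minimal sketch (`parseConfluentFrame` is our name, not a library function; assumes a Node `Buffer`):

```javascript
// Confluent framing: 1 magic byte (0x00) + 4-byte big-endian schema ID + payload.
function parseConfluentFrame(buffer) {
  if (buffer.length < 5 || buffer[0] !== 0x00) {
    throw new Error('not a Confluent-framed message');
  }
  const schemaId = buffer.readUInt32BE(1);
  const payload = buffer.subarray(5); // protobuf bytes (message indexes assumed absent)
  return { schemaId, payload };
}
```

The returned `payload` can then be handed directly to a generated protobuf message class, bypassing the library's index resolution entirely.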
Note: The same class of bug exists in confluent-kafka-go — the Go client panics with runtime error: index out of range [-8] under the same conditions.