msgpack + encode/decode issues and my fix (3 bugs) #26644

@Macho0x

Describe the bug

Bug Report: vlang/msgpack — Minimal Integer Encoding & Fixint Decoding

Repository: https://github.com/vlang/msgpack
Affects: All versions as of commit 55d09a0
Severity: High — produces wire-format bytes incompatible with every other MessagePack implementation


Summary

The vlang/msgpack library has three related bugs that together break interoperability with any spec-compliant MessagePack implementation (Python, Rust, Go, etc.):

  1. encode.v: the encoder always uses the fixed-width format of the V integer type. encode(0) produces 5 bytes (d2 00 00 00 00) instead of 1 byte (00). This goes against the MessagePack specification's guidance that serializers should use the format that represents the data in the fewest bytes.

  2. config.v: positive_int_unsigned defaults to false. Non-negative integers in the range 128–255 are encoded as 3-byte signed int16 instead of the 2-byte unsigned uint8.

  3. decode.v: the decoder ignores positive and negative fixint bytes. decode_integer and decode_to_json have no match arm for mp_pos_fix_int (0x00–0x7f) or mp_neg_fix_int (0xe0–0xff). Integers in the range -32 to 127 encoded by any conforming library cannot be decoded by V's library.

The encoder bug was even acknowledged in the original source with a // TODO comment that was never completed:

// TODO: if int encode_int, if uint encode_uint
// instead of needing to check each type, also
// then we will be using the smallest storage

Background: The MessagePack Integer Format Family

The MessagePack specification defines the following integer formats, listed from smallest to largest:

Format            Byte(s)                          Range
positive fixint   0xxxxxxx (1 byte, 0x00–0x7f)     0 to 127
negative fixint   111xxxxx (1 byte, 0xe0–0xff)     -32 to -1
uint 8            0xcc + 1 byte                    0 to 255
uint 16           0xcd + 2 bytes (big-endian)      0 to 65535
uint 32           0xce + 4 bytes (big-endian)      0 to 2^32−1
uint 64           0xcf + 8 bytes (big-endian)      0 to 2^64−1
int 8             0xd0 + 1 byte                    -128 to 127
int 16            0xd1 + 2 bytes (big-endian)      -32768 to 32767
int 32            0xd2 + 4 bytes (big-endian)      -2147483648 to 2147483647
int 64            0xd3 + 8 bytes (big-endian)      -2^63 to 2^63−1

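The spec recommends minimal encoding: when a value can be represented in more than one format, serializers SHOULD use the one that takes the fewest bytes. Using int 32 (0xd2) to encode 0 when positive fixint (0x00) is available goes against that guidance, and it breaks any protocol that depends on byte-identical output.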
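The selection rule in the table above can be sketched in Python using only the standard library. The function name pack_int is hypothetical and does not belong to any MessagePack library; it only illustrates which format a minimal encoder would pick for a given value.

```python
import struct


def pack_int(value: int) -> bytes:
    """Encode an integer with the smallest MessagePack integer format (illustrative)."""
    if 0 <= value <= 0x7F:
        return struct.pack("B", value)               # positive fixint
    if -32 <= value < 0:
        return struct.pack("b", value)               # negative fixint
    if value > 0:
        if value <= 0xFF:
            return struct.pack(">BB", 0xCC, value)   # uint 8
        if value <= 0xFFFF:
            return struct.pack(">BH", 0xCD, value)   # uint 16
        if value <= 0xFFFFFFFF:
            return struct.pack(">BI", 0xCE, value)   # uint 32
        if value <= 0xFFFFFFFFFFFFFFFF:
            return struct.pack(">BQ", 0xCF, value)   # uint 64
    else:
        if value >= -0x80:
            return struct.pack(">Bb", 0xD0, value)   # int 8
        if value >= -0x8000:
            return struct.pack(">Bh", 0xD1, value)   # int 16
        if value >= -0x80000000:
            return struct.pack(">Bi", 0xD2, value)   # int 32
        if value >= -0x8000000000000000:
            return struct.pack(">Bq", 0xD3, value)   # int 64
    raise OverflowError("value does not fit in any MessagePack integer format")


print(pack_int(0).hex())     # 00
print(pack_int(42).hex())    # 2a
print(pack_int(-123).hex())  # d085
print(pack_int(200).hex())   # ccc8
```

Non-negative values go through the unsigned family and negative values through the signed family, which is the same split the rest of this report proposes for the V encoder.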


How This Was Discovered

These bugs were discovered while building a trading bot in V that communicates with the Hyperliquid perpetuals exchange. Hyperliquid requires order actions to be serialized with MessagePack before being hashed for EIP-712 Ethereum signing:

order_bytes = msgpack.encode(action)
action_hash = keccak256(order_bytes + nonce_bytes + vault_byte)
signature   = secp256k1_sign(eip712_hash(action_hash))

The exchange's server re-serializes the same action with a spec-compliant library and computes the same hash independently. If the bytes differ — even by a single byte in a single field — the hashes differ and the signature is invalid. Once the builder field (f: 0, a zero integer) was present in the serialized action, the mismatch became apparent: V encoded 0 as d2 00 00 00 00 (5 bytes) while the server expected 00 (1 byte), producing a completely different hash and an HTTP 422 rejection on every request.
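To make the failure mode concrete, here is a reduced Python illustration of how a single field changes the digest. It is only a sketch: the map {"f": 0} stands in for the full order action, whose exact layout is not given here, and hashlib.sha256 is used as a stand-in for keccak256, which is not in the Python standard library.

```python
import hashlib

# fixmap with one entry (0x81), followed by the fixstr key "f" (0xa1 0x66)
key = bytes([0x81, 0xA1]) + b"f"

minimal = key + bytes([0x00])                            # value 0 as positive fixint
fixed_width = key + bytes([0xD2, 0x00, 0x00, 0x00, 0x00])  # value 0 as int 32

# sha256 is an assumption used for illustration; the real flow hashes with keccak256
digest_minimal = hashlib.sha256(minimal).hexdigest()
digest_fixed = hashlib.sha256(fixed_width).hexdigest()

print(len(minimal), len(fixed_width))   # 4 8
print(digest_minimal == digest_fixed)   # False
```

Both byte strings decode to the same logical value, but any scheme that signs or hashes the serialized bytes treats them as different messages.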

The same failure will occur with any V program communicating with Python's msgpack, Rust's rmp/rmp-serde, Go's vmihailenco/msgpack, or any other spec-compliant implementation.


Reproduction Steps

Bug 1: Encoder — Fixed-Width Instead of Minimal Encoding

File: encode.v

Reproduction Steps

import msgpack

fn main() {
    // Encode a plain integer
    result := msgpack.encode(0)
    println(result.hex())  // prints: d200000000
                           // should print: 00

    result2 := msgpack.encode(42)
    println(result2.hex()) // prints: d20000002a
                           // should print: 2a

    // Encode a struct with an integer field
    result3 := msgpack.encode(struct{ age int }{age: 30})
    println(result3.hex()) // field value encoded as int32 (d2 00 00 00 1e)
                           // should be positive fixint (1e)
}

Current Behaviour

The encode[T]() generic function dispatches on V's compile-time type and calls a fixed-width helper directly, completely bypassing the existing encode_int and encode_uint functions that already implement minimal encoding:

// Current (buggy) code in encode.v
$else $if T is i8 {
    e.encode_i8(data)    // always writes 0xd0 + 1 byte (2 bytes total), even for value 0
} $else $if T is i16 {
    e.encode_i16(data)   // always writes 0xd1 + 2 bytes (3 bytes total)
} $else $if T is int {
    e.encode_i32(data)   // always writes 0xd2 + 4 bytes (5 bytes total)
} $else $if T is i64 {
    e.encode_i64(data)   // always writes 0xd3 + 8 bytes (9 bytes total)
} $else $if T is u8 {
    e.encode_u8(data)    // always writes 0xcc + 1 byte (2 bytes total)
} $else $if T is u16 {
    e.encode_u16(data)   // always writes 0xcd + 2 bytes (3 bytes total)
} $else $if T is u32 {
    e.encode_u32(data)   // always writes 0xce + 4 bytes (5 bytes total)
} $else $if T is u64 {
    e.encode_u64(data)   // always writes 0xcf + 8 bytes (9 bytes total)
}

Actual byte output:

encode(0)    → d2 00 00 00 00  (5 bytes: int32 format)
encode(42)   → d2 00 00 00 2a  (5 bytes: int32 format)
encode(-123) → d2 ff ff ff 85  (5 bytes: int32 format)
encode(1)    → d2 00 00 00 01  (5 bytes: int32 format)

Note also that i32 has no case at all — it silently falls through to an unhandled branch.
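The byte output listed above for int values is what any fixed-width int 32 writer produces. A minimal Python model of that behaviour, with the hypothetical name pack_i32_fixed, reproduces the same bytes:

```python
import struct


def pack_i32_fixed(value: int) -> bytes:
    """Always emit the int 32 format: 0xd2 followed by 4 big-endian bytes."""
    return struct.pack(">Bi", 0xD2, value)


for v in (0, 42, -123, 1):
    print(v, pack_i32_fixed(v).hex(" "))
# 0 d2 00 00 00 00
# 42 d2 00 00 00 2a
# -123 d2 ff ff ff 85
# 1 d2 00 00 00 01
```

Every value costs 5 bytes regardless of magnitude, which is why small values such as 0 and 1 diverge from the output of minimal encoders.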

Expected Behaviour

Per the MessagePack specification, the encoder should select the smallest format that can represent the value:

encode(0)    → 00              (1 byte:  positive fixint)
encode(42)   → 2a              (1 byte:  positive fixint)
encode(-123) → d0 85           (2 bytes: int8 format)
encode(1)    → 01              (1 byte:  positive fixint)

The encode_int and encode_uint functions in the same file already implement this logic correctly — they just are not being called.

Possible Solution

Route all integer types through encode_int or encode_uint, which already select the minimal format, and add the missing i32 case:

$else $if T is i8 {
    e.encode_int(i64(data))
} $else $if T is i16 {
    e.encode_int(i64(data))
} $else $if T is int {
    e.encode_int(i64(data))
} $else $if T is i32 {
    e.encode_int(i64(data))   // previously missing case
} $else $if T is i64 {
    e.encode_int(data)
} $else $if T is u8 {
    e.encode_uint(u64(data))
} $else $if T is u16 {
    e.encode_uint(u64(data))
} $else $if T is u32 {
    e.encode_uint(u64(data))
} $else $if T is u64 {
    e.encode_uint(data)
}

The // TODO comment acknowledging this fix should be removed at the same time.


Bug 2: Config — positive_int_unsigned Defaults to false

File: config.v

Reproduction Steps

import msgpack

fn main() {
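    // Value in the range 128-255 (fits in u8, not in i8).
    // The output below assumes the Bug 1 fix is applied; without it,
    // encode(200) takes the int32 path and prints d2000000c8.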
    result := msgpack.encode(200)
    println(result.hex())  // prints: d100c8  (3 bytes: signed int16)
                           // should print: ccc8  (2 bytes: unsigned uint8)
}

Current Behaviour

The default_config() function returns positive_int_unsigned: false. The encode_int function has a branch that checks this flag:

pub fn (mut e Encoder) encode_int(i i64) {
    if e.config.positive_int_unsigned && i >= 0 {
        e.encode_uint(u64(i))  // compact unsigned path — NOT TAKEN by default
    } else if i > max_i8 {    // 127
        if i <= max_i16 {      // 32767
            e.encode_i16(i16(i))  // 200 ends up here: 3 bytes as signed int16
        }
        ...
    }
}

With the flag false, a non-negative value of 200 is larger than max_i8 (127), so it takes the signed path and is encoded as a 3-byte signed int16 (d1 00 c8) even though it fits in a 2-byte unsigned uint8 (cc c8). Values 0–127 happen to encode correctly by coincidence (they fall into the negative fixint check i >= -32 and write a single byte), masking the bug for the most common values.

Expected Behaviour

Any non-negative integer should be encoded using the unsigned format family, which is always more compact than or equal in size to the signed family for non-negative values. The value 200 fits in uint 8 and should produce 2 bytes: cc c8.
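The size claim can be checked with a short Python sketch that compares the two format families from the table in the background section. The helper encoded_size and the two lookup tables are illustrative names, not part of any library; sizes include the leading format byte, and fixint is left out because only values above 127 are of interest here.

```python
# (largest representable value, total encoded size in bytes including the format byte)
SIGNED = [(2**7 - 1, 2), (2**15 - 1, 3), (2**31 - 1, 5), (2**63 - 1, 9)]
UNSIGNED = [(2**8 - 1, 2), (2**16 - 1, 3), (2**32 - 1, 5), (2**64 - 1, 9)]


def encoded_size(value, family):
    """Return the size of the smallest format in `family` that holds `value`."""
    for max_value, size in family:
        if value <= max_value:
            return size
    return None  # value does not fit in this family


samples = (128, 200, 255, 32768, 65535, 2**31, 2**32 - 1)
for v in samples:
    print(v, encoded_size(v, SIGNED), encoded_size(v, UNSIGNED))
# 200 needs 3 bytes in the signed family and 2 bytes in the unsigned family
```

For every sample the unsigned size is less than or equal to the signed size, and the gap appears exactly in the ranges just above each signed maximum (128–255, 32768–65535, and so on).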

Possible Solution

Enable positive_int_unsigned in the default config:

pub fn default_config() Config {
    return Config{
        write_ext:             true
        positive_int_unsigned: true   // non-negative integers always use unsigned format
    }
}

Bug 3: Decoder — Fixint Bytes Not Handled

File: decode.v

Reproduction Steps

import msgpack

fn main() {
    // Encode with any compliant library, or just construct the bytes manually.
    // Positive fixint for value 42 is a single byte: 0x2a
    data := [u8(0x2a)]

    mut val := 0
    mut decoder := msgpack.new_decoder()
    decoder.decode(data, mut val) or { println('error: ${err}') }
    // Prints: error (or panics) — should print 42

    // Same problem for negative fixint: -5 is 0xfb
    data2 := [u8(0xfb)]
    mut val2 := 0
    decoder.decode(data2, mut val2) or { println('error: ${err}') }
    // Prints: error — should print -5
}

This also means that after Bug 1 is fixed, V cannot even decode its own output: msgpack.encode(0) produces [0x00], and decode_integer has no handler for 0x00.

Expected Behaviour

For a positive fixint byte (0x00–0x7f), the format byte itself IS the integer value, and no additional bytes follow. For a negative fixint byte (0xe0–0xff), the format byte reinterpreted as a signed i8 IS the value. Both ranges must be handled.

decode([0x00]) → 0
decode([0x2a]) → 42
decode([0x7f]) → 127
decode([0xe0]) → -32
decode([0xfb]) → -5
decode([0xff]) → -1
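The mapping above can be expressed as a small Python function. This is a sketch of the decoding rule only, with the hypothetical name decode_fixint; it is not the V implementation.

```python
def decode_fixint(format_byte: int):
    """Return the integer carried by a fixint format byte, or None if it is not a fixint."""
    if not 0 <= format_byte <= 0xFF:
        raise ValueError("format byte must be in the range 0x00-0xff")
    if format_byte <= 0x7F:
        return format_byte            # positive fixint: the byte is the value
    if format_byte >= 0xE0:
        return format_byte - 0x100    # negative fixint: reinterpret as signed 8-bit
    return None                       # some other MessagePack format


for b in (0x00, 0x2A, 0x7F, 0xE0, 0xFB, 0xFF):
    print(hex(b), decode_fixint(b))
# 0x0 0
# 0x2a 42
# 0x7f 127
# 0xe0 -32
# 0xfb -5
# 0xff -1
```

Subtracting 0x100 from a byte in 0xe0–0xff is equivalent to the i8 reinterpretation described above.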

Current Behaviour

Both decode_integer and decode_to_json use a match on the format byte d.bd. Neither has arms for the positive fixint range (0x00–0x7f) or the negative fixint range (0xe0–0xff):

// Current (buggy) decode_integer
pub fn (mut d Decoder) decode_integer[T](mut val T) ! {
    data := d.buffer
    match d.bd {
        // 0x00–0x7f: positive fixint — NO HANDLER, falls to else/error
        // 0xe0–0xff: negative fixint — NO HANDLER, falls to else/error
        mp_u8 { val = data[d.pos]; d.pos++ }
        mp_u16 { ... }
        // ...
    }
}

// Current (buggy) decode_to_json
match d.bd {
    // 0x00–0x7f: positive fixint — NO HANDLER
    // 0xe0–0xff: negative fixint — NO HANDLER
    mp_u8, mp_u16, mp_u32, mp_u64, mp_i8, mp_i16, mp_i32, mp_i64 {
        ...
    }
}

Any integer value in the range -32 to 127, when encoded by a compliant library (including the fixed V encoder), produces a single-byte fixint. That byte is completely unrecognised by V's decoder.

Possible Solution

Add match arms for both fixint ranges in both decode_integer and decode_to_json:

// Fixed decode_integer
pub fn (mut d Decoder) decode_integer[T](mut val T) ! {
    data := d.buffer
    match d.bd {
        mp_pos_fix_int_min...mp_pos_fix_int_max {
            val = d.bd          // format byte IS the value (0–127), no extra bytes
        }
        mp_neg_fix_int_min...mp_neg_fix_int_max {
            val = i8(d.bd)      // format byte reinterpreted as signed (-32 to -1)
        }
        mp_u8 { val = data[d.pos]; d.pos++ }
        // ... rest unchanged
    }
}

// Fixed decode_to_json
match d.bd {
    mp_pos_fix_int_min...mp_pos_fix_int_max {
        int_val := int(d.bd)
        unsafe { result.push_many(int_val.str().str, int_val.str().len) }
    }
    mp_neg_fix_int_min...mp_neg_fix_int_max {
        int_val := int(i8(d.bd))
        unsafe { result.push_many(int_val.str().str, int_val.str().len) }
    }
    mp_u8, mp_u16, mp_u32, mp_u64, mp_i8, mp_i16, mp_i32, mp_i64 {
        // ... unchanged
    }
}

Test Expectations Must Be Updated

File: encode_test.v

The existing test assertions were written to match the buggy output. They must be corrected:

// BEFORE — asserting the buggy, non-spec-compliant output
assert msgpack.encode(0)          == hex.decode('d200000000')! // int32: 5 bytes
assert msgpack.encode(42)         == hex.decode('d20000002a')! // int32: 5 bytes
assert msgpack.encode(-123)       == hex.decode('d2ffffff85')! // int32: 5 bytes
assert msgpack.encode([0])        == hex.decode('91d200000000')!
assert msgpack.encode([1, 2, 3])  == hex.decode('93d200000001d200000002d200000003')!
assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea162d20000001e')!

// AFTER — asserting correct, spec-compliant minimal encoding
assert msgpack.encode(0)          == hex.decode('00')!          // positive fixint: 1 byte
assert msgpack.encode(42)         == hex.decode('2a')!          // positive fixint: 1 byte
assert msgpack.encode(-123)       == hex.decode('d085')!        // int8: 2 bytes
assert msgpack.encode([0])        == hex.decode('9100')!        // array + fixint
assert msgpack.encode([1, 2, 3])  == hex.decode('93010203')!    // 3× fixint
assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea1621e')!
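The corrected hex strings can be derived by hand from the format rules. The Python sketch below uses a hypothetical helper, pack_small, that covers only the small cases these assertions need (positive fixint, fixstr, fixarray, fixmap). The field names "a" and "b" are inferred from the key bytes a1 61 and a1 62 in the expected output.

```python
def pack_small(obj) -> bytes:
    """Encode small ints, short strings, short lists and small dicts (illustrative subset)."""
    if isinstance(obj, bool):
        raise TypeError("bool is outside this sketch")
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])                                  # positive fixint
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) <= 31:
            return bytes([0xA0 | len(data)]) + data          # fixstr
    if isinstance(obj, list) and len(obj) <= 15:
        return bytes([0x90 | len(obj)]) + b"".join(pack_small(x) for x in obj)  # fixarray
    if isinstance(obj, dict) and len(obj) <= 15:
        body = b"".join(pack_small(k) + pack_small(v) for k, v in obj.items())
        return bytes([0x80 | len(obj)]) + body               # fixmap
    raise TypeError("value is outside this sketch")


print(pack_small([0]).hex())                        # 9100
print(pack_small([1, 2, 3]).hex())                  # 93010203
print(pack_small({"a": "John", "b": 30}).hex())     # 82a161a44a6f686ea1621e
```

The last line breaks down as 82 (map of 2), a1 61 ("a"), a4 4a 6f 68 6e ("John"), a1 62 ("b"), and 1e (30 as a positive fixint).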

Complete Diff

config.v

 pub fn default_config() Config {
 	return Config{
-		write_ext: true
+		write_ext:             true
+		positive_int_unsigned: true
 	}
 }

encode.v

-	// TODO: if int encode_int, if uint encode_uint
-	// instead of needing to check each type, also
-	// then we will be using the smallest storage
 	$else $if T is i8 {
-		e.encode_i8(data)
+		e.encode_int(i64(data))
 	} $else $if T is i16 {
-		e.encode_i16(data)
+		e.encode_int(i64(data))
 	} $else $if T is int {
-		e.encode_i32(data)
+		e.encode_int(i64(data))
+	} $else $if T is i32 {
+		e.encode_int(i64(data))
 	} $else $if T is i64 {
-		e.encode_i64(data)
+		e.encode_int(data)
 	} $else $if T is u8 {
-		e.encode_u8(data)
+		e.encode_uint(u64(data))
 	} $else $if T is u16 {
-		e.encode_u16(data)
+		e.encode_uint(u64(data))
 	} $else $if T is u32 {
-		e.encode_u32(data)
+		e.encode_uint(u64(data))
 	} $else $if T is u64 {
-		e.encode_u64(data)
+		e.encode_uint(data)
 	}

decode.v

+		mp_pos_fix_int_min...mp_pos_fix_int_max {
+			int_val := int(d.bd)
+			unsafe { result.push_many(int_val.str().str, int_val.str().len) }
+		}
+		mp_neg_fix_int_min...mp_neg_fix_int_max {
+			int_val := int(i8(d.bd))
+			unsafe { result.push_many(int_val.str().str, int_val.str().len) }
+		}
 		mp_u8, mp_u16, mp_u32, mp_u64, mp_i8, mp_i16, mp_i32, mp_i64 {

 pub fn (mut d Decoder) decode_integer[T](mut val T) ! {
 	data := d.buffer
 	match d.bd {
+		mp_pos_fix_int_min...mp_pos_fix_int_max {
+			val = d.bd
+		}
+		mp_neg_fix_int_min...mp_neg_fix_int_max {
+			val = i8(d.bd)
+		}
 		mp_u8 {

encode_test.v

-	assert msgpack.encode(0) == hex.decode('d200000000')!
-	assert msgpack.encode(42) == hex.decode('d20000002a')!
-	assert msgpack.encode(-123) == hex.decode('d2ffffff85')!
+	assert msgpack.encode(0) == hex.decode('00')!
+	assert msgpack.encode(42) == hex.decode('2a')!
+	assert msgpack.encode(-123) == hex.decode('d085')!

-	assert msgpack.encode([0]) == hex.decode('91d200000000')!
+	assert msgpack.encode([0]) == hex.decode('9100')!
-	assert msgpack.encode([1, 2, 3]) == hex.decode('93d200000001d200000002d200000003')! // REVIEW
+	assert msgpack.encode([1, 2, 3]) == hex.decode('93010203')!

-	assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea162d20000001e')!
+	assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea1621e')!

Verification

After applying all fixes, all three existing test suites pass with no modifications beyond the corrected assertions in encode_test.v:

---- Testing... ----------------------------------------------------------------
OK    [1/3]  decode_test.v
OK    [2/3]  encode_test.v
OK    [3/3]  decode_to_json_test.v
--------------------------------------------------------------------------------
Summary: 3 passed, 3 total.

Cross-Language Validation

The fixed V output matches Python's msgpack library byte-for-byte:

import msgpack
assert msgpack.packb(0)        == b'\x00'
assert msgpack.packb(42)       == b'\x2a'
assert msgpack.packb(-123)     == b'\xd0\x85'
assert msgpack.packb([0])      == b'\x91\x00'
assert msgpack.packb([1,2,3])  == b'\x93\x01\x02\x03'

V version

0.5

Environment details (OS name and version, etc.)

CachyOS
