msgpack + encode/decode issues and my fix (3 bugs) #26644

### Describe the bug

# Bug Report: `vlang/msgpack` — Minimal Integer Encoding & Fixint Decoding

**Repository:** https://github.com/vlang/msgpack  
**Affects:** All versions as of commit `55d09a0`  
**Severity:** High — produces wire-format bytes incompatible with every other MessagePack implementation

---

## Summary

The `vlang/msgpack` library has three related bugs that together break interoperability with any spec-compliant MessagePack implementation (Python, Rust, Go, etc.):

1. **`encode.v`: the encoder always uses the widest integer type.** `encode(0)` produces 5 bytes (`d2 00 00 00 00`) instead of 1 byte (`00`). This goes against the MessagePack specification, which says serializers SHOULD use the format that takes the fewest bytes.

2. **`config.v`: `positive_int_unsigned` defaults to `false`.** Even when integers are routed through `encode_int`, non-negative values in the range 128–255 are encoded as a 3-byte signed int16 instead of the 2-byte unsigned uint8.

3. **`decode.v`: the decoder ignores positive and negative fixint bytes.** `decode_integer` and `decode_to_json` have no `match` arm for `mp_pos_fix_int` (`0x00`–`0x7f`) or `mp_neg_fix_int` (`0xe0`–`0xff`). Integers in the range -32 to 127 encoded by any conforming library cannot be decoded by V's library.

The encoder bug was even acknowledged in the original source with a `// TODO` comment that was never completed:

```v
// TODO: if int encode_int, if uint encode_uint
// instead of needing to check each type, also
// then we will be using the smallest storage
```

---

## Background: The MessagePack Integer Format Family

The [MessagePack specification](https://github.com/msgpack/msgpack/blob/master/spec.md#int-format-family) defines the following integer formats, listed from smallest to largest:

| Format          | Byte(s)                              | Range                     |
|-----------------|--------------------------------------|---------------------------|
| positive fixint | `0xxxxxxx` (1 byte, `0x00`–`0x7f`)  | 0 to 127                  |
| negative fixint | `111xxxxx` (1 byte, `0xe0`–`0xff`)  | -32 to -1                 |
| uint 8          | `0xcc` + 1 byte                      | 0 to 255                  |
| uint 16         | `0xcd` + 2 bytes (big-endian)        | 0 to 65535                |
| uint 32         | `0xce` + 4 bytes (big-endian)        | 0 to 2^32−1               |
| uint 64         | `0xcf` + 8 bytes (big-endian)        | 0 to 2^64−1               |
| int 8           | `0xd0` + 1 byte                      | -128 to 127               |
| int 16          | `0xd1` + 2 bytes (big-endian)        | -32768 to 32767           |
| int 32          | `0xd2` + 4 bytes (big-endian)        | -2147483648 to 2147483647 |
| int 64          | `0xd3` + 8 bytes (big-endian)        | -2^63 to 2^63−1           |

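**The spec recommends minimal encoding:** when a value fits several formats, serializers SHOULD use the one that takes the fewest bytes. Encoding `0` as `int 32` (`0xd2`) when `positive fixint` (`0x00`) is available still decodes to the same value, but it is not the output that other implementations produce, which matters whenever the bytes themselves are hashed or compared.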
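
The format table above can be turned into a selection rule fairly directly. The following is a minimal, self-contained sketch of that rule and not the library's code: `minimal_int_bytes` and `push_be` are hypothetical helper names introduced only for illustration.

```v
// Hypothetical sketch, not code from vlang/msgpack.

// push_be appends the low `n` bytes of `u` to `out` in big-endian order.
fn push_be(mut out []u8, u u64, n int) {
	for i in 0 .. n {
		shift := u64((n - 1 - i) * 8)
		out << u8((u >> shift) & 0xff)
	}
}

// minimal_int_bytes encodes `v` using the smallest format from the table.
fn minimal_int_bytes(v i64) []u8 {
	u := u64(v) // two's-complement bit pattern, also used for negative v
	mut out := []u8{}
	if v >= 0 {
		if u <= 0x7f {
			out << u8(u) // positive fixint: the byte is the value
		} else if u <= 0xff {
			out << u8(0xcc) // uint 8
			push_be(mut out, u, 1)
		} else if u <= 0xffff {
			out << u8(0xcd) // uint 16
			push_be(mut out, u, 2)
		} else if u <= 0xffff_ffff {
			out << u8(0xce) // uint 32
			push_be(mut out, u, 4)
		} else {
			out << u8(0xcf) // uint 64
			push_be(mut out, u, 8)
		}
	} else {
		if v >= -32 {
			push_be(mut out, u, 1) // negative fixint: low byte is 0xe0..0xff
		} else if v >= -128 {
			out << u8(0xd0) // int 8
			push_be(mut out, u, 1)
		} else if v >= -32768 {
			out << u8(0xd1) // int 16
			push_be(mut out, u, 2)
		} else if v >= i64(-2147483648) {
			out << u8(0xd2) // int 32
			push_be(mut out, u, 4)
		} else {
			out << u8(0xd3) // int 64
			push_be(mut out, u, 8)
		}
	}
	return out
}

fn main() {
	for v in [i64(0), 42, 200, -123] {
		println('${v} -> ${minimal_int_bytes(v).hex()}')
	}
}
```

For the values 0, 42, 200 and -123 this should print `00`, `2a`, `ccc8` and `d085`, the same bytes this report lists as expected output. Non-negative values take the unsigned family here, which is the behaviour Bug 2 asks for.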

---

## How This Was Discovered

These bugs were discovered while building a trading bot in V that communicates with the [Hyperliquid](https://hyperliquid.xyz) perpetuals exchange. Hyperliquid requires order actions to be serialized with MessagePack before being hashed for EIP-712 Ethereum signing:

```
order_bytes = msgpack.encode(action)
action_hash = keccak256(order_bytes + nonce_bytes + vault_byte)
signature   = secp256k1_sign(eip712_hash(action_hash))
```

The exchange's server re-serializes the same action with a spec-compliant library and computes the same hash independently. If the bytes differ — even by a single byte in a single field — the hashes differ and the signature is invalid. Once the `builder` field (`f: 0`, a zero integer) was present in the serialized action, the mismatch became apparent: V encoded `0` as `d2 00 00 00 00` (5 bytes) while the server expected `00` (1 byte), producing a completely different hash and an HTTP 422 rejection on every request.

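A similar failure can be expected whenever a V program exchanges data with Python's `msgpack`, Rust's `rmp`/`rmp-serde`, Go's `vmihailenco/msgpack`, or any other spec-compliant implementation: byte-for-byte comparisons and hashes of the encoded output will not match, and small integers those libraries encode as fixints cannot be decoded on the V side (see Bug 3).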
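
To make the hashing problem concrete, the sketch below hashes the two encodings of `0` and checks that the digests differ. It uses SHA-256 from V's standard library purely as a stand-in for the Keccak-256 step described above, and the byte arrays are written out by hand rather than produced by the library.

```v
import crypto.sha256

fn main() {
	// The two encodings of the integer 0 discussed in this report
	minimal := [u8(0x00)] // positive fixint, 1 byte
	wide := [u8(0xd2), 0x00, 0x00, 0x00, 0x00] // int 32, 5 bytes

	// SHA-256 is an assumption for illustration; the exchange uses Keccak-256
	h_minimal := sha256.sum(minimal)
	h_wide := sha256.sum(wide)

	println(h_minimal.hex())
	println(h_wide.hex())

	// Both inputs decode to 0, yet the digests are unrelated
	assert h_minimal != h_wide
}
```

Both byte sequences represent the same value, so a decoder on the other side would accept either one. A signature check that hashes the serialized bytes will not.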

---

## Bug 1: Encoder — Fixed-Width Instead of Minimal Encoding

**File:** `encode.v`

### Reproduction Steps

```v
import msgpack

// Named struct used for the struct-field case below
struct Person {
    age int
}

fn main() {
    // Encode a plain integer
    result := msgpack.encode(0)
    println(result.hex())  // prints: d200000000
                           // should print: 00

    result2 := msgpack.encode(42)
    println(result2.hex()) // prints: d20000002a
                           // should print: 2a

    // Encode a struct with an integer field
    result3 := msgpack.encode(Person{age: 30})
    println(result3.hex()) // field value encoded as int32 (d2 00 00 00 1e)
                           // should be positive fixint (1e)
}
```

### Current Behaviour

The `encode[T]()` generic function dispatches on V's compile-time type and calls a fixed-width helper directly, completely bypassing the existing `encode_int` and `encode_uint` functions that already implement minimal encoding:

```v
// Current (buggy) code in encode.v
$else $if T is i8 {
    e.encode_i8(data)    // always writes 0xd0 + 1 byte (2 bytes total), even for value 0
} $else $if T is i16 {
    e.encode_i16(data)   // always writes 0xd1 + 2 bytes (3 bytes total)
} $else $if T is int {
    e.encode_i32(data)   // always writes 0xd2 + 4 bytes (5 bytes total)
} $else $if T is i64 {
    e.encode_i64(data)   // always writes 0xd3 + 8 bytes (9 bytes total)
} $else $if T is u8 {
    e.encode_u8(data)    // always writes 0xcc + 1 byte (2 bytes total)
} $else $if T is u16 {
    e.encode_u16(data)   // always writes 0xcd + 2 bytes (3 bytes total)
} $else $if T is u32 {
    e.encode_u32(data)   // always writes 0xce + 4 bytes (5 bytes total)
} $else $if T is u64 {
    e.encode_u64(data)   // always writes 0xcf + 8 bytes (9 bytes total)
}
```

Actual byte output:

```
encode(0)    → d2 00 00 00 00  (5 bytes: int32 format)
encode(42)   → d2 00 00 00 2a  (5 bytes: int32 format)
encode(-123) → d2 ff ff ff 85  (5 bytes: int32 format)
encode(1)    → d2 00 00 00 01  (5 bytes: int32 format)
```

Note also that `i32` has no case at all — it silently falls through to an unhandled branch.

### Expected Behaviour

Per the MessagePack specification, the encoder should select the smallest format that can represent the value:

```
encode(0)    → 00              (1 byte:  positive fixint)
encode(42)   → 2a              (1 byte:  positive fixint)
encode(-123) → d0 85           (2 bytes: int8 format)
encode(1)    → 01              (1 byte:  positive fixint)
```

The `encode_int` and `encode_uint` functions in the same file already implement this logic correctly — they just are not being called.

### Possible Solution

Route all integer types through `encode_int` or `encode_uint`, which already select the minimal format, and add the missing `i32` case:

```v
$else $if T is i8 {
    e.encode_int(i64(data))
} $else $if T is i16 {
    e.encode_int(i64(data))
} $else $if T is int {
    e.encode_int(i64(data))
} $else $if T is i32 {
    e.encode_int(i64(data))   // previously missing case
} $else $if T is i64 {
    e.encode_int(data)
} $else $if T is u8 {
    e.encode_uint(u64(data))
} $else $if T is u16 {
    e.encode_uint(u64(data))
} $else $if T is u32 {
    e.encode_uint(u64(data))
} $else $if T is u64 {
    e.encode_uint(data)
}
```

The `// TODO` comment acknowledging this fix should be removed at the same time.

---

## Bug 2: Config — `positive_int_unsigned` Defaults to `false`

**File:** `config.v`

### Reproduction Steps

```v
import msgpack

fn main() {
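    // Value in the range 128-255 (fits in u8, not in i8).
    // Assumes Bug 1 is already fixed, so encode() goes through encode_int;
    // on the unfixed encoder the fixed-width int32 path from Bug 1 is taken instead.
    result := msgpack.encode(200)
    println(result.hex())  // prints: d100c8  (3 bytes: signed int16)
                           // should print: ccc8  (2 bytes: unsigned uint8)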
}
```

### Current Behaviour

The `default_config()` function returns `positive_int_unsigned: false`. The `encode_int` function has a branch that checks this flag:

```v
pub fn (mut e Encoder) encode_int(i i64) {
    if e.config.positive_int_unsigned && i >= 0 {
        e.encode_uint(u64(i))  // compact unsigned path — NOT TAKEN by default
    } else if i > max_i8 {    // 127
        if i <= max_i16 {      // 32767
            e.encode_i16(i16(i))  // 200 ends up here: 3 bytes as signed int16
        }
        ...
    }
}
```

With the flag `false`, a non-negative value of `200` is larger than `max_i8` (127), so it takes the signed path and is encoded as a 3-byte signed int16 (`d1 00 c8`) even though it fits in a 2-byte unsigned uint8 (`cc c8`). Values 0–127 happen to encode correctly by coincidence (they fall into the negative fixint check `i >= -32` and write a single byte), masking the bug for the most common values.

### Expected Behaviour

Any non-negative integer should be encoded using the unsigned format family, which is always more compact than or equal in size to the signed family for non-negative values. The value `200` fits in `uint 8` and should produce 2 bytes: `cc c8`.
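
The size claim can be checked with a few lines of arithmetic. The sketch below computes the encoded length of a non-negative value in each family, using only the ranges from the format table; `signed_len` and `unsigned_len` are hypothetical names, not library functions.

```v
// Hypothetical sketch, not code from vlang/msgpack.

// signed_len: encoded size when only fixint and int 8/16/32/64 are allowed.
// int 8 never helps for non-negative values, since 0..127 is already a fixint.
fn signed_len(v u64) int {
	if v <= 0x7f {
		return 1 // positive fixint
	} else if v <= 0x7fff {
		return 3 // int 16
	} else if v <= 0x7fff_ffff {
		return 5 // int 32
	}
	return 9 // int 64
}

// unsigned_len: encoded size when fixint and uint 8/16/32/64 are allowed.
fn unsigned_len(v u64) int {
	if v <= 0x7f {
		return 1 // positive fixint
	} else if v <= 0xff {
		return 2 // uint 8
	} else if v <= 0xffff {
		return 3 // uint 16
	} else if v <= 0xffff_ffff {
		return 5 // uint 32
	}
	return 9 // uint 64
}

fn main() {
	for v in [u64(127), 200, 40000, 3_000_000_000] {
		println('${v}: signed=${signed_len(v)} unsigned=${unsigned_len(v)}')
		assert unsigned_len(v) <= signed_len(v)
	}
}
```

For `200` the signed family needs 3 bytes and the unsigned family 2, which is the difference between `d1 00 c8` and `cc c8` shown above.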

### Possible Solution

Enable `positive_int_unsigned` in the default config:

```v
pub fn default_config() Config {
    return Config{
        write_ext:             true
        positive_int_unsigned: true   // non-negative integers always use unsigned format
    }
}
```

---

## Bug 3: Decoder — Fixint Bytes Not Handled

**File:** `decode.v`

### Reproduction Steps

```v
import msgpack

fn main() {
    // Encode with any compliant library, or just construct the bytes manually.
    // Positive fixint for value 42 is a single byte: 0x2a
    data := [u8(0x2a)]

    mut val := 0
    mut decoder := msgpack.new_decoder()
    decoder.decode(data, mut val) or { println('error: ${err}') }
    // Currently fails with an error (or panics); val should end up as 42

    // Same problem for negative fixint: -5 is 0xfb
    data2 := [u8(0xfb)]
    mut val2 := 0
    decoder.decode(data2, mut val2) or { println('error: ${err}') }
    // Currently fails with an error; val2 should end up as -5
}
```

This also means that after Bug 1 is fixed, V cannot even decode its own output: `msgpack.encode(0)` produces `[0x00]`, and `decode_integer` has no handler for `0x00`.

### Expected Behaviour

For a positive fixint byte (`0x00`–`0x7f`), the format byte itself IS the integer value — no additional bytes follow. For a negative fixint byte (`0xe0`–`0xff`), the format byte reinterpreted as a signed `i8` IS the value. Both ranges must be handled.

```
decode([0x00]) → 0
decode([0x2a]) → 42
decode([0x7f]) → 127
decode([0xe0]) → -32
decode([0xfb]) → -5
decode([0xff]) → -1
```
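
The fixint rule is small enough to show in isolation. The following sketch is independent of the library's `Decoder` type; `fixint_value` is a hypothetical helper that returns the integer for a fixint format byte and `none` for any other byte.

```v
// Hypothetical sketch, not code from vlang/msgpack.

// fixint_value interprets a single format byte as a fixint, if it is one.
fn fixint_value(b u8) ?int {
	if b <= 0x7f {
		return int(b) // positive fixint: the byte is the value
	}
	if b >= 0xe0 {
		return int(i8(b)) // negative fixint: reinterpret the byte as signed
	}
	return none // some other format byte, e.g. 0xcc (uint 8)
}

fn main() {
	for b in [u8(0x00), 0x2a, 0x7f, 0xe0, 0xfb, 0xff, 0xcc] {
		if v := fixint_value(b) {
			println('0x${b.hex()} -> ${v}')
		} else {
			println('0x${b.hex()} -> not a fixint')
		}
	}
}
```

The first six bytes should reproduce the mapping listed above, and `0xcc` is reported as not being a fixint because it is the `uint 8` format byte.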

### Current Behaviour

Both `decode_integer` and `decode_to_json` use a `match` on the format byte `d.bd`. Neither has arms for the positive fixint range (`0x00`–`0x7f`) or the negative fixint range (`0xe0`–`0xff`):

```v
// Current (buggy) decode_integer
pub fn (mut d Decoder) decode_integer[T](mut val T) ! {
    data := d.buffer
    match d.bd {
        // 0x00–0x7f: positive fixint — NO HANDLER, falls to else/error
        // 0xe0–0xff: negative fixint — NO HANDLER, falls to else/error
        mp_u8 { val = data[d.pos]; d.pos++ }
        mp_u16 { ... }
        // ...
    }
}

// Current (buggy) decode_to_json
match d.bd {
    // 0x00–0x7f: positive fixint — NO HANDLER
    // 0xe0–0xff: negative fixint — NO HANDLER
    mp_u8, mp_u16, mp_u32, mp_u64, mp_i8, mp_i16, mp_i32, mp_i64 {
        ...
    }
}
```

Any integer value in the range -32 to 127, when encoded by a compliant library (including the fixed V encoder), produces a single-byte fixint. That byte is completely unrecognised by V's decoder.

### Possible Solution

Add `match` arms for both fixint ranges in both `decode_integer` and `decode_to_json`:

```v
// Fixed decode_integer
pub fn (mut d Decoder) decode_integer[T](mut val T) ! {
    data := d.buffer
    match d.bd {
        mp_pos_fix_int_min...mp_pos_fix_int_max {
            val = d.bd          // format byte IS the value (0–127), no extra bytes
        }
        mp_neg_fix_int_min...mp_neg_fix_int_max {
            val = i8(d.bd)      // format byte reinterpreted as signed (-32 to -1)
        }
        mp_u8 { val = data[d.pos]; d.pos++ }
        // ... rest unchanged
    }
}

// Fixed decode_to_json
match d.bd {
    mp_pos_fix_int_min...mp_pos_fix_int_max {
        int_val := int(d.bd)
        unsafe { result.push_many(int_val.str().str, int_val.str().len) }
    }
    mp_neg_fix_int_min...mp_neg_fix_int_max {
        int_val := int(i8(d.bd))
        unsafe { result.push_many(int_val.str().str, int_val.str().len) }
    }
    mp_u8, mp_u16, mp_u32, mp_u64, mp_i8, mp_i16, mp_i32, mp_i64 {
        // ... unchanged
    }
}
```

---

## Test Expectations Must Be Updated

### File: `encode_test.v`

The existing test assertions were written to match the buggy output. They must be corrected:

```v
// BEFORE — asserting the buggy, non-spec-compliant output
assert msgpack.encode(0)          == hex.decode('d200000000')! // int32: 5 bytes
assert msgpack.encode(42)         == hex.decode('d20000002a')! // int32: 5 bytes
assert msgpack.encode(-123)       == hex.decode('d2ffffff85')! // int32: 5 bytes
assert msgpack.encode([0])        == hex.decode('91d200000000')!
assert msgpack.encode([1, 2, 3])  == hex.decode('93d200000001d200000002d200000003')!
assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea162d20000001e')!

// AFTER — asserting correct, spec-compliant minimal encoding
assert msgpack.encode(0)          == hex.decode('00')!          // positive fixint: 1 byte
assert msgpack.encode(42)         == hex.decode('2a')!          // positive fixint: 1 byte
assert msgpack.encode(-123)       == hex.decode('d085')!        // int8: 2 bytes
assert msgpack.encode([0])        == hex.decode('9100')!        // array + fixint
assert msgpack.encode([1, 2, 3])  == hex.decode('93010203')!    // 3× fixint
assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea1621e')!
```

---

## Complete Diff

### `config.v`
```diff
 pub fn default_config() Config {
 	return Config{
-		write_ext: true
+		write_ext:             true
+		positive_int_unsigned: true
 	}
 }
```

### `encode.v`
```diff
-	// TODO: if int encode_int, if uint encode_uint
-	// instead of needing to check each type, also
-	// then we will be using the smallest storage
 	$else $if T is i8 {
-		e.encode_i8(data)
+		e.encode_int(i64(data))
 	} $else $if T is i16 {
-		e.encode_i16(data)
+		e.encode_int(i64(data))
 	} $else $if T is int {
-		e.encode_i32(data)
+		e.encode_int(i64(data))
+	} $else $if T is i32 {
+		e.encode_int(i64(data))
 	} $else $if T is i64 {
-		e.encode_i64(data)
+		e.encode_int(data)
 	} $else $if T is u8 {
-		e.encode_u8(data)
+		e.encode_uint(u64(data))
 	} $else $if T is u16 {
-		e.encode_u16(data)
+		e.encode_uint(u64(data))
 	} $else $if T is u32 {
-		e.encode_u32(data)
+		e.encode_uint(u64(data))
 	} $else $if T is u64 {
-		e.encode_u64(data)
+		e.encode_uint(data)
 	}
```

### `decode.v`
```diff
+		mp_pos_fix_int_min...mp_pos_fix_int_max {
+			int_val := int(d.bd)
+			unsafe { result.push_many(int_val.str().str, int_val.str().len) }
+		}
+		mp_neg_fix_int_min...mp_neg_fix_int_max {
+			int_val := int(i8(d.bd))
+			unsafe { result.push_many(int_val.str().str, int_val.str().len) }
+		}
 		mp_u8, mp_u16, mp_u32, mp_u64, mp_i8, mp_i16, mp_i32, mp_i64 {
```
```diff
 pub fn (mut d Decoder) decode_integer[T](mut val T) ! {
 	data := d.buffer
 	match d.bd {
+		mp_pos_fix_int_min...mp_pos_fix_int_max {
+			val = d.bd
+		}
+		mp_neg_fix_int_min...mp_neg_fix_int_max {
+			val = i8(d.bd)
+		}
 		mp_u8 {
```

### `encode_test.v`
```diff
-	assert msgpack.encode(0) == hex.decode('d200000000')!
-	assert msgpack.encode(42) == hex.decode('d20000002a')!
-	assert msgpack.encode(-123) == hex.decode('d2ffffff85')!
+	assert msgpack.encode(0) == hex.decode('00')!
+	assert msgpack.encode(42) == hex.decode('2a')!
+	assert msgpack.encode(-123) == hex.decode('d085')!

-	assert msgpack.encode([0]) == hex.decode('91d200000000')!
+	assert msgpack.encode([0]) == hex.decode('9100')!
-	assert msgpack.encode([1, 2, 3]) == hex.decode('93d200000001d200000002d200000003')! // REVIEW
+	assert msgpack.encode([1, 2, 3]) == hex.decode('93010203')!

-	assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea162d20000001e')!
+	assert msgpack.encode(Struct{'John', 30}) == hex.decode('82a161a44a6f686ea1621e')!
```

---

## Verification

After applying all fixes, all three existing test suites pass with no modifications beyond the corrected assertions in `encode_test.v`:

```
---- Testing... ----------------------------------------------------------------
OK    [1/3]  decode_test.v
OK    [2/3]  encode_test.v
OK    [3/3]  decode_to_json_test.v
--------------------------------------------------------------------------------
Summary: 3 passed, 3 total.
```

### Cross-Language Validation

The fixed V output matches Python's `msgpack` library byte-for-byte:

```python
import msgpack
assert msgpack.packb(0)        == b'\x00'
assert msgpack.packb(42)       == b'\x2a'
assert msgpack.packb(-123)     == b'\xd0\x85'
assert msgpack.packb([0])      == b'\x91\x00'
assert msgpack.packb([1,2,3])  == b'\x93\x01\x02\x03'
```

---

## References

- MessagePack specification: https://github.com/msgpack/msgpack/blob/master/spec.md#int-format-family
- Original `// TODO` comment in `encode.v`: commit `55d09a0` (and prior)
- Python reference implementation: https://github.com/msgpack/msgpack-python


### Additional Information/Context

_No response_

### V version

0.5

### Environment details (OS name and version, etc.)

CachyOS