Skip to content

MONGOCRYPT-922 Make padding strings unique#1182

Draft
kevinAlbs wants to merge 3 commits into
mongodb:masterfrom
kevinAlbs:padding-repeats.1
Draft

MONGOCRYPT-922 Make padding strings unique#1182
kevinAlbs wants to merge 3 commits into
mongodb:masterfrom
kevinAlbs:padding-repeats.1

Conversation

@kevinAlbs
Copy link
Copy Markdown
Collaborator

@kevinAlbs kevinAlbs commented May 26, 2026

Summary

Make padding strings unique.

Background & Motivation

See report in MONGOCRYPT-913 for details. This addresses finding with ID 1.

Quoting https://docs.google.com/document/d/1KhCFiofunsk7VeVB4VftvxFd9rQQOZrBQaTqHCljHtU/edit?tab=t.0:

appended with a 0xFF byte, which makes it an invalid UTF-8 sequence

This PR proposes appending a four byte counter after the 0xFF to ensure all padding strings are unique.

Testing

Appending an additional four bytes resulted in needing to extend the test data in RNG_DATA.h in substring tests to avoid the error "Out of random data".

To additionally verify changes, the QE Text driver spec and prose tests were run locally with the C driver. The only necessary changes in tests were to update the exoected __safeContent__ (which I expect is a direct result of updating the padding string contents).

@kevinAlbs kevinAlbs force-pushed the padding-repeats.1 branch from df6de4b to c2acc24 Compare May 26, 2026 13:01
kevinAlbs added 3 commits May 26, 2026 09:12
Gets test passing to ensure padding tokens are obtained from unique strings
@kevinAlbs kevinAlbs force-pushed the padding-repeats.1 branch from c2acc24 to 31adbe6 Compare May 26, 2026 13:13
@kevinAlbs kevinAlbs requested review from erwee and mdb-ad May 26, 2026 14:22
@kevinAlbs kevinAlbs marked this pull request as ready for review May 26, 2026 14:23
@kevinAlbs kevinAlbs requested a review from a team as a code owner May 26, 2026 14:23
Copy link
Copy Markdown
Collaborator

@erwee erwee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this with the crypto team. We'll be disputing the finding that an adversary can tell the real affix length after the document has been inserted (They shouldn't because each of the "fake" padding tokens have already been mapped to unique safeContent tags). So I don't think this change will be needed (perhaps not yet).

@kevinAlbs
Copy link
Copy Markdown
Collaborator Author

adversary can tell the real affix length

Does QE Text intend to protect against an adversary intercepting traffic to the server? I expect "yes". And the FLE2InsertUpdatePayloadV2 sent from the client could be decoded to get the insert length. See decode_length.py for an example printing the length by decoding the old insert-prefix.json payload.

@kevinAlbs kevinAlbs requested a review from erwee May 27, 2026 15:23
Copy link
Copy Markdown
Collaborator

@erwee erwee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this change would also require a server-side change because the server depends on the padding tokens being identical and being adjacent to each other. Besides that, it would also prompt a change in the OST paper to account for the extra counter bytes to disambiguate the padding strings. As for this latter point, I'll have to discuss with Seny & Tarik first.

@kevinAlbs kevinAlbs marked this pull request as draft May 27, 2026 19:03
@kevinAlbs kevinAlbs removed the request for review from mdb-ad May 27, 2026 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants