Skip to content

Comments

Refactor data block footer to reserve metadata bits for future features#14332

Closed
pdillinger wants to merge 3 commits intofacebook:mainfrom
pdillinger:data_block_footer_refactor
Closed

Refactor data block footer to reserve metadata bits for future features#14332
pdillinger wants to merge 3 commits intofacebook:mainfrom
pdillinger:data_block_footer_refactor

Conversation

@pdillinger
Copy link
Contributor

Summary: I'm implementing this intending it to be used for #14287

Refactor the data block footer encoding/decoding to use a struct-based Encode/Decode API (DataBlockFooter), reserving the top 4 bits of the footer for metadata:

  • Bit 31: Hash index present (kDataBlockBinaryAndHash) - existing use
  • Bits 28-30: Reserved for future features

Comments have some detail for why it is safe to assume no practical existing SST files would use these newly reserved bits. And for forward compatibility, existing versions detect (non-zero) use of these new bits as impossibly large num_restarts and report "bad block contents". Not perfect, but not bad.

Key changes:

  • Replace PackIndexTypeAndNumRestarts/UnPackIndexTypeAndNumRestarts with DataBlockFooter::EncodeTo/DecodeFrom methods
  • DecodeFrom returns a detailed error when reserved bits are set, enabling graceful failure on newer format versions
  • Reduce kMaxNumRestarts from 2^31-1 to 2^28-1 (268M), which is adequate for the maximum possible restarts in a 4GiB block
  • Add GetCorruptionStatus() to Block for detailed error messages (Note that we are sensitive to the size of Block objects, so have to avoid adding unnecessary new members.)
  • Remove obsolete kMaxBlockSizeSupportedByHashIndex size checks

Test Plan:

  • Existing unit tests and format compatibility test
  • Add test for reserved bit detection (ReservedBitInDataBlockFooter)

Summary:
Refactor the data block footer encoding/decoding to use a struct-based
Encode/Decode API (DataBlockFooter), reserving the top 4 bits of the
footer for metadata:
- Bit 31: Hash index present (kDataBlockBinaryAndHash) - existing use
- Bits 28-30: Reserved for future features

Comments have some detail for why it is safe to assume no practical
existing SST files would use these newly reserved bits. And for forward
compatibility, existing versions detect (non-zero) use of these new bits
as impossibly large num_restarts and report "bad block contents". Not
perfect, but not bad.

Key changes:
- Replace PackIndexTypeAndNumRestarts/UnPackIndexTypeAndNumRestarts with
  DataBlockFooter::EncodeTo/DecodeFrom methods
- DecodeFrom returns a detailed error when reserved bits are set,
  enabling graceful failure on newer format versions
- Reduce kMaxNumRestarts from 2^31-1 to 2^28-1 (268M), which is adequate
  for the maximum possible restarts in a 4GiB block
- Add GetCorruptionStatus() to Block for detailed error messages
  (Note that we are sensitive to the size of Block objects, so have to
  avoid adding unnecessary new members.)
- Remove obsolete kMaxBlockSizeSupportedByHashIndex size checks

Test Plan:
- Existing unit tests and format compatibility test
- Add test for reserved bit detection (ReservedBitInDataBlockFooter)
@meta-codesync
Copy link

meta-codesync bot commented Feb 13, 2026

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D93293152.

return s; // Return the detailed error from DecodeFrom
}
// Footer decoded OK, so error was in later processing (shouldn't happen)
assert(!s.ok());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this assertion needed? the above if statement returns on !s.ok()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For better grouping of crash test failures, we try to avoid "assert(false)" or similar generic failure modes. I could improve this one further, but it might get excessive if we explain that on every assertion that is intended to fail if it's reached. We should probably code up a custom failure API for this.

TEST_SYNC_POINT("Block::Block:0");
auto& size = contents_.data.size_;
if (size < sizeof(uint32_t)) {
if (size < DataBlockFooter::kMinEncodedLength) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this if necessary anymore if we are checking for the size in DecodeFrom

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch I can fix that.

@meta-codesync
Copy link

meta-codesync bot commented Feb 17, 2026

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D93293152.

@meta-codesync meta-codesync bot closed this in 641f470 Feb 18, 2026
@meta-codesync
Copy link

meta-codesync bot commented Feb 18, 2026

@pdillinger merged this pull request in 641f470.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants