Refactor data block footer to reserve metadata bits for future features#14332
Refactor data block footer to reserve metadata bits for future features#14332pdillinger wants to merge 3 commits intofacebook:mainfrom
Conversation
Summary: Refactor the data block footer encoding/decoding to use a struct-based Encode/Decode API (DataBlockFooter), reserving the top 4 bits of the footer for metadata: - Bit 31: Hash index present (kDataBlockBinaryAndHash) - existing use - Bits 28-30: Reserved for future features Comments have some detail for why it is safe to assume no practical existing SST files would use these newly reserved bits. And for forward compatibility, existing versions detect (non-zero) use of these new bits as impossibly large num_restarts and report "bad block contents". Not perfect, but not bad. Key changes: - Replace PackIndexTypeAndNumRestarts/UnPackIndexTypeAndNumRestarts with DataBlockFooter::EncodeTo/DecodeFrom methods - DecodeFrom returns a detailed error when reserved bits are set, enabling graceful failure on newer format versions - Reduce kMaxNumRestarts from 2^31-1 to 2^28-1 (268M), which is adequate for the maximum possible restarts in a 4GiB block - Add GetCorruptionStatus() to Block for detailed error messages (Note that we are sensitive to the size of Block objects, so have to avoid adding unnecessary new members.) - Remove obsolete kMaxBlockSizeSupportedByHashIndex size checks Test Plan: - Existing unit tests and format compatibility test - Add test for reserved bit detection (ReservedBitInDataBlockFooter)
|
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D93293152. |
table/block_based/block.cc
Outdated
| return s; // Return the detailed error from DecodeFrom | ||
| } | ||
| // Footer decoded OK, so error was in later processing (shouldn't happen) | ||
| assert(!s.ok()); |
There was a problem hiding this comment.
is this assertion needed? the above if statement returns on !s.ok()
There was a problem hiding this comment.
For better grouping of crash test failures, we try to avoid "assert(false)" or similar generic failure modes. I could improve this one further, but it might get excessive if we explain that on every assertion that is intended to fail if it's reached. We should probably code up a custom failure API for this.
table/block_based/block.cc
Outdated
| TEST_SYNC_POINT("Block::Block:0"); | ||
| auto& size = contents_.data.size_; | ||
| if (size < sizeof(uint32_t)) { | ||
| if (size < DataBlockFooter::kMinEncodedLength) { |
There was a problem hiding this comment.
is this if necessary anymore if we are checking for the size in DecodeFrom
There was a problem hiding this comment.
Good catch I can fix that.
|
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D93293152. |
|
@pdillinger merged this pull request in 641f470. |
Summary: I'm implementing this intending it to be used for #14287
Refactor the data block footer encoding/decoding to use a struct-based Encode/Decode API (DataBlockFooter), reserving the top 4 bits of the footer for metadata:
Comments have some detail for why it is safe to assume no practical existing SST files would use these newly reserved bits. And for forward compatibility, existing versions detect (non-zero) use of these new bits as impossibly large num_restarts and report "bad block contents". Not perfect, but not bad.
Key changes:
Test Plan: