Collaborator

@huan233usc huan233usc commented Jan 5, 2026

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

How was this patch tested?

Does this PR introduce any user-facing changes?

…t useMetadataRowIndex control

This PR refactors DeltaParquetFileFormatBase to allow the V2 connector to explicitly
control whether to use Parquet's _metadata.row_index for deletion vector filtering.

Changes:
- Add useMetadataRowIndexOpt parameter to DeltaParquetFileFormatBase
- Update DeltaParquetFileFormatV2 with new constructor accepting useMetadataRowIndex
- V1 connector behavior unchanged (reads from config when None)

This refactor enables:
- Phase 1 (DV basic read): Pass false to disable file splitting
- Phase 3 (DV with splitting): Pass true to use _metadata.row_index
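
A rough sketch of the override pattern described above, not the actual Delta code: an optional constructor parameter takes precedence when set, and the format falls back to session config when it is None. `SessionConf`, `FileFormatSketch`, and `isSplittable` are hypothetical names used only for illustration; the link between the flag and splitting follows the phase notes above.

```scala
object RowIndexSketch {
  // Hypothetical stand-in for a session config lookup; not Delta's real API.
  final case class SessionConf(useMetadataRowIndex: Boolean)

  // Hypothetical simplified format class illustrating the Option-based override.
  class FileFormatSketch(
      conf: SessionConf,
      useMetadataRowIndexOpt: Option[Boolean] = None) {

    // An explicit Some(...) wins; None falls back to config (the V1 path).
    val useMetadataRowIndex: Boolean =
      useMetadataRowIndexOpt.getOrElse(conf.useMetadataRowIndex)

    // Assumption taken from the phase notes: passing false disables file splitting.
    def isSplittable: Boolean = useMetadataRowIndex
  }
}
```

Under these assumptions, a V1-style caller would construct `new FileFormatSketch(conf)` and inherit the config value, while a V2-style caller would pass `Some(false)` for Phase 1 and `Some(true)` for Phase 3.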
@huan233usc huan233usc force-pushed the stack/dv_pr1_refactor_parquet_format branch from 4143b73 to 187b123 Compare January 5, 2026 13:06