-
Notifications
You must be signed in to change notification settings - Fork 260
feat: Improve fetch partition performance, support skip validation arrow ipc files #1216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thanks @westhide this PR makes sense, It also brings configurable validation option, which I did not really think about.
|
Q1: Q2: AdditionShould we consider the power down scene? It may cause Arrow ipc file broken if |
I guess we can safely assume that only shuffle files are accessed, not a random files.
what I was having in mind is enabling unsafe (without validation) by default but having a executor configuration switch which could revert this. It might be easy to cover case with arrow flight but a bit harder when ShuffleReader reads local shuffle file
Im not sure this case can be achieved with this validation? how can we be sure that file is written fully? I guess we'd need some kind of checksum for this scenario. |
All right, will move the validation config to Executor. And will try to add a checksum validation with failed Job rerun after finish the wasm udf. |
eeaab58 to
972dbd0
Compare
|
hey @westhide are you still interested to get this PR merged? |
|
moving to draft as its waiting for changes |
|
As far as I can see, we don't have to validate the IPC files:
I think it makes sense to support job recovery either by stage or by completed task rather than per IPC file. |
|
Is there anything left to do apart from the config based setup on the executor for this PR? |
|
yes we can add it, please do @killzoner |
|
maybe as a follow up, #1316 (comment) point 1, 2 would complement this work |
That's clear, thank you |
Which issue does this PR close?
Closes #1189.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?