-
Notifications
You must be signed in to change notification settings - Fork 260
feat: Scheduler supports substrait logical plan and remove deprecated sql support
#1360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Scheduler supports substrait logical plan and remove deprecated sql support
#1360
Conversation
|
thanks @mattcuento, would you be interested in adding substrait as well as part of this PR? |
Sure thing! |
7119f99 to
887096b
Compare
Sql field from ExecuteQueryParams.Querysubstrait_plan and remove deprecated sql field from ExecuteQueryParams.Query
|
Hey @milenkovicm, I added support for Substrait on Assuming that's true, it was a little tricky to test this well aside from calling the scheduler RPC service directly ( |
milenkovicm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think change make sense, and test as well, thanks @mattcuento
there is one issue with docker build and looks like issue with disk space failing other (not sure how to fix)
|
Maybe as a follow up we should put a bit more documentation around this and example(s) |
|
Also, could we gate substrait with config option, which could be on by default? |
1816b45 to
f75d42e
Compare
Thanks, looks like From this issue I gathered that we could compile
Agreed, happy to file an issue to track it. I'm still kind of curious if more changes would be desired for user ergonomics. Do we have existing examples of connecting to a scheduler besides the client? I'd imagine it might be useful to add some convenience/wrapper methods to create scheduler gRPC clients, such as combining I don't know much about Ibis, but will take a look to see how it would integrate with this.
Done! Updated the PR description, added support/conditional dependencies under |
milenkovicm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @mattcuento changes make sense, we just need to fix build issue and we can merge this
c68a693 to
ed838e9
Compare
reserving field id 2 and field name sql
|
not sure why it does not start github actions, maybe you should rebase |
1697215 to
b04b37b
Compare
|
Will review the latest |
|
perhaps, action getting out of disk space ? looks like linker freaking out, which should not be related to your change |
b04b37b to
12c2c61
Compare
|
Ok it does seem that the issues for You can monitor disk space with I've tried a bunch of different tricks to free space (actions/runner-images#2840 for example) but no luck. We run tests in a container so we don't get access to the root file system to clean much up. We could alternatively run tests outside of a container which allows us more freedom to delete things. This is what space looks like prior to building/running tests: I also don't know what kind of plan apache repos are on. Could we potentially just use a larger runner class? |
12c2c61 to
c6b02f8
Compare
695ff88 to
3879489
Compare
|
would |
|
milenkovicm@b60d245 adding |
c6b02f8 to
8706cd3
Compare
|
I couldn't replicate the error on my fork, but add this to the CI section in reduce the ./target directory size by ~10%. I'm not sure how |
|
I have tested with |
72a8639 to
72b3db6
Compare
|
Interesting, @milenkovicm, I tried However, I did just try @Huy1Ng suggestion for the CI profile and that seemed to work. I've cleaned up some commit history now, hopefully should be ready to merge. Thank you both for the suggestions here 🙂 |
milenkovicm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @mattcuento
one minor comment
| .sql(&sql) | ||
| .await | ||
| .and_then(|df| df.into_optimized_plan()) | ||
| Query::SubstraitPlan(bytes) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
warning: unused variable: `bytes`
--> ballista/scheduler/src/scheduler_server/grpc.rs:401:38
|
401 | Query::SubstraitPlan(bytes) => {
| ^^^^^ help: if this is intentional, prefix it with an underscore: `_bytes`
|
= note: `#[warn(unused_variables)]` (part of `#[warn(unused)]`) on by default
if substrait disabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed!
Linting + using non-reserved field id Adding test for substrait compatibility with local scheduler instance add inline expressio to substrait test protect substrait support under feature Adding protoc feature to datafusion-substrait package Removing protoc feature Bump rust base version Rebuild Cargo.lock spacing spacing Add cargo clean before test remove cargo clean Fix linting for feature off
72b3db6 to
c66b160
Compare
substrait_plan and remove deprecated sql field from ExecuteQueryParams.Querysql support
sql supportsql support
sql supportsubstrait logical plan and remove deprecated sql support
|
great effort, thanks a lot @mattcuento |

Which issue does this PR close?
Closes #1358.
Part of #32.
Rationale for this change
As part of this epic, adding server-side support for Substrait plans (to the Scheduler
ExecuteQueryAPI). Substrait support is gated by thesubstraitfeature with conditional depdencies fordatafusion-substraitand thereforesubstrait. This feature is on by default.The
sqlfield was marked as deprecated from the Ballista protocol two years back here. This is clean up of the field. The field ID and name have been marked as reserved to prevent corrupted requests if client/server versions drift across this change.What changes are included in this PR?
rust:1.92-trixiein order to pull an updatedprotobuf-compilerversion that supports optional fields for proto3 by default.Are there any user-facing changes?