Skip to content

properly scope table references when binding#5673

Draft
pedrocarlo wants to merge 5 commits intotursodatabase:mainfrom
pedrocarlo:scoping-col
Draft

properly scope table references when binding#5673
pedrocarlo wants to merge 5 commits intotursodatabase:mainfrom
pedrocarlo:scoping-col

Conversation

@pedrocarlo
Copy link
Collaborator

Description

This PR replaces ad-hoc column name binding in core/translate/expr.rs with a composable trait-based scope system.

The previous binder used flat behavior flags and direct iteration over joined_tables + outer_query_refs, which made clause-specific scoping fragile. This refactor introduces explicit scope composition at each call site so each SQL clause can define its own name-resolution precedence and visibility.

Design

New module

  • Added core/translate/scope.rs with:
    • Scope trait
    • ResolvedColumn and LookupResult
    • Scope implementations:
      • FullTableScope (joined tables, then outer refs)
      • LocalTableScope (joined tables only)
      • AliasScope (result-column alias expansion)
      • EmptyScope (resolves nothing)
    • ChainedScope<I, O> combinator for precedence layering
    • Extracted lookup helpers for joined/outer + qualified/unqualified resolution

Binder API change

  • bind_and_rewrite_expr signature changed to:
pub fn bind_and_rewrite_expr(
    expr: &mut ast::Expr,
    scope: &mut impl Scope,
    resolver: &Resolver<'_>,
    tolerate_unbound: bool,
) -> Result<()>
  • Removed BindingBehavior.
  • Expr::Id and Expr::Qualified now delegate to Scope.
  • Kept Expr::DoublyQualified and Expr::FunctionCallStar as binder special-cases (using scope.table_references_mut() when needed).

Scoping behavior changes

  • GROUP BY / HAVING / ORDER BY alias chains now use AliasScope -> LocalTableScope where appropriate, preventing accidental outer-reference capture.
  • WHERE and general expression contexts continue to use full table visibility via FullTableScope (with optional alias fallback depending on clause semantics).

Motivation and context

I caught some scoping issues with differential fuzzer where a GROUP BY in an inner query was resolving a column from an outer query that had the same name but was from a different table. So I wanted to try and solve this once and for all. Look at the .sqltest for an example

Description of AI Usage

Planned with Claude and executed by Codex

@jussisaurio
Copy link
Collaborator

glad youre making something like this. will read maybe tomorrow. i think we should try to solve all of our scoped binding problems in one go, eg. allowing correlated subs in all the positions that we've currently disabled due to scope issues

@pedrocarlo
Copy link
Collaborator Author

yeah, I still need to read bit more of the PR to see if I'm actually solving all scoping problems. I need some more sqltests to prove this. As we don't have this "spec" written somewhere of all the scoping behaviours we expect, the AI and me cannot be 100% sure we did the right thing.
Like what should be correct behaviour for a uncorrelated Subquery in Order By position, that is itself inside a correlated subquery. Stuff like this

@pedrocarlo pedrocarlo changed the title Scoping col properly scope table references when binding Feb 28, 2026
@pedrocarlo
Copy link
Collaborator Author

Ok I'm still brainstorming a bit how we can try and solve this correctly. I'm also looking into datafusion and seeing if we can get some more interesting ideas from them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants