feat(duckdb)!: support transpilation of function LEFT from Snowflake … by fivetran-ashashankar · Pull Request #7419 · tobymao/sqlglot

fivetran-ashashankar · 2026-03-30T22:40:27Z

…to DuckDB

Add left_sql() method to handle BINARY/BLOB arguments
Convert BLOB to HEX string, multiply length by 2, apply LEFT, convert back with UNHEX
For VARCHAR arguments, use native DuckDB LEFT function

sqlglot/generators/duckdb.py

geooo109 · 2026-03-31T09:36:43Z

@fivetran-ashashankar Let's also add a check on the query, if the second argument (length) is negative we have to return an empty string/blob.

memory D SELECT LEFT('HELLO', -2);
┌─────────────────────┐
│ "left"('HELLO', -2) │
│       varchar       │
├─────────────────────┤
│ HEL                 │
└─────────────────────┘
memory D SELECT RIGHT('HELLO', -2);
┌──────────────────────┐
│ "right"('HELLO', -2) │
│       varchar        │
├──────────────────────┤
│ LLO                  │

This ^ isn't covered for both RIGHT and LEFT. Snowflake for negative length doesn't return a substring both for string and blob inputs.

So, in this approach #7419 (comment) we can also add a check for the length argument.

The check should be added on top of the generated query. As we have already discussed we have to cover cases like the following:

WITH t AS (SELECT -2 AS l) SELECT LEFT(TO_BINARY('SNOWIKOPN', 'utf-8'), l) FROM t;

We don't know the value of l at generation time.

github-actions · 2026-03-31T19:24:21Z

SQLGlot Integration Test Results

Comparing:

this branch (sqlglot:RD-1147725-transpile-LEFT, sqlglot version: RD-1147725-transpile-LEFT)
baseline (main, sqlglot version: 0.0.1.dev1)

By Dialect

dialect	main	sqlglot:RD-1147725-transpile-LEFT	transitions	links
bigquery -> bigquery	23872/23877 passed (100.0%)	21252/21252 passed (100.0%)	No change	full result / delta
bigquery -> duckdb	2105/2624 passed (80.2%)	0/0 passed (0.0%)	No change	full result / delta
snowflake -> duckdb	1511/2674 passed (56.5%)	1511/2674 passed (56.5%)	No change	full result / delta
snowflake -> snowflake	65910/65910 passed (100.0%)	65910/65910 passed (100.0%)	No change	full result / delta
databricks -> databricks	1364/1364 passed (100.0%)	1364/1364 passed (100.0%)	No change	full result / delta
postgres -> postgres	6040/6040 passed (100.0%)	6040/6040 passed (100.0%)	No change	full result / delta
redshift -> redshift	7059/7059 passed (100.0%)	7059/7059 passed (100.0%)	No change	full result / delta

Overall

main: 109548 total, 107861 passed (pass rate: 98.5%), sqlglot version: 0.0.1.dev1

sqlglot:RD-1147725-transpile-LEFT: 108302 total, 107139 passed (pass rate: 98.9%), sqlglot version: RD-1147725-transpile-LEFT

Transitions:
No change

✅ 1 test(s) passed

fivetran-ashashankar · 2026-03-31T19:47:59Z

@fivetran-ashashankar Let's also add a check on the query, if the second argument (length) is negative we have to return an empty string/blob.
memory D SELECT LEFT('HELLO', -2);
┌─────────────────────┐
│ "left"('HELLO', -2) │
│       varchar       │
├─────────────────────┤
│ HEL                 │
└─────────────────────┘
memory D SELECT RIGHT('HELLO', -2);
┌──────────────────────┐
│ "right"('HELLO', -2) │
│       varchar        │
├──────────────────────┤
│ LLO                  │
This ^ isn't covered for both RIGHT and LEFT. Snowflake for negative length doesn't return a substring both for string and blob inputs.

So, in this approach #7419 (comment) we can also add a check for the length argument.

The check should be added on top of the generated query. As we have already discussed we have to cover cases like the following:
WITH t AS (SELECT -2 AS l) SELECT LEFT(TO_BINARY('SNOWIKOPN', 'utf-8'), l) FROM t;
We don't know the value of l at generation time.
added runtime checks for negative length values:
For VARCHAR: returns '' when length < 0
For BLOB: returns UNHEX('') when length < 0
Optimized to skip CASE for positive literals (e.g., LEFT('HELLO', 2) generates clean LEFT('HELLO', 2))
Adds CASE for all non-literals (columns, functions, expressions) since we can't determine the value at transpile time

WITH t AS (SELECT -2 AS l) SELECT LEFT(TO_BINARY('SNOWIKOPN', 'utf-8'), l) FROM t; transpiles to WITH t AS (SELECT -2 AS l) SELECT CASE WHEN l < 0 THEN UNHEX('') ELSE UNHEX(LEFT(HEX(ENCODE('SNOWIKOPN')), l * 2)) END FROM t;

identified l as a column, adds case and handles the blob convertion

sqlglot/generators/duckdb.py

…to DuckDB - Add left_sql() method to handle BINARY/BLOB arguments - Convert BLOB to HEX string, multiply length by 2, apply LEFT, convert back with UNHEX - For VARCHAR arguments, use native DuckDB LEFT function

…view comments incorporated.

…ew comments. existing test fixes.

…ew comments. exp.case used

geooo109 · 2026-04-01T17:50:44Z

sqlglot/generators/duckdb.py

+    def _left_right_sql(self, expression: exp.Left | exp.Right, func_name: str) -> str:
        arg = expression.this
        length = expression.expression

-        # For BINARY/BLOB: DuckDB doesn't support RIGHT on BLOB
-        # Convert to HEX string, use RIGHT, then convert back to BLOB
+        # Only add negative length handling for Snowflake-originated queries
+        needs_case = expression.args.get("negative_length_returns_empty")
+
+        # For BINARY/BLOB: DuckDB doesn't support LEFT/RIGHT on BLOB
+        # Convert to HEX string, use LEFT/RIGHT, then convert back to BLOB
        if _is_binary(arg):
-            # RIGHT(blob, n) becomes UNHEX(RIGHT(HEX(blob), n * 2))
+            # LEFT/RIGHT(blob, n) becomes UNHEX(LEFT/RIGHT(HEX(blob), n * 2))
            # Each byte becomes 2 hex chars, so multiply length by 2
            hex_arg = exp.Hex(this=arg)
            hex_length = exp.Mul(this=length, expression=exp.Literal.number(2))
-            # since this exp.Right is not annotated, it won't enter this _is_binary branch during the recursive call
-            hex_right = self.func("RIGHT", hex_arg, hex_length)
-            result = exp.Unhex(this=hex_right)
+            result = exp.Unhex(this=self.func(func_name, hex_arg, hex_length))
+
+            # Handle negative length: return empty blob
+            if needs_case:
+                empty_blob = exp.Unhex(this=exp.Literal.string(""))
+                case_expr = (
+                    exp.case().when(length < exp.Literal.number(0), empty_blob).else_(result)
+                )
+                return self.sql(case_expr)
            return self.sql(result)

-        # For VARCHAR: Use native RIGHT function
-        return self.func("RIGHT", arg, length)
+        # For VARCHAR: Use native LEFT/RIGHT function
+        # Handle negative length: return empty string
+        if needs_case:
+            func_expr = exp.func(func_name, arg, length)
+            empty_string = exp.Literal.string("")
+            case_expr = (
+                exp.case().when(length < exp.Literal.number(0), empty_string).else_(func_expr)
+            )
+            return self.sql(case_expr)
+        return self.func(func_name, arg, length)


Let's minimize this to something like:

def _left_right_sql(self, expression: exp.Left | exp.Right, func_name: str) -> str: arg = expression.this length = expression.expression is_binary = _is_binary(arg) if is_binary: hex_arg = exp.Hex(this=arg) # LEFT/RIGHT(blob, n) becomes UNHEX(LEFT/RIGHT(HEX(blob), n * 2)) # Each byte becomes 2 hex chars, so multiply length by 2 hex_length = exp.Mul(this=length, expression=exp.Literal.number(2)) result = exp.Unhex(this=self.func(func_name, hex_arg, hex_length)) else: result = exp.func(func_name, arg, length) if expression.args.get("negative_length_returns_empty"): empty = exp.Literal.string("") if is_binary: empty = exp.Unhex(this=empty) result = exp.case().when(length < exp.Literal.number(0), empty).else_(result) return self.sql(result)

…ew comments. minimized code.

- Add DuckDB roundtrip tests for LEFT/RIGHT with literals and columns - Consolidate and expand Snowflake LEFT/RIGHT transpilation tests - Test both VARCHAR and BINARY data types

geooo109 reviewed Mar 31, 2026

View reviewed changes

sqlglot/generators/duckdb.py Outdated Show resolved Hide resolved

fivetran-ashashankar self-assigned this Mar 31, 2026

fivetran-ashashankar force-pushed the RD-1147725-transpile-LEFT branch from e25281d to a4e768d Compare March 31, 2026 18:59

geooo109 reviewed Mar 31, 2026

View reviewed changes

sqlglot/generators/duckdb.py Outdated Show resolved Hide resolved

sqlglot/generators/duckdb.py Outdated Show resolved Hide resolved

fivetran-ashashankar added 5 commits April 1, 2026 10:00

feat(duckdb)!: support transpilation of function LEFT from Snowflake …

be19be9

…to DuckDB - Add left_sql() method to handle BINARY/BLOB arguments - Convert BLOB to HEX string, multiply length by 2, apply LEFT, convert back with UNHEX - For VARCHAR arguments, use native DuckDB LEFT function

feat(tests): update integration tests submodule for LEFT function

53cb060

feat(snowflake)!: Transpilation support for LEFT RIGHT functions . re…

e0911fc

…view comments incorporated.

feat(snowflake)!: Transpilation support for LEFT RIGHT functions Revi…

a5d9ce7

…ew comments. existing test fixes.

feat(snowflake)!: Transpilation support for LEFT RIGHT functions Revi…

b700525

…ew comments. exp.case used

fivetran-ashashankar force-pushed the RD-1147725-transpile-LEFT branch from 9a7b596 to b700525 Compare April 1, 2026 17:06

geooo109 reviewed Apr 1, 2026

View reviewed changes

fivetran-ashashankar added 2 commits April 1, 2026 11:24

feat(snowflake)!: Transpilation support for LEFT RIGHT functions Revi…

ff2a5f3

…ew comments. minimized code.

feat(tests): update integration tests for LEFT/RIGHT functions

649cb3d

- Add DuckDB roundtrip tests for LEFT/RIGHT with literals and columns - Consolidate and expand Snowflake LEFT/RIGHT transpilation tests - Test both VARCHAR and BINARY data types

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(duckdb)!: support transpilation of function LEFT from Snowflake …#7419

feat(duckdb)!: support transpilation of function LEFT from Snowflake …#7419
fivetran-ashashankar wants to merge 7 commits intomainfrom
RD-1147725-transpile-LEFT

fivetran-ashashankar commented Mar 30, 2026

Uh oh!

Uh oh!

geooo109 commented Mar 31, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 31, 2026

Uh oh!

fivetran-ashashankar commented Mar 31, 2026

Uh oh!

Uh oh!

Uh oh!

geooo109 Apr 1, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

fivetran-ashashankar commented Mar 30, 2026

Uh oh!

Uh oh!

geooo109 commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Mar 31, 2026

SQLGlot Integration Test Results

By Dialect

Overall

Uh oh!

fivetran-ashashankar commented Mar 31, 2026

Uh oh!

Uh oh!

Uh oh!

geooo109 Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

geooo109 commented Mar 31, 2026 •

edited

Loading

geooo109 Apr 1, 2026 •

edited

Loading