
SNOW-3066557: UnicodeDecodeError when executing SQL files with UTF-8 encoding on Japanese Windows #2759

@KazunoriMatsuzawa

SnowCLI version

3.15.0

Python version

3.11 (embedded in PyApp)

Platform

Windows 10/11 (Japanese locale, CP932 default encoding)

What happened

Snowflake CLI fails to execute UTF-8-encoded SQL files containing non-ASCII characters (Japanese comments) in a Japanese Windows environment, even with PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 set.

Console output

Actual Result:
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 88: illegal multibyte sequence

Error Traceback:
File "...\snowflake\cli\_plugins\sql\statement_reader.py", line 233, in files_reader
    stmts = split_statements(io.StringIO(f.read()), remove_comments)
UnicodeDecodeError: 'cp932' codec can't decode byte 0x86 in position 88: illegal multibyte sequence

How to reproduce

Steps to Reproduce:

1. Create a SQL file containing Japanese comments.
2. Save the file as test.sql (UTF-8 encoding).
3. Execute the file with Snowflake CLI.

Actual Result and Error Traceback:
Same as shown under "Console output" above.
Expected Result:
SQL file should be read with UTF-8 encoding and executed successfully.

What I've Tried:

✅ Set PowerShell to UTF-8 (chcp 65001) - No effect
✅ Set environment variables PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 - No effect
✅ Modified PowerShell profile with UTF-8 settings - No effect
✅ Removed Japanese comments from SQL file - Workaround successful

Root Cause Analysis:

The issue occurs in statement_reader.py:233, where SecurePath.read() is called without specifying an encoding. When no encoding is given, Python's text I/O falls back to the locale encoding, which on Japanese Windows is CP932, not UTF-8.
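
The decode failure itself can be reproduced outside Snowflake CLI and on any OS by decoding a UTF-8 file as CP932 explicitly. This is only a minimal sketch: the file content is an assumption (the original test.sql is not shown in this report), and `try_read` and `demo` are hypothetical helper names.

```python
import tempfile
from pathlib import Path

# Assumed file content; the real test.sql from this report is not available.
SQL_TEXT = "-- テスト\nSELECT 1;\n"


def try_read(path: Path, encoding: str) -> str:
    """Read the file with the given encoding and describe the outcome."""
    try:
        return path.read_text(encoding=encoding)
    except UnicodeDecodeError as exc:
        return f"UnicodeDecodeError: {exc.reason}"


def demo() -> tuple[str, str]:
    """Write a UTF-8 SQL file, then read it back as CP932 and as UTF-8."""
    with tempfile.TemporaryDirectory() as tmp:
        sql_file = Path(tmp) / "test.sql"
        sql_file.write_text(SQL_TEXT, encoding="utf-8")
        return try_read(sql_file, "cp932"), try_read(sql_file, "utf-8")


if __name__ == "__main__":
    as_cp932, as_utf8 = demo()
    print("cp932:", as_cp932)
    print("utf-8:", as_utf8)
```

Reading with `encoding="cp932"` mimics what an encoding-less read does when the locale encoding is CP932; reading with `encoding="utf-8"` returns the text unchanged.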

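The environment variables PYTHONUTF8 and PYTHONIOENCODING appear to have no effect here. PYTHONIOENCODING only controls stdin/stdout/stderr, not files opened for reading, and PYTHONUTF8 does not seem to reach PyApp-bundled Python executables, as they use their own embedded Python runtime.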
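
To check which defaults a given interpreter actually uses, a small diagnostic like the following may help. It relies only on standard-library calls; `encoding_report` is a hypothetical name, and how to run it inside the PyApp-bundled runtime is not covered here.

```python
import locale
import sys


def encoding_report() -> dict[str, str]:
    """Collect the encoding-related defaults of the running interpreter."""
    return {
        # 1 when UTF-8 mode is active (e.g. PYTHONUTF8=1 was honored), else 0.
        "utf8_mode": str(sys.flags.utf8_mode),
        # Encoding used for text files when no encoding argument is passed.
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
        # May be "None" when stdout is not attached.
        "stdout_encoding": str(getattr(sys.stdout, "encoding", None)),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(f"{name}: {value}")
```

If `utf8_mode` prints 0 and `preferred_encoding` prints cp932 while PYTHONUTF8=1 is set, the variable is not being honored by that interpreter.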

Proposed Solution:

Explicitly specify UTF-8 encoding when opening SQL files in statement_reader.py:

Or add encoding parameter support to SecurePath.read_text() method.
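
A rough sketch of the first option, using only the standard library. `read_sql_text` is a hypothetical helper for illustration, not the actual code in statement_reader.py or the SecurePath API.

```python
import io
from pathlib import Path


def read_sql_text(path: Path, encoding: str = "utf-8") -> io.StringIO:
    """Hypothetical helper: decode a SQL file with an explicit encoding
    instead of relying on the locale default (CP932 on Japanese Windows)."""
    with open(path, "r", encoding=encoding) as f:
        # The result has the same shape as the io.StringIO(f.read()) value
        # seen in the traceback, so it could be passed on in the same way.
        return io.StringIO(f.read())
```

Passing `encoding="utf-8-sig"` instead would additionally tolerate a byte order mark, which some Windows editors write at the start of UTF-8 files.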

Related Documentation:

Python's open() encoding parameter
A similar issue was resolved for icacls command outputs (SnowflakeCLI_EncodingError.md in user documentation)

Workaround:
Remove non-ASCII characters from SQL files or use SnowSQL instead of Snowflake CLI.
