yuly3/mcp-snowflake

MCP Snowflake Server

A Model Context Protocol (MCP) server that connects to Snowflake databases and executes SQL queries.

Features

  • List Schemas: Retrieve a list of schemas from a specified database
  • List Tables: Retrieve a list of tables from a specified database and schema
  • List Views: Retrieve a list of views from a specified database and schema
  • Describe Table: Retrieve detailed structure information for a specified table
  • Execute Query: Execute read-only SQL queries and return results
  • Sample Table Data: Retrieve sample data from a specified table using Snowflake's SAMPLE ROW clause
  • Analyze Table Statistics: Generate comprehensive statistical analysis for table columns using Snowflake's high-performance approximation functions (supports numeric, string, date, and boolean columns)
  • Profile Semi-Structured Columns: Profile VARIANT/ARRAY/OBJECT columns with sampled flatten-based analysis

Installation

Prerequisites

  • Python 3.13 or higher
  • uv (Python package manager)
  • Access to a Snowflake account

Setup

  1. Clone the repository:
git clone <repository-url>
cd mcp-snowflake
  2. Install using uv:
uv tool install -e .

Configuration

Using Configuration File (Recommended)

  1. Copy the sample configuration file:
cp .mcp_snowflake.toml.example .mcp_snowflake.toml
  2. Edit .mcp_snowflake.toml to configure your Snowflake connection:
[snowflake]
account = "your-account.region"
user = "your-username"
password = "your-password"  # Required only when authenticator = "SNOWFLAKE"
warehouse = "your-warehouse"  # Optional
role = "your-role"  # Optional
secondary_roles = ["analyst_role", "bi_reader_role"]  # Optional: role names for USE SECONDARY ROLES
# secondary_roles = ["NONE"]  # Optional: explicitly disable secondary roles
authenticator = "SNOWFLAKE"  # "SNOWFLAKE" or "externalbrowser"
client_store_temporary_credential = true  # ID token cache for externalbrowser

[tools]
# Enable/disable specific tools (all enabled by default)
analyze_table_statistics = true  # Optional
describe_table = true  # Optional
execute_query = true  # Optional
list_schemas = true  # Optional
list_tables = true  # Optional
list_views = true  # Optional
profile_semi_structured_columns = true  # Optional
sample_table_data = true  # Optional

[execute_query]
# Default timeout_seconds when not specified by the caller (default: 30, must be <= timeout_seconds_max)
timeout_seconds_default = 30  # Optional
# Maximum value accepted by execute_query.timeout_seconds (default: 300, max: 3600)
timeout_seconds_max = 300  # Optional
# Enable compact text format for small result sets to reduce LLM token usage (default: false)
compact_format_enabled = false  # Optional
# Row count threshold: results with this many rows or fewer use compact format (default: 5)
compact_format_threshold = 5  # Optional

[analyze_table_statistics]
# Query timeout for analyze_table_statistics (default: 60)
query_timeout_seconds = 60  # Optional (max: 3600)

[profile_semi_structured_columns]
# Query timeouts for semi-structured profiling (default base: 90, path: 180)
base_query_timeout_seconds = 90  # Optional (max: 3600)
path_query_timeout_seconds = 180  # Optional (max: 3600, must be >= base_query_timeout_seconds)

snowflake.secondary_roles behavior:

  • null / omitted: Do not execute USE SECONDARY ROLES.
  • ["NONE"]: Execute USE SECONDARY ROLES NONE for each new session.
  • ["ROLE_A", "ROLE_B"]: Execute USE SECONDARY ROLES "ROLE_A", "ROLE_B" for each new session.
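The mapping from configuration value to the statement executed at session start can be sketched as follows (a simplified illustration; the helper name is hypothetical and this is not the server's actual implementation):

```python
def secondary_roles_sql(secondary_roles):
    """Map the snowflake.secondary_roles config value to the SQL
    statement run for each new session.

    Returns None when no statement should be executed.
    Hypothetical helper for illustration only.
    """
    if secondary_roles is None:
        return None  # omitted: do not touch secondary roles
    if secondary_roles == ["NONE"]:
        return "USE SECONDARY ROLES NONE"
    # Quote each role name and join them into one statement.
    quoted = ", ".join(f'"{role}"' for role in secondary_roles)
    return f"USE SECONDARY ROLES {quoted}"
```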

Using Environment Variables

Set the following environment variables:

Required

  • SNOWFLAKE__ACCOUNT: Snowflake account identifier
  • SNOWFLAKE__USER: Username

Optional

  • SNOWFLAKE__PASSWORD: Password (required when SNOWFLAKE__AUTHENTICATOR=SNOWFLAKE)
  • SNOWFLAKE__WAREHOUSE: Default warehouse
  • SNOWFLAKE__ROLE: Default role
  • SNOWFLAKE__SECONDARY_ROLES: JSON array for secondary roles (e.g. ["ROLE_A","ROLE_B"] or ["NONE"])
  • SNOWFLAKE__AUTHENTICATOR: Authentication method ("SNOWFLAKE" or "externalbrowser")
  • SNOWFLAKE__CLIENT_STORE_TEMPORARY_CREDENTIAL: Enable ID token cache for browser SSO ("true" or "false", default: "true")

Tool Configuration (Optional)

  • TOOLS__ANALYZE_TABLE_STATISTICS: Enable/disable analyze_table_statistics tool ("true" or "false", default: "true")
  • TOOLS__DESCRIBE_TABLE: Enable/disable describe_table tool ("true" or "false", default: "true")
  • TOOLS__EXECUTE_QUERY: Enable/disable execute_query tool ("true" or "false", default: "true")
  • TOOLS__LIST_SCHEMAS: Enable/disable list_schemas tool ("true" or "false", default: "true")
  • TOOLS__LIST_TABLES: Enable/disable list_tables tool ("true" or "false", default: "true")
  • TOOLS__LIST_VIEWS: Enable/disable list_views tool ("true" or "false", default: "true")
  • TOOLS__PROFILE_SEMI_STRUCTURED_COLUMNS: Enable/disable profile_semi_structured_columns tool ("true" or "false", default: "true")
  • TOOLS__SAMPLE_TABLE_DATA: Enable/disable sample_table_data tool ("true" or "false", default: "true")
  • EXECUTE_QUERY__TIMEOUT_SECONDS_DEFAULT: Default timeout_seconds for execute_query when not specified by the caller (default: 30, must be <= timeout_seconds_max)
  • EXECUTE_QUERY__TIMEOUT_SECONDS_MAX: Maximum allowed timeout_seconds for execute_query (default: 300, max: 3600). Values above 3600 fail server startup.
  • EXECUTE_QUERY__COMPACT_FORMAT_ENABLED: Enable compact text format for small result sets ("true" or "false", default: "false")
  • EXECUTE_QUERY__COMPACT_FORMAT_THRESHOLD: Row count threshold for compact format (integer, default: 5)
  • ANALYZE_TABLE_STATISTICS__QUERY_TIMEOUT_SECONDS: Query timeout for analyze_table_statistics (default: 60, max: 3600)
  • PROFILE_SEMI_STRUCTURED_COLUMNS__BASE_QUERY_TIMEOUT_SECONDS: Base query timeout for profile_semi_structured_columns (default: 90, max: 3600)
  • PROFILE_SEMI_STRUCTURED_COLUMNS__PATH_QUERY_TIMEOUT_SECONDS: Path query timeout for profile_semi_structured_columns (default: 180, max: 3600, must be >= base_query_timeout_seconds)

Example:

export SNOWFLAKE__ACCOUNT="your-account.region"
export SNOWFLAKE__USER="your-username"
export SNOWFLAKE__PASSWORD="your-password"
export SNOWFLAKE__WAREHOUSE="your-warehouse"
export SNOWFLAKE__ROLE="your-role"
export SNOWFLAKE__SECONDARY_ROLES='["ROLE_A","ROLE_B"]'
export SNOWFLAKE__AUTHENTICATOR="SNOWFLAKE"
export SNOWFLAKE__CLIENT_STORE_TEMPORARY_CREDENTIAL="true"

For PowerShell (Windows):

$env:SNOWFLAKE__ACCOUNT="your-account.region"
$env:SNOWFLAKE__USER="your-username"
$env:SNOWFLAKE__PASSWORD="your-password"
$env:SNOWFLAKE__WAREHOUSE="your-warehouse"
$env:SNOWFLAKE__ROLE="your-role"
$env:SNOWFLAKE__SECONDARY_ROLES='["ROLE_A","ROLE_B"]'
$env:SNOWFLAKE__AUTHENTICATOR="SNOWFLAKE"
$env:SNOWFLAKE__CLIENT_STORE_TEMPORARY_CREDENTIAL="true"

# Tool configuration (optional)
$env:TOOLS__EXECUTE_QUERY="false"  # Disable execute_query tool
$env:TOOLS__ANALYZE_TABLE_STATISTICS="false"  # Disable analyze_table_statistics tool

Note

Environment variables are separated by double underscores (__).
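The double-underscore convention groups variables into sections, roughly as in this sketch (illustrative only; the server's actual settings loader may differ):

```python
import os

def load_env_config(environ=os.environ):
    """Group NAMESPACE__KEY environment variables into nested dicts,
    e.g. SNOWFLAKE__ACCOUNT -> {"snowflake": {"account": ...}}.

    Simplified illustration of the double-underscore convention.
    """
    config = {}
    for name, value in environ.items():
        if "__" not in name:
            continue  # not a namespaced setting
        section, _, key = name.partition("__")
        config.setdefault(section.lower(), {})[key.lower()] = value
    return config
```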

Browser-based SSO (externalbrowser) with ID token cache

To use browser-based SSO, set authenticator = "externalbrowser" and keep client_store_temporary_credential = true (default). The connector stores a temporary ID token in secure local storage and can reuse it for subsequent connections.

[snowflake]
account = "your-account.region"
user = "your-username"
warehouse = "your-warehouse"
role = "your-role"
secondary_roles = ["analyst_role"]
authenticator = "externalbrowser"
client_store_temporary_credential = true

Notes:

  • The first connection opens a browser for SSO sign-in.
  • Your Snowflake account must have ALLOW_ID_TOKEN enabled.
  • Install this package with snowflake-connector-python[secure-local-storage] support.

Usage

Start the MCP server:

uvx mcp-snowflake --config {your-config-path}

Available Tools

Tool List

list_schemas

Retrieve a list of schemas from a specified database.

Parameters:

  • database (string, required): Database name to retrieve schemas from

Example:

{
  "name": "list_schemas",
  "arguments": {
    "database": "MY_DATABASE"
  }
}

list_tables

Retrieve a list of tables from a specified database and schema.

Parameters:

  • database (string, required): Database name to retrieve tables from
  • schema (string, required): Schema name to retrieve tables from
  • filter (object, optional): Name filter
    • type (string, required): Filter type (contains)
    • value (string, required): Substring to match in table names (case-insensitive)
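The contains filter described above amounts to a case-insensitive substring match, as in this sketch (function name is hypothetical):

```python
def matches_contains(name, value):
    """Case-insensitive substring match used by the `contains` filter
    (illustrative sketch, not the server's implementation)."""
    return value.lower() in name.lower()
```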

Example:

{
  "name": "list_tables",
  "arguments": {
    "database": "MY_DATABASE",
    "schema": "PUBLIC",
    "filter": {
      "type": "contains",
      "value": "order"
    }
  }
}

list_views

Retrieve a list of views from a specified database and schema.

Parameters:

  • database (string, required): Database name to retrieve views from
  • schema (string, required): Schema name to retrieve views from
  • filter (object, optional): Name filter
    • type (string, required): Filter type (contains)
    • value (string, required): Substring to match in view names (case-insensitive)

Example:

{
  "name": "list_views",
  "arguments": {
    "database": "MY_DATABASE",
    "schema": "PUBLIC",
    "filter": {
      "type": "contains",
      "value": "order"
    }
  }
}

describe_table

Retrieve detailed structure information (columns, data types, etc.) for a specified table.

Parameters:

  • database (string, required): Database name containing the table
  • schema (string, required): Schema name containing the table
  • table (string, required): Name of the table to describe

Example:

{
  "name": "describe_table",
  "arguments": {
    "database": "MY_DATABASE",
    "schema": "PUBLIC",
    "table": "CUSTOMERS"
  }
}

Response Format: The describe_table tool returns a structured JSON format:

{
  "table_info": {
    "database": "MY_DATABASE",
    "schema": "PUBLIC",
    "name": "CUSTOMERS",
    "column_count": 4,
    "columns": [
      {
        "name": "ID",
        "data_type": "NUMBER(38,0)",
        "nullable": false,
        "default_value": null,
        "comment": "Primary key",
        "ordinal_position": 1
      }
    ]
  }
}

execute_query

Execute read-only SQL queries and return structured results. Only read operations such as SELECT, SHOW, DESCRIBE, and EXPLAIN are allowed.

Parameters:

  • sql (string, required): SQL query to execute (read operations only)
  • timeout_seconds (integer, optional): Query timeout in seconds (default: 30, max: execute_query.timeout_seconds_max)
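The interaction between the caller's timeout_seconds and the configured default and maximum can be sketched as follows (illustrative only; the function name is hypothetical and the actual error handling may differ):

```python
def resolve_timeout(requested, default=30, maximum=300):
    """Resolve the effective execute_query timeout.

    Falls back to the configured timeout_seconds_default when the
    caller omits timeout_seconds; requests above
    timeout_seconds_max are rejected.
    """
    if requested is None:
        return default
    if requested > maximum:
        raise ValueError(f"timeout_seconds must be <= {maximum}")
    return requested
```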

Example:

{
  "name": "execute_query",
  "arguments": {
    "sql": "SELECT * FROM customers LIMIT 10",
    "timeout_seconds": 60
  }
}

Response Format:

The response format depends on the compact_format_enabled configuration.

JSON format (default, or when row count exceeds compact_format_threshold):

{
  "query_result": {
    "execution_time_ms": 150,
    "row_count": 2,
    "columns": ["id", "name", "email"],
    "rows": [
      {"id": 1, "name": "John", "email": "john@example.com"},
      {"id": 2, "name": "Jane", "email": "jane@example.com"}
    ],
    "warnings": []
  }
}

Compact format (when compact_format_enabled = true and row count <= compact_format_threshold):

execution_time_ms: 150
row_count: 2

row1:
id: 1
name: "John"
email: "john@example.com"

row2:
id: 2
name: "Jane"
email: "jane@example.com"

The compact format reduces LLM token usage for small result sets (e.g., aggregations, statistics). Semi-structured values (objects/arrays) are rendered as inline JSON even in compact format. String values are always JSON-escaped (e.g., "john@example.com", "line1\nline2") to keep the compact row structure parseable.
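The compact rendering shown above can be sketched as follows (an illustrative reconstruction of the format, not the server's actual renderer):

```python
import json

def render_compact(columns, rows, execution_time_ms):
    """Render a small result set in the compact text format.

    Strings and semi-structured values are JSON-encoded so every
    "column: value" line stays parseable.
    """
    lines = [
        f"execution_time_ms: {execution_time_ms}",
        f"row_count: {len(rows)}",
    ]
    for i, row in enumerate(rows, start=1):
        lines.append("")  # blank line separates rows
        lines.append(f"row{i}:")
        for col in columns:
            value = row[col]
            if isinstance(value, (str, dict, list)):
                value = json.dumps(value)  # JSON-escape strings, inline JSON for objects/arrays
            lines.append(f"{col}: {value}")
    return "\n".join(lines)
```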

sample_table_data

Retrieve sample data from a specified table efficiently using Snowflake's SAMPLE ROW clause.

Parameters:

  • database (string, required): Database name containing the table
  • schema (string, required): Schema name containing the table
  • table (string, required): Name of the table to sample
  • sample_size (integer, optional): Number of sample rows to retrieve (default: 10, minimum: 1)
  • columns (array of strings, optional): List of column names to retrieve (if not specified, all columns will be retrieved)

Example:

{
  "name": "sample_table_data",
  "arguments": {
    "database": "MY_DATABASE",
    "schema": "PUBLIC",
    "table": "ORDERS",
    "sample_size": 5,
    "columns": ["order_id", "customer_id", "total"]
  }
}

analyze_table_statistics

Analyze table statistics using Snowflake's high-performance approximation functions (APPROX_PERCENTILE, APPROX_TOP_K, APPROX_COUNT_DISTINCT) to efficiently retrieve statistical information for numeric, string, date, and boolean columns.

Parameters:

  • database (string, required): Database name containing the table
  • schema (string, required): Schema name containing the table
  • table (string, required): Name of the table to analyze
  • columns (array of strings, optional): List of column names to analyze (if not specified, all columns will be analyzed)
  • top_k_limit (integer, optional): Number of top values to retrieve for string columns (default: 10, max: 100)
  • include_null_empty_profile (boolean, optional): Include per-column quality profile (default: true)
  • include_blank_string_profile (boolean, optional): Include TRIM-based blank string profile for STRING columns (default: false)

Example:

{
  "name": "analyze_table_statistics",
  "arguments": {
    "database": "MY_DATABASE",
    "schema": "PUBLIC",
    "table": "SALES_DATA",
    "columns": ["amount", "region", "order_date"],
    "top_k_limit": 5,
    "include_null_empty_profile": true,
    "include_blank_string_profile": false
  }
}

Response Format: Returns comprehensive statistics tailored to each column type:

  • Numeric columns: count, min, max, avg, percentiles (25th, 50th, 75th), distinct count
  • String columns: count, min/max length, distinct count, top K most frequent values
  • Date columns: count, min/max dates, date range in days, distinct count
  • Boolean columns: count, true/false counts and percentages (both NULL-inclusive and NULL-exclusive)
  • Quality profile (quality_profile): optional per-column quality metrics
    • All types: null_count, null_ratio (null_count / total_rows)
    • String only: empty_string_count, empty_string_ratio (empty_string_count / non_null_rows)
    • String only (optional): blank_string_count, blank_string_ratio (blank_string_count / non_null_rows)
    • All ratios return 0.0 when denominator is 0
  • Statistics metadata (statistics_metadata): optional execution semantics
    • quality_profile_counting_mode: exact
    • distribution_metrics_mode: approximate
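The quality-profile ratio rules above, including the zero-denominator behavior, can be sketched as (hypothetical helper for illustration):

```python
def quality_ratios(total_rows, non_null_rows, null_count, empty_string_count):
    """Compute quality-profile ratios with the zero-denominator rule
    described above: a ratio is 0.0 when its denominator is 0.
    """
    null_ratio = null_count / total_rows if total_rows else 0.0
    empty_ratio = empty_string_count / non_null_rows if non_null_rows else 0.0
    return {"null_ratio": null_ratio, "empty_string_ratio": empty_ratio}
```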

Timeout Configuration:

  • analyze_table_statistics.query_timeout_seconds (default: 60)

profile_semi_structured_columns

Profile semi-structured columns (VARIANT, ARRAY, OBJECT) using sampled recursive flatten analysis.

Parameters:

  • database (string, required): Database name containing the table
  • schema (string, required): Schema name containing the table
  • table (string, required): Name of the table to profile
  • columns (array of strings, optional): Target columns (empty means auto-select all semi-structured columns)
  • sample_rows (integer, optional): Sample row count (default: 10000, min: 1, max: 200000)
  • max_depth (integer, optional): Maximum recursive path depth (default: 4, min: 1, max: 20)
  • top_k_limit (integer, optional): Top-k limit for frequent values and keys (default: 20, min: 1, max: 100)
  • include_path_stats (boolean, optional): Include path-level profiling (default: true)
  • include_value_samples (boolean, optional): Include path-level top_values samples (default: false)

Example:

{
  "name": "profile_semi_structured_columns",
  "arguments": {
    "database": "MY_DATABASE",
    "schema": "PUBLIC",
    "table": "EVENT_LOGS",
    "sample_rows": 10000,
    "max_depth": 4,
    "top_k_limit": 20,
    "include_path_stats": true
  }
}

Response Format:

  • profile_info: table metadata, row counts, analyzed columns
  • column_profiles: null ratio, top-level type distribution, array length stats, top-level keys top-k
  • path_profiles: path-wise type distribution, approx distinct count, null ratio, optional top values
  • warnings: sampling/depth-limit and approximation notes

Timeout Configuration:

  • profile_semi_structured_columns.base_query_timeout_seconds (default: 90)
  • profile_semi_structured_columns.path_query_timeout_seconds (default: 180)

Development

Development Environment Setup

uv sync --all-groups --all-packages

Code Formatting

uv run ruff format .
uv run ruff check --fix .

Code Testing

uv run pytest --doctest-modules .

Troubleshooting

Connection Errors

  • Verify that configuration file or environment variables are correctly set
  • Check that Snowflake account and username are correct
  • If authenticator=SNOWFLAKE, verify password is set correctly
  • Verify network connectivity

Permission Errors

  • Ensure the specified user has permission to access the database
  • Set the ROLE if necessary

Configuration Priority

Settings are loaded in the following order (later settings take precedence):

  1. Configuration file (.mcp_snowflake.toml)
  2. Environment variables
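For a single flat section of settings, this precedence boils down to a later-wins merge, sketched here (illustrative only):

```python
def merge_settings(file_settings, env_settings):
    """Merge one section of settings: environment variables (loaded
    later) take precedence over the configuration file.
    """
    return {**file_settings, **env_settings}
```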

License

MIT License
