Trino query runner: get_schema() ignores schema configuration, fetches all schemas #7680
Open
Description
Issue Summary
When configuring a Trino data source with both catalog and schema specified, the schema browser still shows tables from all schemas in the catalog, not just the configured one.
This is because get_schema() only uses the catalog configuration but ignores schema:
# current code in redash/query_runner/trino.py
query = f"""
SELECT table_schema, table_name, column_name, data_type
FROM {catalog}.information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
"""
Problem
For large catalogs (e.g., Iceberg with Glue metastore), querying information_schema.columns without a schema filter causes:
- Timeout/partial results: Glue API latency makes the full query exceed the trino Python client's default request_timeout (30s), returning incomplete results
- Unusable schema browser: even when results return, showing all schemas makes the browser noisy when users only need one schema
Related: #6059
Proposed Fix
When schema is configured, add it as a WHERE condition:
def get_schema(self, get_stats=False):
    if self.configuration.get("catalog"):
        catalogs = [self.configuration.get("catalog")]
    else:
        catalogs = self._get_catalogs()
    schema_filter = self.configuration.get("schema")
    schema = {}
    for catalog in catalogs:
        query = f"""
            SELECT table_schema, table_name, column_name, data_type
            FROM {catalog}.information_schema.columns
            WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
        """
        if schema_filter:
            query += f" AND table_schema = '{schema_filter}'"
        results, error = self.run_query(query, None)
        ...
Environment
- Redash 25.8.0 (also confirmed in latest v26.3.0 source)
- Trino with Iceberg connector + AWS Glue catalog
- 15 schemas, ~105 tables, ~4894 columns
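One detail the proposed fix might also handle: the configured schema is interpolated directly into the SQL string, so a value containing a single quote would break the query. A minimal sketch of the query construction with the quote doubled, assuming a standalone helper (`build_columns_query` is a hypothetical name for illustration, not an existing Redash function):

```python
def build_columns_query(catalog, schema_filter=None):
    """Build the information_schema.columns query for one catalog.

    Hypothetical helper, not part of Redash; it mirrors the query
    in the proposed fix and adds quote escaping for the schema name.
    """
    query = (
        "SELECT table_schema, table_name, column_name, data_type "
        f"FROM {catalog}.information_schema.columns "
        "WHERE table_schema NOT IN ('pg_catalog', 'information_schema')"
    )
    if schema_filter:
        # Double any single quote so the value stays inside the SQL string literal.
        escaped = schema_filter.replace("'", "''")
        query += f" AND table_schema = '{escaped}'"
    return query
```

Without a configured schema the helper returns the same unfiltered query as the current code, so existing data sources that set only a catalog would keep their behavior.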