Migrate Databricks from sqlalchemy-databricks to databricks-sqlalchemy#26896

Open
ulixius9 wants to merge 2 commits into main from databricks_sqa_update

Conversation

Member

@ulixius9 ulixius9 commented Mar 31, 2026

Summary

  • Migrate Databricks connectors (Databricks, Unity Catalog, Databricks Pipeline) from unmaintained sqlalchemy-databricks==0.2.0 (pyhive-based) to official databricks-sqlalchemy~=2.0.9 with native SQLAlchemy 2.0 support
  • Update connection URL scheme from databricks+connector to databricks across JSON schemas, generated models, frontend types, and Flyway migrations
  • Replace pyhive HiveCompiler references with SQLCompiler/DatabricksStatementCompiler in profiler interface
  • Pass catalog as URL query parameter (?catalog=) so the new dialect's internal methods (get_pk_constraint, _describe_table_extended) resolve the catalog correctly
  • Replace Row.values() with tuple(result) for SQLAlchemy 2.0 Row compatibility in table/schema comment extraction
  • Fix Column._set_parent() to pass required all_names and allow_replacements kwargs for SQLAlchemy 2.0
  • Fix USE CATALOG :catalog parameterized DDL → literal USE CATALOG for NATIVE paramstyle compatibility
  • Suppress upstream _user_agent_entry deprecation warning from databricks-sqlalchemy
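
The Row compatibility change above can be sketched as follows. This is a minimal illustration, not the PR's actual code: FakeRow is a hypothetical stand-in for sqlalchemy.engine.Row, which in SQLAlchemy 2.0 behaves like a named tuple and no longer exposes the 1.x .values() method.

```python
from typing import NamedTuple


class FakeRow(NamedTuple):
    """Hypothetical stand-in for a SQLAlchemy 2.0 Row (tuple-like)."""

    col_name: str
    comment: str


def extract_comment(row) -> str:
    # 1.x-only code would call row.values()[-1]; tuple(row) works for any
    # tuple-like Row, so it is the portable way to read all cells.
    return tuple(row)[-1]


assert extract_comment(FakeRow("id", "primary key column")) == "primary key column"
```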

Test plan

  • Unit tests for connection URL generation (Databricks, Unity Catalog, Pipeline) with and without catalog
  • Unit tests for _type_map completeness and complex type registration
  • Unit tests for DatabricksBaseTableParameter default scheme
  • Unit tests for profiler visit_column/visit_table with new compiler class
  • E2E metadata ingestion validated against live Databricks workspace (all table types including struct/array/map)
  • E2E profiler validated against live Databricks workspace
  • Verify Hive connector unaffected (pyhive remains in hive extras)
  • Verify Unity Catalog metadata ingestion
  • Verify Databricks Pipeline connector ingests jobs
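
A unit test for the URL generation described above could look roughly like this. Scheme and StubConnection are hypothetical stand-ins for the generated pydantic connection models; the URL-building logic mirrors what the PR summary describes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional
from urllib.parse import parse_qs, urlsplit


class Scheme(Enum):
    databricks = "databricks"


@dataclass
class StubConnection:
    """Hypothetical stand-in for the generated connection model."""

    scheme: Scheme
    hostPort: str
    catalog: Optional[str] = None


def get_connection_url(connection) -> str:
    # Mirrors the URL-building logic described in the PR summary:
    # append the catalog as a ?catalog= query parameter when present.
    url = f"{connection.scheme.value}://{connection.hostPort}"
    if connection.catalog:
        url = f"{url}?catalog={connection.catalog}"
    return url


# Without catalog: bare scheme://hostPort
assert get_connection_url(StubConnection(Scheme.databricks, "host:443")) == "databricks://host:443"

# With catalog: the value must round-trip through query-string parsing
url = get_connection_url(StubConnection(Scheme.databricks, "host:443", "main"))
assert parse_qs(urlsplit(url).query) == {"catalog": ["main"]}
```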

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 31, 2026 18:35
@ulixius9 ulixius9 requested review from a team, akash-jain-10, harshach and tutte as code owners March 31, 2026 18:35
@github-actions github-actions bot added Ingestion safe to test Add this label to run secure Github workflows on PRs labels Mar 31, 2026
Comment on lines +140 to +143
    url = f"{connection.scheme.value}://{connection.hostPort}"
    if connection.catalog:
        url = f"{url}?catalog={connection.catalog}"
    return url

⚠️ Edge Case: Catalog value not URL-encoded in connection URL

The get_connection_url functions in both Databricks and Unity Catalog connection modules directly interpolate connection.catalog into the URL query string without URL-encoding. If a catalog name contains special characters (&, =, #, %, spaces), this will produce a malformed URL or cause SQLAlchemy to misparse the query parameters.

The codebase already uses quote_plus for URL parameters elsewhere (e.g., get_connection_url_common in builders.py).

Suggested fix:

from urllib.parse import quote_plus

def get_connection_url(connection) -> str:
    url = f"{connection.scheme.value}://{connection.hostPort}"
    if connection.catalog:
        url = f"{url}?catalog={quote_plus(connection.catalog)}"
    return url


Comment on lines +61 to +63
# Suppress noisy deprecation warning from databricks-sqlalchemy using
# the deprecated '_user_agent_entry' parameter internally
logging.getLogger("databricks.sql.session").setLevel(logging.ERROR)

💡 Quality: Module-level log suppression is too broad

Setting logging.getLogger('databricks.sql.session').setLevel(logging.ERROR) at module level suppresses all WARNING and INFO messages from the Databricks SQL session logger globally and permanently, not just the _user_agent_entry deprecation warning. This could hide legitimate warnings about connection issues, timeouts, or other important diagnostic information from the driver.

A more targeted approach would use a warnings.filterwarnings call to suppress only the specific deprecation warning.

Suggested fix:

import warnings

warnings.filterwarnings(
    "ignore",
    message=".*_user_agent_entry.*",
    category=DeprecationWarning,
)


Contributor

Copilot AI left a comment


Pull request overview

Migrates OpenMetadata’s Databricks-related connectors (Databricks, Unity Catalog, Databricks Pipeline) from the unmaintained sqlalchemy-databricks dialect to the official databricks-sqlalchemy dialect, updating connection URL scheme semantics and adapting profiler/ingestion logic for SQLAlchemy 2.0 compatibility.

Changes:

  • Updated Databricks/Unity Catalog connection scheme from databricks+connector to databricks across JSON schemas, ingestion code, unit tests, and DB migrations.
  • Adjusted connection URL generation to pass catalog via URL query parameter, and updated profiler compiler integration away from PyHive.
  • Updated ingestion internals for SQLAlchemy 2.0 compatibility (Row handling, Column parenting) and removed legacy dialect preinstalls from CI images/actions.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
openmetadata-spec/src/main/resources/json/schema/entity/services/connections/database/unityCatalogConnection.json Updates Unity Catalog scheme enum/default to databricks.
openmetadata-spec/src/main/resources/json/schema/entity/services/connections/database/databricksConnection.json Updates Databricks scheme enum/default to databricks.
openmetadata-spec/src/main/resources/json/schema/entity/applications/configuration/external/metadataExporterConnectors/databricksConnection.json Aligns exporter connector schema scheme enum/default to databricks.
ingestion/tests/unit/topology/database/test_databricks.py Updates expected scheme/URL in Databricks unit tests.
ingestion/tests/unit/topology/database/test_databricks_migration.py Adds unit coverage for scheme enum, _type_map, default scheme, and pipeline URL scheme.
ingestion/tests/unit/test_source_connection.py Updates Databricks URL expectations; adds Unity Catalog and pipeline URL coverage including catalog query param.
ingestion/tests/unit/observability/profiler/sqlalchemy/databricks/test_visit_column.py Switches compiler mocking from PyHive HiveCompiler to SQLAlchemy SQLCompiler.
ingestion/src/metadata/profiler/interface/sqlalchemy/databricks/profiler_interface.py Updates compiler integration to work with databricks-sqlalchemy statement compiler and SQLAlchemy 2.0.
ingestion/src/metadata/mixins/sqalchemy/sqa_mixin.py Changes Databricks/Unity Catalog catalog selection DDL execution.
ingestion/src/metadata/ingestion/source/pipeline/databrickspipeline/connection.py Updates pipeline connection URL scheme to databricks and adds log suppression.
ingestion/src/metadata/ingestion/source/database/unitycatalog/connection.py Appends catalog as URL query param and adds log suppression.
ingestion/src/metadata/ingestion/source/database/databricks/metadata.py Replaces PyHive _type_map; updates dialect import; fixes SQLAlchemy 2.0 Row iteration for comments/descriptions.
ingestion/src/metadata/ingestion/source/database/databricks/connection.py Appends catalog as URL query param and adds log suppression.
ingestion/src/metadata/ingestion/source/database/common/data_diff/databricks_base.py Updates default scheme fallback from databricks+connector to databricks.
ingestion/setup.py Adds databricks-sqlalchemy dependency; updates connector versions; removes PyHive from databricks extra.
ingestion/operators/docker/Dockerfile.ci Removes preinstall of legacy sqlalchemy-databricks dialect.
ingestion/Dockerfile.ci Removes preinstall of legacy sqlalchemy-databricks dialect.
bootstrap/sql/migrations/native/1.13.0/postgres/schemaChanges.sql Migrates stored Databricks/UnityCatalog scheme values to databricks in Postgres.
bootstrap/sql/migrations/native/1.13.0/mysql/schemaChanges.sql Migrates stored Databricks/UnityCatalog scheme values to databricks in MySQL.
.github/actions/setup-openmetadata-test-environment/action.yml Removes preinstall of legacy sqlalchemy-databricks in test environment setup.

Comment on lines 92 to 98

     if isinstance(
         self.service_connection_config,
         (UnityCatalogConnection, DatabricksConnection),
     ):
-        session.execute(
-            text("USE CATALOG :catalog"),
-            {"catalog": self.service_connection_config.catalog},
-        ).first()
+        catalog = self.service_connection_config.catalog
+        session.execute(text(f"USE CATALOG `{catalog}`"))


Copilot AI Mar 31, 2026


set_catalog now always executes USE CATALOG even when catalog is unset, which will run `USE CATALOG None` for connections without a catalog. Also, interpolating catalog directly into SQL is unsafe (identifier quoting / injection) and inconsistent with other Databricks/Unity Catalog codepaths that use the dialect identifier preparer to quote identifiers safely. Consider (a) guarding on a truthy `catalog` and (b) quoting via `session.bind.dialect.identifier_preparer.quote(...)` (or equivalent) and escaping backticks rather than direct f-string interpolation.
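
A minimal sketch of the guarded, escaped rendering this comment suggests. render_use_catalog is a hypothetical helper for illustration only; production code should still prefer the dialect's identifier preparer. It assumes the Databricks/Spark SQL convention that backticks inside a backtick-quoted identifier are escaped by doubling.

```python
from typing import Optional


def render_use_catalog(catalog: Optional[str]) -> Optional[str]:
    # Guard: emit no statement at all when no catalog is configured,
    # instead of producing the invalid `USE CATALOG None`.
    if not catalog:
        return None
    # Escape embedded backticks by doubling them, then quote the identifier.
    escaped = catalog.replace("`", "``")
    return f"USE CATALOG `{escaped}`"


assert render_use_catalog(None) is None
assert render_use_catalog("main") == "USE CATALOG `main`"
assert render_use_catalog("odd`name") == "USE CATALOG `odd``name`"
```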

Comment on lines 139 to +143

     def get_connection_url(connection: DatabricksConnection) -> str:
-        return f"{connection.scheme.value}://{connection.hostPort}"
+        url = f"{connection.scheme.value}://{connection.hostPort}"
+        if connection.catalog:
+            url = f"{url}?catalog={connection.catalog}"
+        return url

Copilot AI Mar 31, 2026


Building the connection URL with ?catalog={connection.catalog} does not URL-encode the catalog value. Catalog names containing spaces or reserved URL characters will produce an invalid URL and can break SQLAlchemy parsing. Consider using urllib.parse.urlencode/quote when appending query parameters.

Comment on lines 71 to 75

    def get_connection_url(connection: UnityCatalogConnection) -> str:
        url = f"{connection.scheme.value}://{connection.hostPort}"
        if connection.catalog:
            url = f"{url}?catalog={connection.catalog}"
        return url

Copilot AI Mar 31, 2026


Building the connection URL with ?catalog={connection.catalog} does not URL-encode the catalog value. Catalog names containing spaces or reserved URL characters will produce an invalid URL and can break SQLAlchemy parsing. Consider using urllib.parse.urlencode/quote when appending query parameters.

Comment on lines +61 to +63
# Suppress noisy deprecation warning from databricks-sqlalchemy using
# the deprecated '_user_agent_entry' parameter internally
logging.getLogger("databricks.sql.session").setLevel(logging.ERROR)

Copilot AI Mar 31, 2026


Setting databricks.sql.session logger level at import time is a global side effect (affects all callers) and may hide useful INFO/WARN logs for debugging Databricks connectivity issues. If the goal is to suppress a specific deprecation warning, prefer filtering the specific warning/message (e.g., warnings.filterwarnings) or applying a targeted log filter closer to connection initialization rather than changing the logger’s level module-wide.

Comment on lines +66 to +68
# Suppress noisy deprecation warning from databricks-sqlalchemy using
# the deprecated '_user_agent_entry' parameter internally
logging.getLogger("databricks.sql.session").setLevel(logging.ERROR)

Copilot AI Mar 31, 2026


Setting databricks.sql.session logger level at import time is a global side effect (affects all callers) and may hide useful INFO/WARN logs for debugging Databricks connectivity issues. If the goal is to suppress a specific deprecation warning, prefer filtering the specific warning/message (e.g., warnings.filterwarnings) or applying a targeted log filter closer to connection initialization rather than changing the logger’s level module-wide.

Comment on lines +38 to +40
# Suppress noisy deprecation warning from databricks-sqlalchemy using
# the deprecated '_user_agent_entry' parameter internally
logging.getLogger("databricks.sql.session").setLevel(logging.ERROR)

Copilot AI Mar 31, 2026


Setting databricks.sql.session logger level at import time is a global side effect (affects all callers) and may hide useful INFO/WARN logs for debugging Databricks connectivity issues. If the goal is to suppress a specific deprecation warning, prefer filtering the specific warning/message (e.g., warnings.filterwarnings) or applying a targeted log filter closer to connection initialization rather than changing the logger’s level module-wide.

Comment on lines +103 to 111

    from databricks.sqlalchemy._ddl import DatabricksStatementCompiler

    DatabricksStatementCompiler.visit_column = (
        DatabricksProfilerInterface.visit_column
    )
    DatabricksStatementCompiler.visit_table = (
        DatabricksProfilerInterface.visit_table
    )


Copilot AI Mar 31, 2026


Importing DatabricksStatementCompiler from databricks.sqlalchemy._ddl relies on a private module (_ddl). This is fragile across databricks-sqlalchemy upgrades and can break at runtime if internals move. Consider guarding the import (try/except) and/or retrieving the statement compiler class from the dialect/engine in a supported way if available.

Suggested change

-    from databricks.sqlalchemy._ddl import DatabricksStatementCompiler
-
-    DatabricksStatementCompiler.visit_column = (
-        DatabricksProfilerInterface.visit_column
-    )
-    DatabricksStatementCompiler.visit_table = (
-        DatabricksProfilerInterface.visit_table
-    )
+    # Override the Databricks statement compiler's visit methods to handle
+    # struct columns and table names more robustly. Instead of importing
+    # DatabricksStatementCompiler from a private module, retrieve the
+    # active statement compiler class from the dialect in a supported way.
+    try:
+        bind = getattr(self.session, "bind", None)
+        dialect = getattr(bind, "dialect", None)
+        compiler_cls = getattr(dialect, "statement_compiler", None)
+        if compiler_cls is not None:
+            compiler_cls.visit_column = DatabricksProfilerInterface.visit_column
+            compiler_cls.visit_table = DatabricksProfilerInterface.visit_table
+        else:
+            logger.debug(
+                "DatabricksProfilerInterface: dialect has no statement_compiler; "
+                "skipping compiler monkey-patching."
+            )
+    except Exception as exc:  # Defensive: do not break initialization
+        logger.debug(
+            "DatabricksProfilerInterface: failed to patch statement compiler: %r",
+            exc,
+        )


harshach
harshach previously approved these changes Mar 31, 2026
@github-actions
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

@github-actions github-actions bot requested a review from a team as a code owner March 31, 2026 18:41
@gitar-bot

gitar-bot bot commented Mar 31, 2026

Code Review ⚠️ Changes requested 0 resolved / 2 findings

Migrates Databricks from sqlalchemy-databricks to databricks-sqlalchemy but the catalog value is not URL-encoded in the connection URL, which will break connections with special characters. Additionally, module-level log suppression is configured too broadly.

⚠️ Edge Case: Catalog value not URL-encoded in connection URL

📄 ingestion/src/metadata/ingestion/source/database/databricks/connection.py:140-143 📄 ingestion/src/metadata/ingestion/source/database/unitycatalog/connection.py:71-75

The get_connection_url functions in both Databricks and Unity Catalog connection modules directly interpolate connection.catalog into the URL query string without URL-encoding. If a catalog name contains special characters (&, =, #, %, spaces), this will produce a malformed URL or cause SQLAlchemy to misparse the query parameters.

The codebase already uses quote_plus for URL parameters elsewhere (e.g., get_connection_url_common in builders.py).

Suggested fix
from urllib.parse import quote_plus

def get_connection_url(connection) -> str:
    url = f"{connection.scheme.value}://{connection.hostPort}"
    if connection.catalog:
        url = f"{url}?catalog={quote_plus(connection.catalog)}"
    return url
💡 Quality: Module-level log suppression is too broad

📄 ingestion/src/metadata/ingestion/source/database/databricks/connection.py:61-63

Setting logging.getLogger('databricks.sql.session').setLevel(logging.ERROR) at module level suppresses all WARNING and INFO messages from the Databricks SQL session logger globally and permanently, not just the _user_agent_entry deprecation warning. This could hide legitimate warnings about connection issues, timeouts, or other important diagnostic information from the driver.

A more targeted approach would use a warnings.filterwarnings call to suppress only the specific deprecation warning.

Suggested fix
import warnings

warnings.filterwarnings(
    "ignore",
    message=".*_user_agent_entry.*",
    category=DeprecationWarning,
)


@github-actions
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Contributor

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (37)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http CVE-2026-33870 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.10.Final
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 CVE-2026-33871 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.11.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.spark:spark-core_2.12 CVE-2025-54920 🚨 HIGH 3.5.6 3.5.7
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (13)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2026-26929 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-28779 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-30911 🚨 HIGH 3.1.7 3.1.8
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
pyOpenSSL CVE-2026-27459 🚨 HIGH 24.1.0 26.0.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

@github-actions
Contributor

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (37)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.airlift:aircompressor CVE-2025-67721 🚨 HIGH 0.27 2.0.3
io.netty:netty-codec-http CVE-2026-33870 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.10.Final
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 CVE-2026-33871 🚨 HIGH 4.1.96.Final 4.1.132.Final, 4.2.11.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.spark:spark-core_2.12 CVE-2025-54920 🚨 HIGH 3.5.6 3.5.7
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (24)

Package Vulnerability ID Severity Installed Version Fixed Version
Authlib CVE-2026-27962 🔥 CRITICAL 1.6.6 1.6.9
Authlib CVE-2026-28490 🚨 HIGH 1.6.6 1.6.9
Authlib CVE-2026-28498 🚨 HIGH 1.6.6 1.6.9
Authlib CVE-2026-28802 🚨 HIGH 1.6.6 1.6.7
PyJWT CVE-2026-32597 🚨 HIGH 2.11.0 2.12.0
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
apache-airflow CVE-2026-26929 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-28779 🚨 HIGH 3.1.7 3.1.8
apache-airflow CVE-2026-30911 🚨 HIGH 3.1.7 3.1.8
apache-airflow-providers-http CVE-2025-69219 🚨 HIGH 5.6.4 6.0.0
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5, 5.29.6
pyOpenSSL CVE-2026-27459 🚨 HIGH 24.1.0 26.0.0
pyasn1 CVE-2026-30922 🚨 HIGH 0.6.2 0.6.3
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
tornado CVE-2026-31958 🚨 HIGH 6.5.4 6.5.5
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (2)

Package Vulnerability ID Severity Installed Version Fixed Version
stdlib CVE-2025-68121 🔥 CRITICAL v1.25.6 1.24.13, 1.25.7, 1.26.0-rc.3
stdlib CVE-2026-25679 🚨 HIGH v1.25.6 1.25.8, 1.26.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

@github-actions
Contributor

Jest test Coverage

UI tests summary

Lines: 64% · Statements: 64.85% (58803/90674) · Branches: 44.42% (30869/69489) · Functions: 47.67% (9333/19576)

@github-actions
Contributor

🔴 Playwright Results — 1 failure(s), 17 flaky

✅ 2827 passed · ❌ 1 failed · 🟡 17 flaky · ⏭️ 194 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 453 0 2 2
🔴 Shard 3 613 1 3 30
🟡 Shard 4 618 0 6 47
✅ Shard 5 587 0 0 67
🟡 Shard 6 556 0 6 48

Genuine Failures (failed on all attempts)

Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3)
Error: expect(locator).toContainText(expected) failed

Locator: locator('[data-testid="default-persona-chip"] [data-testid="tag-chip"]').first()
Expected substring: "PW Persona 3b797077"
Received string:    "persona 33ce7448"
Timeout: 15000ms

Call log:
  - Expect "toContainText" with timeout 15000ms
  - waiting for locator('[data-testid="default-persona-chip"] [data-testid="tag-chip"]').first()
    19 × locator resolved to <div class="ant-col" data-testid="tag-chip">…</div>
       - unexpected value "persona 33ce7448"

🟡 17 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › Glossary Term - customization should work (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Flow/PersonaDeletionUserProfile.spec.ts › User profile loads correctly before and after persona deletion (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set and remove default persona should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Spreadsheet (shard 4, 1 retry)
  • Pages/DomainAdvanced.spec.ts › User with domain access can view subdomains (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Create domains and add assets (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain with data products attached at domain and subdomain levels (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Delete Container (shard 4, 1 retry)
  • Pages/Glossary.spec.ts › Column dropdown drag-and-drop functionality for Glossary Terms table (shard 6, 1 retry)
  • Pages/HyperlinkCustomProperty.spec.ts › should display URL when no display text is provided (shard 6, 1 retry)
  • Pages/Login.spec.ts › Refresh should work (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace
