Add a connection object cache #231

@augusto-herrmann

Description

Currently, only the connection id is passed between DbToDbOperator, DbToDbHook, copy_db_to_db, etc. This means that every time a piece of code needs a connection, it has to retrieve it again, which requires accessing the Airflow database (or HashiCorp Vault, if that is in use).

This happens multiple times within a single task, which generates a lot of overhead. It is visible in the task logs, where the message INFO - Retrieving connection 'example_connection' appears several times for the same connection.

We should cache the connection object so that it is retrieved only the first time it is needed and then reused by all of those functions.
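A minimal sketch of what such a cache could look like, assuming a per-task object that memoizes connections by id. The names `ConnectionCache`, `fetch` and `fake_fetch` are hypothetical and not part of the current code base; in real code `fetch` could be Airflow's `BaseHook.get_connection`, but a stand-in is used here so the example runs without Airflow.

```python
from typing import Any, Callable, Dict


class ConnectionCache:
    """Memoize connection objects by connection id (hypothetical helper)."""

    def __init__(self, fetch: Callable[[str], Any]) -> None:
        # `fetch` performs the actual lookup. Assumption: in Airflow this
        # could be BaseHook.get_connection; here it is injected for testing.
        self._fetch = fetch
        self._connections: Dict[str, Any] = {}

    def get(self, conn_id: str) -> Any:
        # Hit the backend only the first time a given id is requested.
        if conn_id not in self._connections:
            self._connections[conn_id] = self._fetch(conn_id)
        return self._connections[conn_id]


# Stand-in for the real lookup; records every call it receives.
calls = []


def fake_fetch(conn_id: str) -> dict:
    calls.append(conn_id)
    return {"conn_id": conn_id}


cache = ConnectionCache(fake_fetch)
first = cache.get("example_connection")
second = cache.get("example_connection")  # served from the cache, no new lookup
```

One cache instance would be created per task and passed along to the hook and helper functions instead of (or alongside) the bare connection id, so the cached object does not outlive the task and pick up stale credentials.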


Labels: enhancement
