@steveahnahn
Contributor

Problem

The previous implementation fetched all task instances from the database and then filtered them by map_index in Python. For DAGs whose mapped tasks expand into a large number of map indices, this caused unnecessary database load and memory usage.

Solution

Push the map_index filter to the SQL query, allowing the database to handle filtering efficiently:

  • Move map_index filtering from Python to SQL in get_task_instance_states and get_task_instance_count endpoints
  • Add map_index parameter to _get_group_tasks helper function to filter at the database level

@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Jan 9, 2026
@SameerMesiah97
Contributor

This looks good at first glance. But as the filter based on map index is a new addition (and likely not to be covered by existing tests), I think it would be worth adding a small test to lock in the expected behavior and guard against future regressions.

In particular, it would be a good idea to cover:

  1. A mapped task within a task group that produces multiple task instances (i.e. multiple map indices for the same task ID).
  2. The default behavior when map_index is not provided, ensuring all relevant task instances are returned (including unmapped tasks).

For (1), you could test both code paths by passing and omitting task_group_id. If feasible (and if it doesn't make the test too bulky), you could cover all scenarios in a single parametrized test.
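As a rough sketch of the shape such a test could take, the table-driven example below covers the combinations of task_group_id and map_index. It uses plain Python with an in-memory list instead of the real endpoints and fixtures; TASK_INSTANCES, select_instances, and CASES are hypothetical names for illustration, and in the real suite the case table would more likely be passed to pytest.mark.parametrize.

```python
# Hypothetical stand-in rows: (task_id, task_group_id, map_index).
# map_index -1 marks an unmapped task instance.
TASK_INSTANCES = [
    ("setup", None, -1),
    ("group1.process", "group1", 0),
    ("group1.process", "group1", 1),
    ("group1.process", "group1", 2),
]


def select_instances(task_group_id=None, map_index=None):
    """Return the rows matching the optional group and map_index filters."""
    selected = TASK_INSTANCES
    if task_group_id is not None:
        selected = [ti for ti in selected if ti[1] == task_group_id]
    if map_index is not None:
        selected = [ti for ti in selected if ti[2] == map_index]
    return selected


# Each case: (task_group_id, map_index, expected_count)
CASES = [
    (None, None, 4),      # default: everything, including the unmapped task
    ("group1", None, 3),  # group only: all map indices of the mapped task
    ("group1", 1, 1),     # group plus a single map index
    (None, 1, 1),         # map index without a task group
]


def run_cases():
    """Check every case and return how many were verified."""
    for task_group_id, map_index, expected in CASES:
        got = len(select_instances(task_group_id, map_index))
        assert got == expected, (task_group_id, map_index, got, expected)
    return len(CASES)
```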

@steveahnahn
Contributor Author

Thanks for the review. The scenarios you mentioned are already covered in airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_task_instances.py:

  • test_get_count_mix_of_task_and_task_group_dynamic_task_mapping
  • test_get_task_states_mix_of_task_and_task_group_dynamic_task_mapping

The existing parametrized cases cover mapped tasks with multiple map indices, default behavior without map_index, and filtering with task_group_id. These tests will now exercise the new db-level filtering path.

I did end up adding one missing test combination: filtering by map_index without providing task_group_id.
