Skip to content

Accumulators which don't implement retract_batch can still exhibit buggy behaviour #19612

@Jefffrey

Description

@Jefffrey

Describe the bug

Originally from this discussion: #19278 (comment)

It turns out aggregate functions can still be used with window frames even if they don't implement retract_batch, if the window frame is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. And if the accumulator consumes the internal state as part of evaluate, it can return incorrect results.

To Reproduce

In aggregate.slt, try this:

query ITRRR
SELECT
    timestamp,
    tags,
    value,
    median(value) OVER (
        PARTITION BY tags
        ORDER BY timestamp
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ),
    percentile_cont(value, 0.5) OVER (
        PARTITION BY tags
        ORDER BY timestamp
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    )
FROM median_window_test
ORDER BY tags, timestamp;
----
1 tag1 10 10 10
2 tag1 20 15 15
3 tag1 30 20 20
4 tag1 40 25 25
5 tag1 50 30 30
1 tag2 60 60 60
2 tag2 70 65 65
3 tag2 80 70 70
4 tag2 90 75 75
5 tag2 100 80 80

We would expect those last two columns to be the same. However the actual results are:

query ITRRR
SELECT
    timestamp,
    tags,
    value,
    median(value) OVER (
        PARTITION BY tags
        ORDER BY timestamp
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ),
    percentile_cont(value, 0.5) OVER (
        PARTITION BY tags
        ORDER BY timestamp
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    )
FROM median_window_test
ORDER BY tags, timestamp;
----
1 tag1 10 10 10
2 tag1 20 15 20
3 tag1 30 20 30
4 tag1 40 25 40
5 tag1 50 30 50
1 tag2 60 60 60
2 tag2 70 65 70
3 tag2 80 70 80
4 tag2 90 75 90
5 tag2 100 80 100
  • See last column; it exactly follows the input data because it keeps discarding its state

Expected behavior

I've identified the following aggregate functions which likely exhibit this behaviour:

  • percentile_cont
  • string_agg
  • distinct median

median was fixed by #19278

We need to fix these accumulators to not consume the internal state on evaluate. We also would want to update the documentation around accumulators to make this behaviour obvious so we don't fall into the same trap later.

Additional context

Comment on why consuming internal state is bad for windows: #19278 (comment)

Metadata

Metadata

Labels

bugSomething isn't working

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions