This optimisation is conceptually very similar to how `concat` is fused into `scatter`. This is important for algorithms that produce many points to be reduced per item. A fairly minimal example of something that should fuse but currently does not is:
```futhark
entry test n =
  let (is, vs) = tabulate n (\i -> ((i, (i+1) % n), (1f32, 1f32))) |> unzip
  let unzip_concat as = unzip as |> uncurry concat
  in hist (+) 0f32 n (unzip_concat is) (unzip_concat vs)
```
Here the lack of fusion is not too problematic, as the number of items to reduce scales with the size of the array, but once the number of items to reduce per element grows beyond that, there can be some really bad scaling issues, as `(is, vs)` needs to be manifested due to the lack of fusion. Unlike `scatter`, where the number of points to be scattered is usually less than or equal to the size of the destination array, in `reduce_by_index` this is often not the case.
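For illustration, here is a NumPy sketch of the same pattern (the function names `unfused` and `fused` are made up for this sketch, and `np.add.at` stands in for the generalised histogram): the unfused version manifests the concatenated index and value arrays in memory before reducing, while the fused version accumulates each contribution as it is produced.

```python
import numpy as np

def unfused(n):
    # Each item i produces two (index, value) contributions,
    # mirroring the Futhark example above.
    i = np.arange(n)
    is_pairs = np.stack([i, (i + 1) % n], axis=1)      # shape (n, 2)
    vs_pairs = np.ones((n, 2), dtype=np.float32)
    # The concatenated arrays are manifested in memory ...
    flat_is = is_pairs.T.reshape(-1)  # like `unzip |> uncurry concat`
    flat_vs = vs_pairs.T.reshape(-1)
    hist = np.zeros(n, dtype=np.float32)
    np.add.at(hist, flat_is, flat_vs)  # ... before being reduced
    return hist

def fused(n):
    # With concat fused into the histogram, contributions are
    # accumulated directly; no intermediate arrays are manifested.
    hist = np.zeros(n, dtype=np.float32)
    i = np.arange(n)
    np.add.at(hist, i, np.float32(1))
    np.add.at(hist, (i + 1) % n, np.float32(1))
    return hist
```

Both compute the same result, but the unfused version allocates intermediate arrays whose size scales with the total number of contributions rather than with the input.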
I have some code that transfers point-value pairs onto a coarser grid using polynomial interpolation, and this produces many items to reduce per point transferred, especially in higher dimensions. This is necessary for implementing Particle Mesh Ewald (PME) summation, which is a common method in Molecular Dynamics (MD), and more generally for solving linear partial differential equations with complex geometry.
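To give a sense of the per-point expansion, here is a NumPy sketch (the function `spread_point` and its tent-shaped weights are placeholders invented for this illustration; real PME uses cubic B-spline weights) of spreading one 3D point onto its 4x4x4 neighbourhood of grid bins:

```python
import numpy as np

def spread_point(point, grid_shape):
    """Spread one 3D point onto a 4x4x4 block of grid bins.

    Illustrative only: uses a placeholder tent-like weight instead
    of the cubic B-spline used in real PME.
    """
    base = np.floor(point).astype(np.int64) - 1  # corner of 4x4x4 block
    offs = np.arange(4)
    frac = point - np.floor(point)
    # Per-axis weights, normalised so each axis sums to 1.
    w_ax = [np.maximum(0.0, 1.0 - np.abs(offs - 1 - frac[d]) / 2.0)
            for d in range(3)]
    w_ax = [w / w.sum() for w in w_ax]
    # Outer product of the three axes: 4*4*4 = 64 weights per point.
    weights = np.einsum('i,j,k->ijk', *w_ax)
    ii, jj, kk = np.meshgrid(offs, offs, offs, indexing='ij')
    idx = np.stack([(base[0] + ii) % grid_shape[0],
                    (base[1] + jj) % grid_shape[1],
                    (base[2] + kk) % grid_shape[2]], axis=-1)
    return idx.reshape(-1, 3), weights.reshape(-1)

# One input point expands to 64 (index, weight) pairs, so without
# fusion the manifested intermediate is ~64x the input point data.
idx, w = spread_point(np.array([2.3, 5.7, 1.1]), (16, 16, 16))
```

Every point produces 64 (index, value) contributions to be reduced, which is exactly the situation where the intermediate arrays dwarf both the input and the destination grid.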
As an example of the scaling issues, here are the results for 3D cubic interpolation, where each point is reduced into 4x4x4 bins with different weights. The grid size is kept constant; only the number of points to be reduced is changed.
```
grid_test.fut:cubic_bench (no tuning file):
[3][100000]i32:        4475μs (95% CI: [ 4464.6,  4510.0])
[3][1000000]i32:      40286μs (95% CI: [39761.9, 42289.3])
[3][10000000]i32:  69413528μs |███ | 3 runs ^C
```
The smaller systems are also slower than expected, but as the array that needs to be manifested grows, the function becomes unusably slow. In the worst case here, the intermediate data, which would be fused away, is far larger than the input data or the result grid.