Open
Description
This is odd: x = CUDA.zeros(1,4) works fine, but x = transpose(CUDA.zeros(4)) triggers the scalar indexing problem, even though both have size (1, 4).
MWE:
using KernelAbstractions, CUDA
import DifferentiationInterface as DI
@kernel function foo!(y, x)
    i = @index(Global)
    a = 2*i - 1
    b = 2*i
    offset = (i-1)*4
    y[a] = (offset+1)*x[a] + (offset+2)*x[b]
    y[b] = (offset+3)*x[a] + (offset+4)*x[b]
end
kernel! = foo!(CUDA.CUDABackend())
f!(y,x) = kernel!(y, x, ndrange=2)
# This works fine:
x = CUDA.zeros(1,4)
y = CUDA.rand(1,4)
prep = DI.prepare_jacobian(f!, y, DI.AutoForwardFromPrimitive(DI.AutoForwardDiff()), x);
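jac = CUDA.zeros(4, 4)  # preallocated Jacobian buffer (assumed; jac was not defined in the original MWE)
DI.value_and_jacobian!(f!, y, jac, prep, DI.AutoForwardFromPrimitive(DI.AutoForwardDiff()), x)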
#= Output:
4×4 CuArray{Float32, 2, CUDA.DeviceMemory}:
1.0 2.0 0.0 0.0
3.0 4.0 0.0 0.0
0.0 0.0 5.0 6.0
0.0 0.0 7.0 8.0
=#
# This causes scalar indexing:
x = transpose(CUDA.zeros(4))
y = transpose(CUDA.rand(4))
prep = DI.prepare_jacobian(f!, y, DI.AutoForwardFromPrimitive(DI.AutoForwardDiff()), x);
DI.value_and_jacobian!(f!, y, jac, prep, DI.AutoForwardFromPrimitive(DI.AutoForwardDiff()), x)