
Support Julia 1.13 with fix for @device_functions macro #3031

Open
KSepetanc wants to merge 17 commits into JuliaGPU:master from KSepetanc:eschnett/julia-1.13

Conversation

@KSepetanc

Closes #3019.

@eschnett asked me to create a new duplicate PR #3031 of his PR #3020, but with a fix for the @device_functions macro. He couldn't test whether the fix works, as I had made my PR against his fork, which does not have CI infrastructure.

@eschnett
Contributor

Well, I didn't really ask for a duplicate PR. I suggested either merging them into CUDA.jl as two separate, sequential PRs, or – if you want to merge them as a single PR into CUDA.jl – creating such a PR. I don't care either way; using two separate PRs seems simpler, but I leave the choice up to you.

@KSepetanc
Author

KSepetanc commented Feb 18, 2026

The way I see it, this is the second option, i.e. a single PR with both changes.

The @device_functions macro is broken by changes in Julia 1.13, so it makes sense to include the fix in the PR for 1.13 support.
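For readers who haven't used it: `@device_functions` wraps a block of definitions intended to run only on the GPU. A minimal usage sketch follows; it assumes the macro accepts a `begin`/`end` block and is reachable as `CUDA.@device_functions`, and the function name and body are purely illustrative, not taken from CUDA.jl's sources:

```julia
using CUDA

# Illustrative sketch only: @device_functions is applied to a block of
# definitions meant to be compiled for the GPU, so that accidentally calling
# one on the host fails cleanly rather than hitting a device intrinsic.
CUDA.@device_functions begin
    clamp01(x::Float32) = min(max(x, 0f0), 1f0)
end
```

It is this walking of the block's expression tree that was affected by the 1.13 changes this PR works around.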

@codecov

codecov bot commented Feb 19, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.35%. Comparing base (7a27d77) to head (aa08bb6).

Files with missing lines Patch % Lines
lib/nvml/NVML.jl 50.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3031      +/-   ##
==========================================
- Coverage   89.46%   89.35%   -0.12%     
==========================================
  Files         148      148              
  Lines       13047    13044       -3     
==========================================
- Hits        11673    11655      -18     
- Misses       1374     1389      +15     
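As a sanity check, the head-coverage figure follows directly from the line counts in the diff above (a quick sketch using the reported numbers):

```julia
# Coverage = covered lines / total lines, from the Codecov diff above.
hits, lines = 11655, 13044
round(100 * hits / lines; digits = 2)  # 89.35, matching "Project coverage is 89.35%"
```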

☔ View full report in Codecov by Sentry.

Contributor

@github-actions github-actions bot left a comment

CUDA.jl Benchmarks

Details
| Benchmark suite | Current: aa08bb6 | Previous: 7a27d77 | Ratio |
|---|---|---|---|
| latency/precompile | 43987599741.5 ns | 44455759835 ns | 0.99 |
| latency/ttfp | 13112185336 ns | 13140153243 ns | 1.00 |
| latency/import | 3768904795 ns | 3755312424 ns | 1.00 |
| integration/volumerhs | 9441187.5 ns | 9442840 ns | 1.00 |
| integration/byval/slices=1 | 145953 ns | 145598 ns | 1.00 |
| integration/byval/slices=3 | 423132 ns | 422554 ns | 1.00 |
| integration/byval/reference | 144068 ns | 143811 ns | 1.00 |
| integration/byval/slices=2 | 284567 ns | 284011 ns | 1.00 |
| integration/cudadevrt | 102730.5 ns | 102397 ns | 1.00 |
| kernel/indexing | 13556 ns | 13434 ns | 1.01 |
| kernel/indexing_checked | 14190 ns | 13908 ns | 1.02 |
| kernel/occupancy | 656.4909090909091 ns | 644.5636363636364 ns | 1.02 |
| kernel/launch | 2165.4 ns | 2090.3 ns | 1.04 |
| kernel/rand | 14651 ns | 14479 ns | 1.01 |
| array/reverse/1d | 19132 ns | 18661 ns | 1.03 |
| array/reverse/2dL_inplace | 66383 ns | 66252 ns | 1.00 |
| array/reverse/1dL | 69365 ns | 68893 ns | 1.01 |
| array/reverse/2d | 20812.5 ns | 21087 ns | 0.99 |
| array/reverse/1d_inplace | 10603.166666666668 ns | 10503.833333333332 ns | 1.01 |
| array/reverse/2d_inplace | 11726 ns | 11399.5 ns | 1.03 |
| array/reverse/2dL | 72872 ns | 73163 ns | 1.00 |
| array/reverse/1dL_inplace | 66310 ns | 66146 ns | 1.00 |
| array/copy | 18407 ns | 18502.5 ns | 0.99 |
| array/iteration/findall/int | 145156.5 ns | 146476.5 ns | 0.99 |
| array/iteration/findall/bool | 130342 ns | 130795 ns | 1.00 |
| array/iteration/findfirst/int | 83970.5 ns | 84133 ns | 1.00 |
| array/iteration/findfirst/bool | 81352 ns | 81624.5 ns | 1.00 |
| array/iteration/scalar | 66441 ns | 65804 ns | 1.01 |
| array/iteration/logical | 196855 ns | 198187.5 ns | 0.99 |
| array/iteration/findmin/1d | 84288 ns | 86504 ns | 0.97 |
| array/iteration/findmin/2d | 116696 ns | 117154 ns | 1.00 |
| array/reductions/reduce/Int64/1d | 38815 ns | 41088.5 ns | 0.94 |
| array/reductions/reduce/Int64/dims=1 | 42194.5 ns | 52190.5 ns | 0.81 |
| array/reductions/reduce/Int64/dims=2 | 59158 ns | 59179 ns | 1.00 |
| array/reductions/reduce/Int64/dims=1L | 87315 ns | 87126 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84580 ns | 84418.5 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 34125.5 ns | 34001 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1 | 40912 ns | 39890 ns | 1.03 |
| array/reductions/reduce/Float32/dims=2 | 56426.5 ns | 55899 ns | 1.01 |
| array/reductions/reduce/Float32/dims=1L | 51790 ns | 51535 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 70031 ns | 69798 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 38957 ns | 40980.5 ns | 0.95 |
| array/reductions/mapreduce/Int64/dims=1 | 51415 ns | 41741 ns | 1.23 |
| array/reductions/mapreduce/Int64/dims=2 | 58881.5 ns | 59036 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1L | 87459 ns | 87134 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 84594 ns | 84427 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 33985 ns | 33457 ns | 1.02 |
| array/reductions/mapreduce/Float32/dims=1 | 39625 ns | 48711 ns | 0.81 |
| array/reductions/mapreduce/Float32/dims=2 | 56741 ns | 55941 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=1L | 51514 ns | 51352 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 69443.5 ns | 68956 ns | 1.01 |
| array/broadcast | 20582 ns | 20251 ns | 1.02 |
| array/copyto!/gpu_to_gpu | 10606.333333333334 ns | 10684.333333333334 ns | 0.99 |
| array/copyto!/cpu_to_gpu | 213329 ns | 214898 ns | 0.99 |
| array/copyto!/gpu_to_cpu | 284108 ns | 281876 ns | 1.01 |
| array/accumulate/Int64/1d | 117753 ns | 118336 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 79494 ns | 79780 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 155312.5 ns | 155968.5 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1698684 ns | 1694089 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 960326.5 ns | 960949 ns | 1.00 |
| array/accumulate/Float32/1d | 100444 ns | 100823 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 76182 ns | 76350 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 144330.5 ns | 144365 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1584804 ns | 1584729 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 656644 ns | 656302 ns | 1.00 |
| array/construct | 1334.6 ns | 1283.1 ns | 1.04 |
| array/random/randn/Float32 | 36132 ns | 36610 ns | 0.99 |
| array/random/randn!/Float32 | 30242 ns | 30335 ns | 1.00 |
| array/random/rand!/Int64 | 27134.5 ns | 26934 ns | 1.01 |
| array/random/rand!/Float32 | 8286.333333333334 ns | 8186.666666666667 ns | 1.01 |
| array/random/rand/Int64 | 36919 ns | 30201.5 ns | 1.22 |
| array/random/rand/Float32 | 12506 ns | 12396 ns | 1.01 |
| array/permutedims/4d | 55908 ns | 52729 ns | 1.06 |
| array/permutedims/2d | 52565 ns | 52645 ns | 1.00 |
| array/permutedims/3d | 52683 ns | 53080 ns | 0.99 |
| array/sorting/1d | 2735353.5 ns | 2736443 ns | 1.00 |
| array/sorting/by | 3305047.5 ns | 3305811 ns | 1.00 |
| array/sorting/2d | 1068187 ns | 1071655.5 ns | 1.00 |
| cuda/synchronization/stream/auto | 983.6470588235294 ns | 1034.5263157894738 ns | 0.95 |
| cuda/synchronization/stream/nonblocking | 8144.2 ns | 7705.9 ns | 1.06 |
| cuda/synchronization/stream/blocking | 823.6262626262626 ns | 784.4516129032259 ns | 1.05 |
| cuda/synchronization/context/auto | 1133.7 ns | 1133.5 ns | 1.00 |
| cuda/synchronization/context/nonblocking | 8208.9 ns | 7594.6 ns | 1.08 |
| cuda/synchronization/context/blocking | 905.6326530612245 ns | 885.6792452830189 ns | 1.02 |
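The Ratio column is current time divided by previous time, so values below 1.00 indicate the new commit is faster on that benchmark. For example, for the latency/precompile row (a quick sketch using the numbers above):

```julia
# Ratio = current / previous, per the latency/precompile row above.
current, previous = 43987599741.5, 44455759835
round(current / previous; digits = 2)  # 0.99
```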

This comment was automatically generated by a workflow using github-action-benchmark.



Development

Successfully merging this pull request may close these issues.

Cannot load CUDA.jl with Julia 1.13

2 participants