Implement parallel cuda::std::copy_if #7518

Open
miscco wants to merge 1 commit into NVIDIA:main from miscco:parallel_copy_if

Conversation

@miscco
Contributor

@miscco miscco commented Feb 5, 2026

This implements the copy_if algorithm for the CUDA backend.

It provides tests and benchmarks similar to those in Thrust, plus some boilerplate for libcu++.

The functionality is not publicly available yet and is implemented in a private internal header.

Fixes #7517
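
For readers unfamiliar with the algorithm, here is a minimal host-side sketch of the semantics that copy_if provides: elements satisfying a predicate are copied to the output in their original relative order. It uses the sequential std::copy_if from the C++ standard library, not the CUDA backend added by this PR. The helper names `filter_copy` and `select_even_example` are hypothetical and exist only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper for illustration only: keeps the elements of `input`
// that satisfy `pred`, preserving their relative order, as std::copy_if does.
template <class T, class Pred>
std::vector<T> filter_copy(const std::vector<T>& input, Pred pred)
{
  std::vector<T> output;
  output.reserve(input.size()); // upper bound on the number of selected elements
  std::copy_if(input.begin(), input.end(), std::back_inserter(output), pred);
  assert(output.size() <= input.size()); // copy_if never produces more than it reads
  return output;
}

// Hypothetical example: select the even values from a small sequence.
inline std::vector<int> select_even_example()
{
  const std::vector<int> input{1, 2, 3, 4, 5, 6};
  return filter_copy(input, [](int x) { return x % 2 == 0; });
}
```

Calling `select_even_example()` yields the values 2, 4, 6 in that order. A parallel implementation has to produce the same ordered result, which is why it generally needs to work out each selected element's output position before writing it.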

@miscco miscco requested review from a team as code owners February 5, 2026 18:52
@github-project-automation github-project-automation bot moved this to Todo in CCCL Feb 5, 2026
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Feb 5, 2026

Contributor

@pciolkosz pciolkosz left a comment

Otherwise same feedback as in #7513

__stream.get());

{
__allocation_guard<_OffsetType, decltype(__resource)> __guard{__stream, __resource, __num_bytes};
Collaborator

question: Can't this just be a cuda::buffer?

Contributor Author

Unfortunately not: the buffer owns the resource, and we only want a reference.

Collaborator

@jrhemstad jrhemstad Feb 16, 2026

That can't be right. If we designed cuda::buffer in a way that it can't be used for temporary allocation like this, then something fundamental is wrong. Why can't I pass a resource_ref into the buffer ctor?
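
The distinction discussed in this thread, a temporary allocation that borrows a resource by reference instead of owning it, can be sketched on the host with the standard std::pmr interfaces. This is only an analogy under stated assumptions: `temp_allocation_guard`, `counting_resource`, and `live_allocations_inside_scope` are hypothetical names, and nothing here reflects the actual `__allocation_guard`, `cuda::buffer`, or `resource_ref` APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>

// Hypothetical RAII guard (not the PR's __allocation_guard): it borrows a
// memory resource by reference and returns the bytes when it leaves scope.
class temp_allocation_guard
{
public:
  temp_allocation_guard(std::pmr::memory_resource& resource, std::size_t num_bytes)
      : resource_(resource), num_bytes_(num_bytes), ptr_(resource.allocate(num_bytes))
  {}
  ~temp_allocation_guard() { resource_.deallocate(ptr_, num_bytes_); }

  // Non-copyable so the allocation is released exactly once.
  temp_allocation_guard(const temp_allocation_guard&)            = delete;
  temp_allocation_guard& operator=(const temp_allocation_guard&) = delete;

  void* data() const noexcept { return ptr_; }

private:
  std::pmr::memory_resource& resource_; // borrowed, never owned
  std::size_t num_bytes_;
  void* ptr_;
};

// Hypothetical resource that counts outstanding allocations, used only to
// observe the guard's behaviour.
class counting_resource : public std::pmr::memory_resource
{
public:
  int live_allocations = 0;

private:
  void* do_allocate(std::size_t bytes, std::size_t alignment) override
  {
    ++live_allocations;
    return std::pmr::new_delete_resource()->allocate(bytes, alignment);
  }
  void do_deallocate(void* p, std::size_t bytes, std::size_t alignment) override
  {
    --live_allocations;
    std::pmr::new_delete_resource()->deallocate(p, bytes, alignment);
  }
  bool do_is_equal(const std::pmr::memory_resource& other) const noexcept override
  {
    return this == &other;
  }
};

// Returns the number of live allocations while the guard is still alive.
inline int live_allocations_inside_scope(counting_resource& resource)
{
  temp_allocation_guard guard{resource, 256};
  assert(guard.data() != nullptr);
  return resource.live_allocations;
}
```

In this sketch the caller keeps ownership of the resource, and the guard only guarantees that the temporary bytes are released at scope exit. Whether a buffer type can offer the same behaviour by accepting a non-owning reference is the open question raised above.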

@miscco miscco changed the title Implement parallel cuda::std::copy_if Implement parallel cuda::std::copy_if Feb 16, 2026
@miscco miscco force-pushed the parallel_copy_if branch 2 times, most recently from a77c254 to 4e79b30 Compare February 16, 2026 13:43
@miscco miscco force-pushed the parallel_copy_if branch 2 times, most recently from 8f8b8a2 to c2bd9bf Compare February 17, 2026 10:06

This implements the copy_if algorithm for the CUDA backend.

* std::copy_if, see https://en.cppreference.com/w/cpp/algorithm/copy_if.html

It provides tests and benchmarks similar to those in Thrust, plus some boilerplate for libcu++.

The functionality is not publicly available yet and is implemented in a private internal header.

Fixes NVIDIA#7517
@github-actions
Contributor

😬 CI Workflow Results

🟥 Finished in 1h 10m: Pass: 86%/95 | Total: 15h 55m | Max: 42m 55s | Hits: 98%/214832

See results here.

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

[FEA]: Implement CUDA backend for parallel cuda::std::copy_if

4 participants