Migrate Python/Cython bindings from device_memory_resource* to resource_ref / any_resource
Part of #2011. Continues the work scoped in #2209.
Problem
The Python/Cython layer is entirely coupled to device_memory_resource*:
- DeviceMemoryResource stores shared_ptr[device_memory_resource] and exposes get_mr() -> device_memory_resource*
- All adaptor constructor declarations in librmm/memory_resource.pxd take device_memory_resource*, even though the C++ headers already accept device_async_resource_ref
- device_buffer.pxd constructor declarations use device_memory_resource*
- per_device_resource.pxd only declares the pointer-based get_current_device_resource() / set_current_device_resource() APIs
- 54 total references to device_memory_resource across 7 Cython files
- Zero references to resource_ref, any_resource, or device_async_resource_ref in any .pxd/.pyx file
This coupling blocks removal of device_memory_resource from C++. By migrating
Python first, subsequent C++ removal becomes a pure C++ change with no
cross-language coordination.
Goal
After this work:
- DeviceMemoryResource stores an any_device_resource (any_resource<device_accessible>) instead of shared_ptr[device_memory_resource]
- All Cython .pxd declarations match the actual C++ signatures (device_async_resource_ref, not device_memory_resource*)
- The Python-side set_per_device_resource calls the *_ref C++ API
- No .pxd or .pyx file references device_memory_resource
- The Python user-facing API is unchanged (backward compatible)
device_memory_resource still exists in C++ and all resources still inherit from
it. We are only cutting the Python-side dependency.
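The backward-compatibility goal can be spot-checked with a small smoke test. The sketch below is illustrative only: check_user_facing_api is a hypothetical helper name, and it assumes the existing public calls rmm.mr.CudaMemoryResource, rmm.mr.PoolMemoryResource, rmm.mr.get_current_device_resource, rmm.mr.set_current_device_resource, and rmm.DeviceBuffer keep their current behavior. The module is passed in as an argument so the check can also run against a stub on a machine without a GPU.

```python
def check_user_facing_api(rmm):
    """Exercise a few public calls that should behave the same after the migration.

    `rmm` is the imported rmm package (or a stand-in exposing the same names).
    This is a hypothetical helper, not part of RMM.
    """
    upstream = rmm.mr.CudaMemoryResource()
    # 1 MiB initial pool; the size is an arbitrary choice for this sketch.
    pool = rmm.mr.PoolMemoryResource(upstream, initial_pool_size=2**20)

    previous = rmm.mr.get_current_device_resource()
    rmm.mr.set_current_device_resource(pool)
    try:
        # The resource we just installed should be what Python hands back.
        current = rmm.mr.get_current_device_resource()
        assert isinstance(current, rmm.mr.PoolMemoryResource)

        # DeviceBuffer allocates through the current device resource.
        buf = rmm.DeviceBuffer(size=1024)
        assert buf.size == 1024
        del buf
    finally:
        # Restore whatever resource was active before the check.
        rmm.mr.set_current_device_resource(previous)
    return True
```

With RMM installed on a GPU machine, this would be called as check_user_facing_api(rmm) after import rmm, before and after the migration.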
Design
The DeviceMemoryResource base class is retained in Python. Its internal storage
changes from shared_ptr[device_memory_resource] to any_device_resource. Every
concrete resource class (CudaMemoryResource, PoolMemoryResource, etc.)
constructs its C++ resource and stores it as an any_device_resource.
any_resource<device_accessible> is an owning, type-erased, copyable CCCL type
that subsumes the role of both shared_ptr and shared_resource from Python's
perspective. No shared_resource_wrapper or other indirection is needed.
To pass resources into C++ APIs that accept device_async_resource_ref, the
any_resource converts implicitly (it supports conversion to resource_ref).
Cython Limitations
- Cython cannot stack-allocate C++ template classes without a verifiable nullary constructor. any_device_resource should be declared via a C++ typedef (e.g., using any_device_resource = cuda::mr::any_resource<cuda::mr::device_accessible>) and wrapped in Cython as an opaque type, or stored behind unique_ptr if needed.
- Resource refs must be constructed inline at call sites to avoid Cython's nullary constructor requirement.
Tasks
1. Add CCCL type declarations to Cython .pxd files
Add cdef extern declarations for:
- device_async_resource_ref (from rmm/resource_ref.hpp)
- any_device_resource typedef for cuda::mr::any_resource<cuda::mr::device_accessible>
Files:
- python/rmm/rmm/librmm/memory_resource.pxd
- python/rmm/rmm/librmm/per_device_resource.pxd
2. Update Cython .pxd declarations to match actual C++ signatures
The C++ adaptor constructors already take device_async_resource_ref, not
device_memory_resource*. Update the Cython declarations to be truthful.
Files and changes:
- python/rmm/rmm/librmm/memory_resource.pxd -- change all adaptor constructor parameters from device_memory_resource* to device_async_resource_ref (pool, arena, fixed_size, binning, limiting, logging, statistics, tracking, failure_callback, prefetch, aligned, thread_safe, callback)
- python/rmm/rmm/librmm/device_buffer.pxd -- change device_buffer constructor parameters from device_memory_resource* to device_async_resource_ref
- python/rmm/rmm/librmm/device_uvector.pxd -- update memory_resource() return type
- python/rmm/rmm/librmm/per_device_resource.pxd -- add *_ref function declarations (set_per_device_resource_ref, get_current_device_resource_ref, etc.)
3. Migrate DeviceMemoryResource storage to any_device_resource
Files:
- python/rmm/rmm/pylibrmm/memory_resource/_memory_resource.pxd -- change c_obj from shared_ptr[device_memory_resource] to any_device_resource; replace get_mr() with a method returning device_async_resource_ref
- python/rmm/rmm/pylibrmm/memory_resource/_memory_resource.pyx -- update all __cinit__ methods, allocate(), deallocate(), and per-device resource functions
Construction pattern changes from:
    self.c_obj.reset(new cuda_memory_resource())
to:
    self.c_obj = any_device_resource(cuda_memory_resource())
Allocation changes from:
    self.c_obj.get().allocate(stream.view(), nbytes)
to calling allocate through the any_device_resource interface.
4. Update device_buffer.pyx
Pass device_async_resource_ref (obtained from the any_device_resource) to
device_buffer constructors instead of device_memory_resource*.
File: python/rmm/rmm/pylibrmm/device_buffer.pyx
5. Switch per-device resource Python API to *_ref C++ functions
Call set_per_device_resource_ref() / set_current_device_resource_ref()
instead of the pointer-based variants.
File: python/rmm/rmm/pylibrmm/memory_resource/_memory_resource.pyx
6. Remove all device_memory_resource references from Python
- Remove device_memory_resource base class declarations from .pxd files
- Remove device_memory_resource cimports
- Remove pointer-based per-device-resource declarations from per_device_resource.pxd
Validation
- build-rmm-python succeeds
- All Python tests pass (test-rmm-python)
- No .pxd or .pyx file contains device_memory_resource
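The last validation item can be automated with a short scan. This is a minimal sketch, not an existing RMM script: count_legacy_references is a hypothetical helper, and it uses plain substring matching, so it would also flag any longer identifier that happens to contain device_memory_resource.

```python
from pathlib import Path

# Identifier that should no longer appear in the Cython sources.
LEGACY_NAME = "device_memory_resource"
CYTHON_SUFFIXES = {".pxd", ".pyx"}


def count_legacy_references(root, needle=LEGACY_NAME):
    """Return {relative_path: occurrence_count} for .pxd/.pyx files under
    `root` that still mention `needle`. An empty dict means the check passes."""
    root = Path(root)
    hits = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in CYTHON_SUFFIXES:
            count = path.read_text(encoding="utf-8").count(needle)
            if count:
                hits[path.relative_to(root).as_posix()] = count
    return hits
```

Run against the python/rmm tree before the migration, the per-file counts give a baseline comparable to the reference totals quoted in the Problem section; after task 6 the function should return an empty dict.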
References
- Use resource refs in Cython #2209 (draft PR with most of this work scoped)
- Refactor Cython to use resource_ref instead of device_memory_resource* #1500 (original issue: refactor Cython to use resource_ref)
- [FEA] Support memory resources from CCCL 3.2 #2011 (parent tracking issue)