Summary
Propose adding multi-GPU data-parallel encoding to QDP so users can scale quantum state preparation across multiple GPUs. Currently, `QdpEngine` only supports a single `device_id`, which limits throughput for large batches and high qubit counts (e.g., 20+ qubits).
Motivation
- Current limitation: `QdpEngine::new(device_id: usize)` accepts only one GPU (qdp-core/src/lib.rs).
- Use case: High-qubit encoding (20+ qubits) and large batches hit single-GPU memory and compute limits; multi-GPU parallel encoding can increase throughput roughly in proportion to the number of GPUs.
- Alignment with PR #1000 ([QDP] Add a Quantum Data Loader and API refactor): the Quantum Data Loader provides batch-by-batch iteration, so multi-GPU support would allow batches to be distributed across GPUs for an end-to-end, high-throughput pipeline.
Proposed Design
- Batch routing: Distribute batches across GPUs (e.g., round-robin or workload-aware; a minimal routing sketch follows this list).
- Result aggregation: Merge outputs from each GPU into a single DLPack tensor (or keep a distributed representation for downstream use).
- Stream management: Each GPU uses its own CUDA stream to avoid synchronization bottlenecks.
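To make the routing policy concrete, here is a minimal round-robin sketch in Rust; the `RoutedBatch` type and `route_round_robin` helper are hypothetical names used only for illustration, not existing qdp-core APIs:

```rust
/// Hypothetical routing record: which device a given batch should be encoded on.
#[derive(Debug)]
struct RoutedBatch {
    batch_index: usize,
    device_id: usize,
}

/// Round-robin policy: batch i goes to device i % num_devices.
/// A workload-aware policy could later replace this without changing callers.
fn route_round_robin(num_batches: usize, device_ids: &[usize]) -> Vec<RoutedBatch> {
    (0..num_batches)
        .map(|i| RoutedBatch {
            batch_index: i,
            device_id: device_ids[i % device_ids.len()],
        })
        .collect()
}

fn main() {
    // Four batches over two GPUs land on devices 0, 1, 0, 1.
    for routed in route_round_robin(4, &[0, 1]) {
        println!("batch {} -> GPU {}", routed.batch_index, routed.device_id);
    }
}
```

Each device would then encode its assigned batches on its own CUDA stream, and the per-GPU outputs would be merged into a single DLPack tensor (or kept distributed) in the aggregation step.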
Scope
qdp-core (Rust)
- Add a multi-GPU engine abstraction (e.g., `QdpEnginePool`) to manage multiple `QdpEngine` instances.
- Implement `encode_batch_distributed` to split a single batch across GPUs or assign different batches to different GPUs (see the pool sketch after this list).
- Use `rayon` or `std::thread` for CPU-side coordination.
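A minimal sketch of what `QdpEnginePool` and `encode_batch_distributed` could look like, assuming the existing `QdpEngine::new(device_id)` constructor and a hypothetical per-engine `encode` method; the placeholder types, the `std::thread::scope` coordination, and the error handling are illustrative only:

```rust
use std::thread;

// Stand-in for the existing qdp-core engine: only `new(device_id)` mirrors the
// current API; `encode` is a placeholder for the single-GPU encoding entry point.
struct QdpEngine {
    device_id: usize,
}

impl QdpEngine {
    fn new(device_id: usize) -> Self {
        Self { device_id }
    }

    fn encode(&self, batch: &[f64]) -> Vec<f64> {
        // Real implementation would launch kernels on this device's own CUDA stream.
        let _ = self.device_id;
        batch.to_vec()
    }
}

/// Proposed multi-GPU wrapper: one `QdpEngine` per device id.
struct QdpEnginePool {
    engines: Vec<QdpEngine>,
}

impl QdpEnginePool {
    /// Assumes at least one device id is provided.
    fn new(device_ids: &[usize]) -> Self {
        Self {
            engines: device_ids.iter().map(|&id| QdpEngine::new(id)).collect(),
        }
    }

    /// Round-robin each batch to an engine, encode concurrently on host threads,
    /// and return the results in the original batch order.
    fn encode_batch_distributed(&self, batches: &[Vec<f64>]) -> Vec<Vec<f64>> {
        thread::scope(|s| {
            let handles: Vec<_> = batches
                .iter()
                .enumerate()
                .map(|(i, batch)| {
                    let engine = &self.engines[i % self.engines.len()];
                    s.spawn(move || engine.encode(batch))
                })
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        })
    }
}
```

The same structure would work with `rayon` (e.g., a parallel iterator over batches) instead of `std::thread::scope`; the key point is that each engine owns its device and stream, and the host side only splits work and gathers results in order.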
qdp-python
- Expose a new API (e.g., `QdpEngine(device_ids=[0, 1, 2])` or `MultiGpuEngine`).
- Integrate with the Quantum Data Loader once PR #1000 ([QDP] Add a Quantum Data Loader and API refactor) is merged.
- Preserve backward compatibility when a single device is specified (see the binding sketch after this list).
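A minimal sketch of how the Python-facing constructor could accept either a single device id or a list of ids while staying backward compatible, assuming pyo3 is the binding layer; the `DeviceSpec` enum and this particular class layout are hypothetical, not the actual qdp-python code:

```rust
use pyo3::prelude::*;

/// Hypothetical argument type: accepts either `QdpEngine(0)` (today's call)
/// or `QdpEngine(device_ids=[0, 1, 2])`.
#[derive(FromPyObject)]
enum DeviceSpec {
    Single(usize),
    Many(Vec<usize>),
}

#[pyclass]
struct QdpEngine {
    device_ids: Vec<usize>,
}

#[pymethods]
impl QdpEngine {
    #[new]
    fn new(device_ids: DeviceSpec) -> Self {
        let device_ids = match device_ids {
            // A plain integer preserves the current single-GPU behaviour.
            DeviceSpec::Single(id) => vec![id],
            DeviceSpec::Many(ids) => ids,
        };
        Self { device_ids }
    }
}
```

From Python, `QdpEngine(0)` would keep working unchanged, while `QdpEngine(device_ids=[0, 1, 2])` would construct the multi-GPU pool.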
Non-Goals (out of scope)
- Multi-GPU model parallelism or tensor parallelism within a single encoding operation.
- Automatic GPU selection or load balancing in the first version (can be added later).