Problem Description
Hi,
As mentioned in #1786, I have noticed that large allocations are much slower on Windows than on Linux on a Strix Halo laptop configured with 96 GB of VRAM (out of 128 GB).
Test program:
#include <iostream>
#include <chrono>
#include <hip/hip_runtime.h>

int main() {
    // do a small allocation to start HIP
    void* init_ptr = nullptr;
    if (hipMalloc(&init_ptr, 1024) != hipSuccess) {
        std::cerr << "Failed to do initial hipMalloc" << std::endl;
        return -1;
    }
    if (hipFree(init_ptr) != hipSuccess) {
        std::cerr << "Failed to do initial hipFree" << std::endl;
        return -1;
    }

    // test if can allocate large amounts of gpu memory (72GB)
    size_t mem_size = (size_t)72 * 1024 * 1024 * 1024;
    void* test_ptr = nullptr;
    std::cout << "Testing if can allocate " << mem_size << " bytes of GPU memory..." << std::endl;

    // measure the time it takes to allocate the memory
    auto start = std::chrono::high_resolution_clock::now();
    hipError_t err = hipMalloc(&test_ptr, mem_size);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - start);

    if (err != hipSuccess) {
        std::cerr << "Failed to allocate " << mem_size << " bytes of GPU memory: " << hipGetErrorString(err) << std::endl;
        return -1;
    }
    std::cout << "Successfully allocated " << mem_size << " bytes of GPU memory in " << duration.count() << " microseconds" << std::endl;

    if (hipFree(test_ptr) != hipSuccess) {
        std::cerr << "Failed to free test memory" << std::endl;
        return -1;
    }
    return 0;
}
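The timing pattern in the program could also be factored into a small helper so that several allocation sizes can be measured the same way. The sketch below is only an illustration: `time_us` is a hypothetical name, not part of HIP, and it uses `std::chrono::steady_clock` because `high_resolution_clock` is not guaranteed to be monotonic on every standard library.

```cpp
#include <chrono>
#include <cstdint>
#include <utility>

// Hypothetical helper (not part of HIP): runs `fn` once and returns the
// elapsed wall-clock time in microseconds, measured with a monotonic clock.
template <typename Fn>
std::int64_t time_us(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}
```

In the test program this would be called as `time_us([&] { err = hipMalloc(&test_ptr, mem_size); })`, with `err` declared beforehand so the return code can still be checked.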
The 72 GB allocation takes ~15 s on Windows, compared with 700 microseconds on Linux.
I am using ROCm 7.2 on both Windows and Linux.
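To put the two timings in perspective, they can be converted into an effective rate in GiB per second. The helper below is a hypothetical sketch (the name `gib_per_second` is an assumption for illustration), and it only does arithmetic on the numbers reported above.

```cpp
#include <cstdint>

// Hypothetical helper: effective rate in GiB/s if an allocation of `bytes`
// takes `microseconds` to return.
double gib_per_second(std::uint64_t bytes, double microseconds) {
    const double gib = static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
    return gib / (microseconds / 1e6);
}
```

With `mem_size` from the test program, `gib_per_second(mem_size, 15e6)` gives about 4.8 GiB/s for Windows, while `gib_per_second(mem_size, 700.0)` gives roughly 1.0e5 GiB/s for Linux. The Linux figure is far higher than memory could actually be written, so one possible reading (a guess, not something I have verified) is that Linux returns without touching the pages while Windows does per-page work during `hipMalloc`.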
Thanks for any help.
Operating System
Windows
CPU
AMD RYZEN AI MAX+ PRO 395 w/ Radeon 8060S
GPU
Radeon 8060S
ROCm Version
7.2
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response