[Issue]: slow large allocations on windows / strix halo #3471

@Epliz

Problem Description

Hi,

As mentioned in #1786, large allocations on Windows are much slower than on Linux on a Strix Halo laptop configured with 96GB of VRAM (out of 128GB total memory).

Test program:

#include <iostream>
#include <chrono>

#include <hip/hip_runtime.h>

int main() {
    // do a small allocation to start HIP
    void* init_ptr = nullptr;
    if (hipMalloc(&init_ptr, 1024) != hipSuccess) {
        std::cerr << "Failed to do initial hipMalloc" << std::endl;
        return -1;
    }
    if (hipFree(init_ptr) != hipSuccess) {
        std::cerr << "Failed to do initial hipFree" << std::endl;
        return -1;
    }

    // test whether a large amount of GPU memory (72 GiB) can be allocated
    size_t mem_size = (size_t)72 * 1024 * 1024 * 1024;
    void* test_ptr = nullptr;
    std::cout << "Trying to allocate " << mem_size << " bytes of GPU memory..." << std::endl;

    // measure the time it takes to allocate the memory
    auto start = std::chrono::high_resolution_clock::now();
    hipError_t err = hipMalloc(&test_ptr, mem_size);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - start);
    if (err != hipSuccess) {
        std::cerr << "Failed to allocate " << mem_size << " bytes of GPU memory: " << hipGetErrorString(err) << std::endl;
        return -1;
    }
    std::cout << "Successfully allocated " << mem_size << " bytes of GPU memory in " << duration.count() << " microseconds" << std::endl;
    if (hipFree(test_ptr) != hipSuccess) {
        std::cerr << "Failed to free test memory" << std::endl;
        return -1;
    }
    return 0;
}

The 72 GiB allocation takes ~15 s on Windows, versus 700 microseconds on Linux.

I use ROCm 7.2 on both Windows and Linux.

Thanks for any help.

Operating System

Windows

CPU

AMD RYZEN AI MAX+ PRO 395 w/ Radeon 8060S

GPU

Radeon 8060S

ROCm Version

7.2

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response
