HIP device reports only 4096 bytes of global memory, but PyTorch tries to allocate 2097152 bytes. Why?

Hi, I’m designing a GPU in Verilog, VeriGPU (hughperkins/VeriGPU on GitHub: an open-source GPU in Verilog, loosely based on the RISC-V ISA), and I have started experimenting with running it, in simulation, from PyTorch via the HIP API.

I’ve therefore started implementing the HIP API (VeriGPU/hip_api.cpp at commit f4279efa308638e7218dd7865d987adaf6c2cc77 in hughperkins/VeriGPU on GitHub). When I run a simple PyTorch script like:

    import torch

    a = torch.rand(3)
    a.cuda()

The output with my API implementation so far looks like this:

    hipInit
    hipGetDeviceCount
    hipGetDeviceCount
    hipGetDeviceProperties
    hipGetDeviceCount
    hipGetDevice
    hipGetDevice
    hipGetDevice
    hipGetDevice
    hipGetLastError
    hipMalloc size=2097152
    hipSetDevice
    hipSetDevice
    Traceback (most recent call last):
      File "/home/ubuntu/git/verigpu/verigpu/test_hip.py", line 4, in <module>
        a.cuda()
    RuntimeError: Not enough free space

So PyTorch is trying to allocate 2 MiB of memory, even though in hipGetDeviceProperties I explicitly report that my device has only 4096 bytes of global memory (VeriGPU/hip_api.cpp at commit f4279efa308638e7218dd7865d987adaf6c2cc77 in hughperkins/VeriGPU on GitHub):

    prop->totalGlobalMem = 4096;

Why is PyTorch asking for 2 MiB (2097152 bytes) of memory when the device reports only 4096 bytes?
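To put the numbers side by side, here is a quick sanity check of the sizes involved. It is only arithmetic on the values from the log above, and it assumes torch.rand(3) produces the default float32 dtype (4 bytes per element), which would not hold if the default dtype had been changed.

```python
# Sizes taken from the post. The float32 element size is an assumption:
# torch.rand uses float32 unless the default dtype has been changed.
ELEMENT_SIZE_BYTES = 4
NUM_ELEMENTS = 3

tensor_bytes = NUM_ELEMENTS * ELEMENT_SIZE_BYTES  # payload of torch.rand(3)
device_bytes = 4096                               # prop->totalGlobalMem
requested_bytes = 2097152                         # size logged by hipMalloc

print(f"tensor payload:    {tensor_bytes} bytes")
print(f"device memory:     {device_bytes} bytes")
print(
    f"hipMalloc request: {requested_bytes} bytes "
    f"= {requested_bytes // (1024 * 1024)} MiB "
    f"= {requested_bytes // device_bytes}x the reported device memory"
)
```

So the tensor itself would fit comfortably in the 4096 bytes the device reports, yet the single hipMalloc request is far larger than both the tensor and the device.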