Does packed_accessor32 store its copy in shared memory or global memory of the GPU?

Matthieu_Lin · February 18, 2021, 1:32pm

Hi,

I was wondering: when calling a kernel and passing a packed_accessor32 as an argument, where is the tensor copied? To the global memory of the GPU, or is it dispatched to the shared memory of each SM?
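
To make the question concrete, here is a minimal plain C++ sketch (no CUDA or PyTorch needed) of what passing such an accessor by value involves. It assumes, as far as I understand the PyTorch headers, that a packed accessor is essentially a small struct holding a data pointer plus sizes and strides. `MiniAccessor2D`, `make_accessor` and `scale` are hypothetical names made up for illustration; they are not the PyTorch types or API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a 2-D packed accessor (NOT the PyTorch type):
// a raw pointer to the element buffer plus 32-bit sizes and strides.
struct MiniAccessor2D {
    float* data;
    int32_t sizes[2];
    int32_t strides[2];

    // Element access through the stored pointer, with a bounds check.
    float& at(int32_t r, int32_t c) const {
        assert(r >= 0 && r < sizes[0] && c >= 0 && c < sizes[1]);
        return data[r * strides[0] + c * strides[1]];
    }
};

// Build an accessor over a contiguous row-major buffer.
MiniAccessor2D make_accessor(std::vector<float>& buf, int32_t rows, int32_t cols) {
    assert(static_cast<std::size_t>(rows) * cols == buf.size());
    return MiniAccessor2D{buf.data(), {rows, cols}, {cols, 1}};
}

// Takes the accessor BY VALUE, the way a kernel receives its arguments.
// Only the small struct is copied; the elements are reached via the pointer.
void scale(MiniAccessor2D acc, float alpha) {
    for (int32_t r = 0; r < acc.sizes[0]; ++r) {
        for (int32_t c = 0; c < acc.sizes[1]; ++c) {
            acc.at(r, c) *= alpha;
        }
    }
}
```

In this sketch, calling `scale(make_accessor(buf, 2, 3), 2.0f)` on a six-element `buf` modifies `buf` itself, because the by-value copy carries only the pointer and the shape metadata. If the real accessor behaves the same way, then the tensor elements would stay wherever the tensor was allocated (global memory for a CUDA tensor), and nothing would reach shared memory unless the kernel declares a `__shared__` buffer and copies into it explicitly.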