Loading big input tensor to GPU on Android


In an Android program, I want to run a model optimized for Vulkan. The main issue I encounter is that my input tensor seems to exceed some limit: I am unable to load a tensor bigger than {1, 4, 2048, 2048}.
The 2048 seems to match the maxImageDimension3D limit of my Vulkan device, but I cannot find hard data on the 4-channel limit.

Here’s the loading code I have:

auto tensor_a = at::rand({1, channels, 2048, 2048}, at::device(at::kCPU).dtype(at::kFloat));
auto tensor_a_vulkan = tensor_a.vulkan();

If “channels” is anything higher than 4 (or the image size is bigger), it results in:

terminating with uncaught exception of type c10::Error: VkResult:-2
Exception raised from buffer at <prefix>pytorch/aten/src/ATen/native/vulkan/api/Resource.cpp:479 (most recent call first):
(no backtrace available)
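
For readers decoding the message: the -2 is a raw VkResult value, which the Vulkan specification defines as VK_ERROR_OUT_OF_DEVICE_MEMORY. The sketch below maps the core error codes to their names and computes the size of a dense float32 tensor for comparison. Both helpers (`vk_result_name`, `float_tensor_bytes`) are hypothetical names introduced here, not part of PyTorch or the Vulkan headers, and the size calculation says nothing about how the backend actually lays out the data on the device.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a PyTorch or Vulkan API): maps a raw VkResult
// integer to its name. Only a handful of core codes are listed.
inline std::string vk_result_name(int code) {
  switch (code) {
    case 0:  return "VK_SUCCESS";
    case -1: return "VK_ERROR_OUT_OF_HOST_MEMORY";
    case -2: return "VK_ERROR_OUT_OF_DEVICE_MEMORY";
    case -3: return "VK_ERROR_INITIALIZATION_FAILED";
    case -4: return "VK_ERROR_DEVICE_LOST";
    case -5: return "VK_ERROR_MEMORY_MAP_FAILED";
    default: return "unknown VkResult";
  }
}

// Hypothetical helper: size in bytes of a dense float32 tensor of shape
// {n, c, h, w}. This is the CPU-side size only; the on-device layout may differ.
inline std::uint64_t float_tensor_bytes(std::uint64_t n, std::uint64_t c,
                                        std::uint64_t h, std::uint64_t w) {
  return n * c * h * w * sizeof(float);
}
```

With these helpers, `vk_result_name(-2)` gives the out-of-device-memory name, `float_tensor_bytes(1, 4, 2048, 2048)` is 64 MiB and `float_tensor_bytes(1, 5, 2048, 2048)` is 80 MiB, so the failure appears somewhere between those two sizes for a single transfer on this device.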

There is still memory available on the device: I can safely allocate more tensors on the CPU and load them to Vulkan with no issues (as long as the individual dimensions are below the apparent limits).

Is this a known limitation? Is there any way to work around it?



This seems to be a limitation of the Vulkan API, and unfortunately there isn’t any workaround for it. Although there may technically be enough memory on the device to store the tensor data, whatever proprietary technique is used to represent textures on the device probably can’t handle textures of that size. A solution would involve using multiple textures to represent one tensor if the dimensions exceed the limit, which we don’t plan to implement at the moment.

Just curious, what GPU/device are you using?
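
To illustrate the "multiple textures per tensor" idea mentioned above, here is a minimal sketch of the bookkeeping it would need: splitting the channel dimension into ranges that each stay within a limit. The helper `plan_channel_chunks` is a hypothetical name, not something the Vulkan backend provides. On the caller's side, each range could in principle be passed to `tensor.narrow(1, start, length)` before calling `.vulkan()`, but that is an untested assumption and only helps if the downstream operators can work on each chunk independently.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: splits `channels` into consecutive (start, length)
// ranges, each at most `max_channels` long. Returns an empty plan for
// non-positive inputs.
inline std::vector<std::pair<std::int64_t, std::int64_t>> plan_channel_chunks(
    std::int64_t channels, std::int64_t max_channels) {
  std::vector<std::pair<std::int64_t, std::int64_t>> chunks;
  if (channels <= 0 || max_channels <= 0) {
    return chunks;
  }
  for (std::int64_t start = 0; start < channels; start += max_channels) {
    // The last chunk may be shorter than max_channels.
    chunks.emplace_back(start, std::min(max_channels, channels - start));
  }
  return chunks;
}
```

For example, `plan_channel_chunks(10, 4)` yields the ranges (0, 4), (4, 4) and (8, 2), so a 10-channel input would be uploaded as three separate tensors under the apparent 4-channel limit reported in this thread.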

Thanks for the reply. However, how does this work for tensors inside a model? Do only models whose tensors stay below 4 x maxImageDimension3D x maxImageDimension3D throughout all their layers work on Vulkan at the moment?

This was tested on a device with a Snapdragon 865 (Adreno 650 GPU).

Would you have any additional information regarding the internal layers of models loaded to Vulkan GPU with regard to these limits?