How to define the video inputs for pytorch with vulkan backend?

I have compiled PyTorch with the Vulkan backend for Android and successfully converted a video action recognition model, Video Swin Transformer, to .pt with the Vulkan backend.
When I define the input like:

FloatBuffer buffer = Tensor.allocateFloatBuffer(1 * 3 * 224 * 224);
Tensor inputTensor = Tensor.fromBlob(buffer, new long[]{1, 3, 224, 224}, MemoryFormat.CHANNELS_LAST);
assert module != null;
final Tensor outputTensor = module.forward(IValue.from(inputTensor)).toTensor();

I got this error:

com.facebook.jni.CppException: Dimension out of range (expected to be in range of [-4, 3], but got 4)

I also tried defining the input like this:

FloatBuffer buffer = Tensor.allocateFloatBuffer(1 * 3 * 10 * 224 * 224);
Tensor inputTensor = Tensor.fromBlob(buffer, new long[]{1, 3, 10, 224, 224}, MemoryFormat.CHANNELS_LAST);
assert module != null;
final Tensor outputTensor = module.forward(IValue.from(inputTensor)).toTensor();

And the error is:

Only Tensors with 1 <= dim <= 4 can be represented as a Vulkan Image!

I really don’t know what to do now. Could anyone help me out?

I am using PyTorch 1.13 with Vulkan SDK 1.3.204.1.

I have figured out the problem. The video input should be defined the same way as on PC, i.e. as a 5-D tensor.
However, the Vulkan backend stores input tensors as Vulkan Images, and a Vulkan Image is at most 3-D, so only tensors with 1 <= dim <= 4 can be represented. A video tensor of shape [batch, channels, frames, height, width] is therefore not supported on the Vulkan backend.
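For anyone hitting the same limit: one possible direction (a sketch only, and an assumption on my part, not the Video Swin Transformer's actual interface) is to fold the frames dimension into the batch dimension so the tensor handed to the backend stays 4-D. This only helps if the model itself is restructured and re-exported to accept the folded layout; a model whose internals use 5-D ops will still fail on Vulkan.

```python
import torch

# Sketch: fold a 5-D video tensor [B, C, T, H, W] into a 4-D tensor
# [B*T, C, H, W] so it fits a backend that only supports <= 4-D tensors.
# NOTE: this is only valid if the model is adapted to expect this layout.
video = torch.zeros(1, 3, 10, 224, 224)   # [B, C, T, H, W]
folded = video.permute(0, 2, 1, 3, 4)     # [B, T, C, H, W]
folded = folded.reshape(-1, 3, 224, 224)  # [B*T, C, H, W]
print(folded.shape)  # torch.Size([10, 3, 224, 224])
```

Otherwise, the practical fallback is to export a CPU (non-Vulkan) .pt of the model and run the 5-D input on the CPU backend on the device.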