Howdy! I am relatively new to the channels-last memory format. I’ve spent time reading around, and I don’t think my question has been answered yet.
I am doing some object detection: (a batch of) grayscale image(s) in, to a (batch of) tensor(s) of shape (prediction dim, H, W) out, where H and W specify the “prediction grid” shape, and prediction dim has bounding box and class information.
With results from a batch of images, I naturally have to do a lot of indexing along the prediction grid (e.g. I want to know the result at H = 12, W = 15). Right now, I index the result tensor like this: res[:, 12, 15]. However, I think this is probably slower than if I had the prediction dim last, like res[12, 15, :]. So in a sense, I want channels last.
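To make the two indexing forms concrete, here is a minimal sketch for a single result tensor. The sizes (7 prediction values on a 20x30 grid) and the variable names are made up for illustration, and I have not benchmarked either form; it only shows how far apart the selected elements sit in memory.

```python
import torch

# Hypothetical sizes: 7 prediction values on a 20x30 prediction grid.
C, H, W = 7, 20, 30
res = torch.arange(C * H * W, dtype=torch.float32).reshape(C, H, W)

# Prediction dim first: the 7 values of one grid cell are H*W elements apart.
cell = res[:, 12, 15]
cell_stride = cell.stride()            # (600,)

# Prediction dim last: permute, then copy into a contiguous (H, W, C) tensor.
res_last = res.permute(1, 2, 0).contiguous()
cell_last = res_last[12, 15, :]
cell_last_stride = cell_last.stride()  # (1,)

print(cell_stride, cell_last_stride)
print(torch.equal(cell, cell_last))    # same values either way
```

Both forms return the same values; only the memory spacing of the selected elements differs.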
I get the result tensor from a conv net. If setting memory_format to channels_last also changed the shape of the tensor, I could happily index the result in the “channels last” form shown above. However, channels_last only changes the strides; the shape stays the same.
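A small sketch of what I mean, assuming a 4D (N, C, H, W) output with made-up sizes: converting to torch.channels_last leaves the shape alone and only rearranges the strides.

```python
import torch

# Hypothetical batch: 2 results, 7 prediction values, 20x30 grid.
N, C, H, W = 2, 7, 20, 30
out = torch.zeros(N, C, H, W)

out_cl = out.to(memory_format=torch.channels_last)

shape_before = tuple(out.shape)     # (2, 7, 20, 30)
shape_after = tuple(out_cl.shape)   # still (2, 7, 20, 30)
stride_before = out.stride()        # (C*H*W, H*W, W, 1)
stride_after = out_cl.stride()      # (H*W*C, 1, W*C, C)

print(shape_before, shape_after)
print(stride_before, stride_after)
```

So the indexing syntax stays out_cl[n, :, 12, 15], even though the channel values for one grid cell are now adjacent in memory.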
Is it possible to manually set the shape as I desire (e.g. by permuting the input image into a channels-last shape) and override the designation of the memory format so that it counts as channels last? If I do a permute and then to(channels_last), the stride changes, and I don’t quite get what I want.
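Here is a minimal reproduction of the permute-then-convert behaviour, again with made-up sizes and names. As far as I can tell, torch.channels_last always treats dimension 1 as the channel dimension, so after the permute it is H, not the prediction dim, that ends up with stride 1.

```python
import torch

# Hypothetical batch: 2 results, 7 prediction values, 20x30 grid.
N, C, H, W = 2, 7, 20, 30
x = torch.zeros(N, C, H, W)

# Permute so the prediction dim is last: shape (N, H, W, C), no copy yet.
y = x.permute(0, 2, 3, 1)
y_stride = y.stride()

# channels_last now interprets dim 1 (which is H here) as the channel dim.
y_cl = y.to(memory_format=torch.channels_last)
y_cl_stride = y_cl.stride()

print(tuple(y.shape), y_stride)
print(tuple(y_cl.shape), y_cl_stride)
```

With these sizes the last (prediction) dimension ends up with stride 20 after the conversion rather than 1, which is the part I don’t want.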
Thanks!