How can I apply this convolution?

I have a 6-dim tensor, say [1,1,4012,6034,11,11]. How can I apply a convolution along the last two dims? (Excluding the unfold approach, which causes OOM.)

I assume this means you want to use a 2D convolution and treat the last two dims of your input as the spatial sizes?
If so, what would the other dimensions represent and how would you treat them?
Usually an nn.ConvXd layer expects an input of shape [batch_size, channels, *dims], where *dims corresponds to e.g. 2 spatial dimensions (height, width) or 3 volumetric dimensions (depth, height, width).
In your use case it seems the spatial dimensions are fixed while other (undefined) dimensions are added, so could you explain your use case a bit more?
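For reference, the expected nn.Conv2d input layout can be sketched as follows (the channel counts and spatial sizes here are purely illustrative):

```python
import torch

# nn.Conv2d expects [batch_size, channels, height, width]
conv = torch.nn.Conv2d(3, 8, kernel_size=3)
x = torch.randn(16, 3, 32, 32)
print(conv(x).shape)  # torch.Size([16, 8, 30, 30])
```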

Thank you for your reply. What I want to do is split a tensor into sliding windows and then compute a windowed sum within each of those sliding windows.

For example, I unfold a [1,1,4022,6044] image into [1,1,4012,6034,11,11]; then, within each 11x11 window, I want to sum over 3x3 windows. Although I could further unfold the last two dims, that causes an OOM.
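The unfolding described above can be sketched with torch.Tensor.unfold; the sizes here are shrunk from the question's [1,1,4022,6044] so the example runs quickly:

```python
import torch

# Small stand-in for the large image from the question
t = torch.randn(1, 1, 20, 30)

# Sliding 11x11 windows with stride 1 along the last two dims,
# mirroring the [1, 1, 4012, 6034, 11, 11] layout from the question
patches = t.unfold(2, 11, 1).unfold(3, 11, 1)
print(patches.shape)  # torch.Size([1, 1, 10, 20, 11, 11])
```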

Hi Wayne!

If I understand your use case correctly, you want to apply a 3x3 convolution
to each of the 11x11 patches spanned by your last two dimensions.

If this is right, you can use view() (or reshape()) to push your two large
dimensions into the “batch” dimension (keeping a singleton “channels”
dimension) and then apply an ordinary Conv2d:

>>> import torch
>>> torch.__version__
>>> _ = torch.manual_seed(2022)
>>> conv = torch.nn.Conv2d(1, 1, 3, padding=1)
>>> t = torch.randn(1, 1, 412, 634, 11, 11)   # smaller tensor for this example
>>> u = conv(t.view(261208, 1, 11, 11)).view(1, 1, 412, 634, 11, 11)
>>> u.shape
torch.Size([1, 1, 412, 634, 11, 11])

I’m pretty sure that for large tensors Conv2d is smart enough to use an
algorithm more memory-efficient than calling unfold().
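Since the stated goal is a plain 3x3 window sum rather than a learned convolution, the same batching trick also works with a fixed all-ones kernel and the functional F.conv2d (a sketch, again with reduced sizes):

```python
import torch
import torch.nn.functional as F

# Stand-in for the unfolded patches, with smaller leading dims
patches = torch.randn(1, 1, 412, 634, 11, 11)

# A 3x3 all-ones kernel turns the convolution into a sliding-window sum
kernel = torch.ones(1, 1, 3, 3)

# Fold the large dims into the batch dim, convolve, and restore the shape
flat = patches.view(-1, 1, 11, 11)            # [412 * 634, 1, 11, 11]
sums = F.conv2d(flat, kernel, padding=1)      # 3x3 window sum per patch
sums = sums.view(1, 1, 412, 634, 11, 11)
print(sums.shape)  # torch.Size([1, 1, 412, 634, 11, 11])
```

With padding=1 each output element is the sum of the 3x3 neighborhood around the corresponding input element (zero-padded at the patch borders).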


K. Frank

Excellent!! This should work in my case, thank you