"Convolve" a custom module/model on a tensor (forward a module on segments of array with striding)

mikolchon · December 19, 2019, 11:58am

Say I have an array 1,2,3,4,5,6,7,8,9,0. I wish to “convolve” a custom module over the array. In other words, I want to forward a custom module on segments of the array with a certain stride. For example, I may have a neural network and want to forward it on the above array, taking input segments of length 3 with stride 2: first 1,2,3, then 3,4,5, then 5,6,7, and so on. The final output would be the concatenation of the per-segment outputs.

I know one can easily implement this manually, but I specifically want to avoid the concatenation, as it creates a copy. I was hoping there is functionality in PyTorch where I can pass a custom module (instead of a linear filter) and perform the “convolution”, returning an already contiguous array.

ptrblck · December 19, 2019, 7:25pm

You could use tensor.unfold to create these patches, but you would still need to allocate the memory for the output.
If you want to return a contiguous tensor, it has to be created at some point, even in some internal PyTorch functions.
Let me know if I misunderstood your question.
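
As a rough illustration of the tensor.unfold suggestion, here is a minimal sketch using the array from the question with segments of length 3 and stride 2. The nn.Linear module is only a hypothetical stand-in for the custom module, which the thread does not specify. The patches are a strided view on the original storage, while the module output is still a newly allocated tensor.

```python
import torch
import torch.nn as nn

x = torch.tensor([1., 2., 3., 4., 5., 6., 7., 8., 9., 0.])

# Sliding windows of length 3 with stride 2 along dimension 0.
# The trailing 0 is dropped because no full window starts there.
patches = x.unfold(0, 3, 2)
print(patches)
# tensor([[1., 2., 3.],
#         [3., 4., 5.],
#         [5., 6., 7.],
#         [7., 8., 9.]])

# The patches share storage with x (a view, not a copy).
print(patches.data_ptr() == x.data_ptr())  # True
print(patches.stride())                    # (2, 1)

# Hypothetical stand-in for the custom module: one forward pass over
# all windows, treating the window index as the batch dimension.
model = nn.Linear(3, 2)
out = model(patches)  # shape (4, 2), freshly allocated
```

This only works directly when the module accepts a batch of segments; a module that expects a single segment would still need a loop.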

mikolchon · December 21, 2019, 8:12am

Thank you very much. I think this is what I was looking for. But I am a bit confused by the provided example:

>>> # Convolution is equivalent with Unfold + Matrix Multiplication + Fold (or view to output shape)
>>> inp = torch.randn(1, 3, 10, 12)
>>> w = torch.randn(2, 3, 4, 5)
>>> inp_unf = torch.nn.functional.unfold(inp, (4, 5))
>>> out_unf = inp_unf.transpose(1, 2).matmul(w.view(w.size(0), -1).t()).transpose(1, 2)
>>> out = torch.nn.functional.fold(out_unf, (7, 8), (1, 1))
>>> # or equivalently (and avoiding a copy),
>>> # out = out_unf.view(1, 2, 7, 8)
>>> (torch.nn.functional.conv2d(inp, w) - out).abs().max()
tensor(1.9073e-06)

In this example, I’m not sure I understand why torch.nn.functional.fold is equivalent to using view. In this example out_unf is not contiguous (out_unf.is_contiguous() is False) because of the last transpose. My questions are:

a) In what cases are they equivalent?
b) Shouldn’t .view(...) raise an error (RuntimeError: input is not contiguous), since out_unf is not contiguous?

ptrblck · December 21, 2019, 8:20pm

Generally, a dimension of the view cannot span across two separate contiguous subspaces of the original tensor.
.view will not necessarily raise an error on a non-contiguous tensor, as long as the desired shape does not violate this condition.
Also, note that out is not contiguous in this case either.
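
To make this concrete, here is a small sketch that rebuilds out_unf from the quoted documentation example. It checks three things: splitting the last dimension with .view succeeds even though out_unf is not contiguous, merging the two transposed dimensions with .view fails, and fold with a (1, 1) kernel gives the same values as the view, since each "block" is then a single output pixel. The variable names out_view, out_fold, and merge_failed are only illustrative.

```python
import torch
import torch.nn.functional as F

inp = torch.randn(1, 3, 10, 12)
w = torch.randn(2, 3, 4, 5)
inp_unf = F.unfold(inp, (4, 5))  # shape (1, 60, 56)
out_unf = inp_unf.transpose(1, 2).matmul(w.view(w.size(0), -1).t()).transpose(1, 2)

print(out_unf.shape)            # torch.Size([1, 2, 56])
print(out_unf.is_contiguous())  # False, because of the last transpose

# Splitting the last dimension (56 -> 7 x 8) stays inside one strided
# subspace, so .view works without a copy.
out_view = out_unf.view(1, 2, 7, 8)

# With a (1, 1) kernel, fold places each column at exactly one output
# position, so it yields the same values as the view.
out_fold = F.fold(out_unf, (7, 8), (1, 1))
print(torch.allclose(out_view, out_fold))  # True

# Merging the two transposed dimensions would span two subspaces,
# which .view cannot express with strides alone.
try:
    out_unf.view(1, 112)
    merge_failed = False
except RuntimeError:
    merge_failed = True
print(merge_failed)  # True
```

In the failing case, .reshape or .contiguous().view would work, at the cost of a copy.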

mikolchon · January 8, 2020, 11:21am

Thanks for the clarification.

Does unfold create a copy of the original array? If so, I don’t think I will be able to use it due to my limited memory. My aim is to evaluate some module/model on an input array in a sliding-window fashion. I could do this with a loop and then concatenate at the end. For example:

suboutputs = []
# +1 so that the last full window is included
for i in range(x_length - window_length + 1):
    suboutputs.append(model(x[i:i + window_length]))
output = torch.cat(suboutputs)  # creates a copy

Is there a way to avoid the concatenation?
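
One possible approach, sketched here under assumptions, is to allocate the output once and write each result into its slot, so no torch.cat is needed. The sketch assumes inference only (hence torch.no_grad), a fixed output size per window, and a hypothetical nn.Linear standing in for the model. It also shows a batched variant based on Tensor.unfold, which returns a view on x rather than a copy. This is unlike torch.nn.functional.unfold, which materializes the patches in a new tensor.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

window_length = 3
x = torch.arange(10, dtype=torch.float32)
model = nn.Linear(window_length, 2)  # hypothetical stand-in for the model

n_windows = x.size(0) - window_length + 1

with torch.no_grad():
    # Probe the output size once, then allocate the result a single time.
    out_dim = model(x[:window_length]).numel()
    output = torch.empty(n_windows, out_dim)

    # Each result is written directly into its row; no list, no torch.cat.
    for i in range(n_windows):
        output[i] = model(x[i:i + window_length])

    # Batched alternative: Tensor.unfold gives overlapping windows as a
    # view on x, and the model processes them in one forward pass.
    windows = x.unfold(0, window_length, 1)  # shape (n_windows, window_length)
    batched = model(windows)

print(windows.data_ptr() == x.data_ptr())  # True: no copy of the input
print(output.shape)                        # torch.Size([8, 2])
```

Either way the output tensor itself still has to be allocated once; what is avoided is the second copy made by concatenating the per-window results.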