I have implemented a function which takes a tensor of size (batch_size x width x height) as input, and returns a tensor of size (batch_size x 1 x 1).

What I would like to do is apply this “convolutionally” to a tensor - in other words, sliding a window across the input tensor, and applying the function to each of these windows in turn, producing an output which (with padding) is the same size as the input tensor.

Is there any reason that prevents you from using nn.Conv2d? I don’t know the exact purpose of your function, but at first glance I think nn.Conv2d can do the same thing for you.

@KaiyangZhou Unfortunately, nn.Conv2d doesn’t work in this case. In essence, nn.Conv2d runs each window of the input through a fully connected linear layer, but my function works quite differently to this.

Perhaps a better way of explaining what I’m trying to do is that I’m trying to find a better way of implementing the following pseudocode:

def convolutional_function(input):
    for x in range(width):
        for y in range(height):
            indexed_input = index window of input
            intermediate_output = myfunction(indexed_input)
    concatenate intermediate outputs
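For what it’s worth, one way to avoid the explicit Python loops is Tensor.unfold: pad the input, extract every window, flatten the window positions into the batch dimension, and call the function once on all windows. A rough sketch, where sliding_apply and the max reduction are my own placeholders standing in for your myfunction:

```python
import torch
import torch.nn.functional as F

def sliding_apply(x, fn, kernel_size=3):
    # x: (batch, width, height); fn maps (N, k, k) windows -> (N,) scalars
    # (if your function returns (N, 1, 1), squeeze it first)
    pad = kernel_size // 2
    # zero-pad the two spatial dims so the output matches the input size
    x = F.pad(x, (pad, pad, pad, pad))
    # extract all k x k windows: (batch, width, height, k, k)
    windows = x.unfold(1, kernel_size, 1).unfold(2, kernel_size, 1)
    b, w, h = windows.shape[:3]
    # flatten window positions into the batch dimension and apply fn once
    flat = windows.reshape(-1, kernel_size, kernel_size)
    return fn(flat).reshape(b, w, h)

# example: max over each window (purely illustrative)
x = torch.randn(2, 8, 8)
y = sliding_apply(x, lambda w: w.amax(dim=(1, 2)), kernel_size=3)
print(y.shape)  # torch.Size([2, 8, 8])
```

Note this materialises all windows at once, so memory grows by a factor of kernel_size², but the whole thing runs as a single batched call instead of width × height separate ones.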

Thanks very much for your help. In answer to your questions:

It’s “something like x:x+width etc.” - it’s indexing a portion of the input. In fact, I think it would be more akin to x-0.5*kernel_size:x+0.5*kernel_size

Yes (or a tensor of size (number of batches x 1 x 1))

The last line isn’t really correct. My point was just that the scalar outputs need to be joined together into a 2D tensor.

BTW: the pseudocode does not account for padding, but padding would be necessary to ensure the input and output sizes of the function are equal.

OK, so as far as I understand, the operation is equivalent to a convolution with respect to kernel size, stride, etc., but instead of the dot product you would like to perform some other operation.