Change batch size (during forward pass)?

Hi all,

First of all, thanks to all of you at PyTorch.
It’s a great tool with outstanding documentation!

I came across this problem while working on different “feedings” in seq2seq models (I like the “Professor Forcing” approach (paper), but would like to skip the expensive GAN).
I have an input tensor of size (T, B, N) and need to output a tensor of size (T, 3*B, N), which is the concatenation of 3 different forward-pass methods (different encoder-decoder pairs with different feeding mechanisms in the decoder part, very similar to the seq2seq tutorial), something like:

def forward(self, input):
    out1 = self.first_method(input) # (T, B, N)
    out2 = self.second_method(input) # (T, B, N)
    out3 = self.third_method(input) # (T, B, N)
    out = torch.cat((out1, out2, out3), dim=1) # (T, 3*B, N)
    return out

My target tensor (used to calculate the loss) also has size (T, 3*B, N), and I use no BatchNorm or any other batch-dependent layer.
What is the expected behaviour of autograd in this case?
I’m not entirely sure this can work, as it would need to backprop the loss from an output with batch size 3B into an input with batch size B … if that is a problem, is there any recommended trick?

Thanks in advance!

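The best approach, as usual, is to just try it in practice and see whether it works :slight_smile:
But as I understand the mechanics, the loss calculation gives you a scalar value, and that scalar is what gets backpropagated. The concatenation only routes each B-sized slice of the gradient back to the branch that produced it, and the input accumulates the gradients from all three branches. So I suppose it shouldn’t care about the difference between input and output batch sizes.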
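
To check this in practice, here is a minimal sketch. It is not the original model: the three `nn.Linear` layers are only stand-ins for the three encoder-decoder pairs, and the sizes `T, B, N = 5, 4, 8`, the class name `ThreeWay` and the MSE loss are all assumptions picked for illustration. It only verifies that the output has batch size 3B while the gradient that arrives at the input has batch size B.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
T, B, N = 5, 4, 8  # hypothetical sizes, chosen only for this check


class ThreeWay(nn.Module):
    def __init__(self, n):
        super().__init__()
        # Assumption: simple linear layers stand in for the three
        # encoder-decoder pairs described in the question.
        self.first_method = nn.Linear(n, n)
        self.second_method = nn.Linear(n, n)
        self.third_method = nn.Linear(n, n)

    def forward(self, input):
        out1 = self.first_method(input)   # (T, B, N)
        out2 = self.second_method(input)  # (T, B, N)
        out3 = self.third_method(input)   # (T, B, N)
        # Stack the three results along the batch dimension.
        return torch.cat((out1, out2, out3), dim=1)  # (T, 3*B, N)


model = ThreeWay(N)
x = torch.randn(T, B, N, requires_grad=True)
target = torch.randn(T, 3 * B, N)

out = model(x)
loss = nn.functional.mse_loss(out, target)  # scalar loss
loss.backward()

print(out.shape)     # torch.Size([5, 12, 8])
print(x.grad.shape)  # torch.Size([5, 4, 8])
```

In this sketch every branch also gets a gradient on its own parameters, so all three methods would be trained by the single backward call.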