I am new to PyTorch, and I want to know whether the two methods below are equivalent.
batch size == 2: the loss is calculated by the model, then loss.backward() is called.
batch size == 1: the loss is calculated over every two steps, then loss.backward() is called.
Thank you very much if you can do me a favor and explain this to me.
The network usually accepts any number of elements in the batch, and the loss functions average over the batch size by default.
So if you use a batch size of 2 and call backward once, or if you run a batch size of 1 twice and call backward on each of them, the gradients will be off by a factor of 2: the first one takes the average over the two samples, while the second one accumulates their sum.
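To make the factor of 2 concrete, here is a minimal sketch. The tiny nn.Linear model, the random data, and the variable names are assumptions made up for illustration, and it assumes a loss with the default reduction="mean" and a model without batch-dependent layers such as batch normalization. It compares the gradient from one batch of 2 against the gradient accumulated over two batches of 1, and shows that dividing each single-sample loss by 2 should bring them back in line.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model and data, only for illustration.
model = nn.Linear(3, 1)
loss_fn = nn.MSELoss()  # default reduction="mean" averages over the batch

x = torch.randn(2, 3)
y = torch.randn(2, 1)

# Case 1: one batch of size 2, one backward call.
model.zero_grad()
loss_fn(model(x), y).backward()
grad_batch = model.weight.grad.clone()

# Case 2: two batches of size 1, backward on each.
# Gradients accumulate in .grad because it is not zeroed in between.
model.zero_grad()
for i in range(2):
    loss_fn(model(x[i:i + 1]), y[i:i + 1]).backward()
grad_accum = model.weight.grad.clone()

# Case 3: same as case 2, but each loss is divided by the number of steps.
model.zero_grad()
for i in range(2):
    (loss_fn(model(x[i:i + 1]), y[i:i + 1]) / 2).backward()
grad_scaled = model.weight.grad.clone()

print(torch.allclose(grad_accum, 2 * grad_batch, atol=1e-6))
print(torch.allclose(grad_scaled, grad_batch, atol=1e-6))
```

If you want the two approaches to match, scaling each per-step loss by 1 / (number of accumulation steps) as in case 3 is one common way to do it.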
Thank you very much! That helped me a lot. Now I can solve the problem without any worries.