Hi,
.backward() is not aware of the concept of “samples”, so it just computes the gradients of whatever it is given.
It just happens that, in general, what we give it is the mean of the per-sample losses. Since the gradient of a mean is the mean of the gradients, what it computes is the mean of the per-sample gradients.
If your loss contains only the mean of the losses for a given set of samples, then you will get the average gradient from these samples.
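
If it helps, here is a small sketch that checks this numerically. The toy linear model, the squared-error loss, and all variable names (w, x, y) are made up for illustration and are not from the question. It compares a single backward() on the mean loss against the average of gradients computed one sample at a time.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy setup: 3 weights and a batch of 4 samples.
w = torch.randn(3, requires_grad=True)
x = torch.randn(4, 3)
y = torch.randn(4)

# One backward() call on the mean of the per-sample losses.
per_sample_loss = (x @ w - y) ** 2
per_sample_loss.mean().backward()
grad_of_mean = w.grad.clone()

# Gradient of each sample's loss, computed separately, then averaged.
per_sample_grads = []
for i in range(x.shape[0]):
    w.grad = None  # clear the gradient accumulated by the previous backward()
    loss_i = (x[i] @ w - y[i]) ** 2
    loss_i.backward()
    per_sample_grads.append(w.grad.clone())
mean_of_grads = torch.stack(per_sample_grads).mean(dim=0)

# The two should agree up to floating-point error.
same = torch.allclose(grad_of_mean, mean_of_grads, atol=1e-5)
print(same)
```

If you used .sum() instead of .mean() on the loss, you would get the sum of the per-sample gradients rather than their average.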