Should I do loss.backward() or loss.mean().backward()?

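Just FYI, we can also use .mean().backward() whenever we compute the loss inside our nn.Module.
It is usually used when we have multiple GPUs (e.g. with nn.DataParallel) and don't want one of the GPUs to be unbalanced in terms of memory: each GPU returns its own loss value, and .mean() reduces them to the single scalar that backward() needs.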
More can be found here
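
To make the point above concrete, here is a minimal sketch that runs on CPU. It assumes a hypothetical `ModelWithLoss` wrapper (not from the original post) and imitates the multi-GPU case by splitting the batch into two chunks and stacking the per-chunk losses, which is roughly what the gather step of nn.DataParallel does with scalar outputs. Treat it as an illustration, not the exact code path PyTorch runs.

```python
import torch
import torch.nn as nn


class ModelWithLoss(nn.Module):
    """Hypothetical wrapper that computes the loss inside forward()."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 1)
        self.criterion = nn.MSELoss()

    def forward(self, x, target):
        # Returns a 0-dim (scalar) loss for the chunk it receives.
        return self.criterion(self.linear(x), target)


torch.manual_seed(0)
model = ModelWithLoss()
x, target = torch.randn(8, 4), torch.randn(8, 1)

# Stand-in for multi-GPU execution: one chunk per imaginary GPU,
# then stack the per-chunk losses into a vector.
per_replica_loss = torch.stack(
    [model(xc, tc) for xc, tc in zip(x.chunk(2), target.chunk(2))]
)
print(per_replica_loss.shape)  # torch.Size([2])

# backward() without arguments only works on a scalar, so this raises.
try:
    per_replica_loss.backward()
    raised = False
except RuntimeError:
    raised = True

# Reducing to a scalar first makes the call valid.
loss = per_replica_loss.mean()
loss.backward()
```

If the loss is already a scalar (single GPU, loss computed outside the model), calling .mean() on it changes nothing, so loss.mean().backward() is safe in both cases.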
