Difference between torch.sum(x, args) and x.sum() when x is a tensor

It seems that torch.sum(x, dim=..., keepdim=True) provides more control than x.sum() when x is a tensor.

However, which form is more appropriate in custom loss functions that are used in backpropagation?

And what about speed?


As of recent versions, they accept exactly the same arguments (dim, keepdim, dtype) and behave identically with respect to autograd: both dispatch to the same underlying operation, so there is no difference in the computed gradients and no meaningful difference in speed.
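A quick sketch to check the equivalence yourself, both for the forward values and the gradients (tensor shapes here are arbitrary):

```python
import torch

x = torch.randn(3, 4, requires_grad=True)

# Both forms accept the same keyword arguments.
a = torch.sum(x, dim=1, keepdim=True)
b = x.sum(dim=1, keepdim=True)
assert torch.equal(a, b)  # identical forward results

# Gradients match too: both route through the same autograd op.
a.sum().backward()
g_func = x.grad.clone()
x.grad = None
b.sum().backward()
g_method = x.grad
assert torch.equal(g_func, g_method)
```

So for a custom loss function, the choice between the two is purely stylistic.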