In many blog posts and discussion threads it is said that, by default, F.backward() is equivalent to F.backward(gradient=torch.Tensor([1.])). But looking at the implementation in tensor.py and at the autograd.backward() implementation, the default value for the external gradient is None, i.e.
These blog posts should really say torch.ones_like(loss) (which only works for a scalar-valued loss), even if they ignore the memory layout to keep things simple. torch.autograd.backward calls an auxiliary function, _make_grads, that creates these gradients when they are not passed in.
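A quick sketch of the claim, assuming a scalar loss: calling backward() with no argument gives the same gradients as passing gradient=torch.ones_like(loss) explicitly.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# Default: no gradient argument for a scalar loss.
loss = (x ** 2).sum()
loss.backward()
g_default = x.grad.clone()

# Explicit: pass a ones-shaped gradient (scalar here).
x.grad = None
loss = (x ** 2).sum()
loss.backward(gradient=torch.ones_like(loss))
g_explicit = x.grad

assert torch.equal(g_default, g_explicit)  # same result
```

For a non-scalar output, backward() with no argument raises an error instead, which is why the equivalence only holds for scalars.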
Having None as the formal default argument and then computing the "effective default" inside the function is a common pattern in Python, applicable well beyond PyTorch. Sometimes, as here, it is because the default isn't fixed (it depends on the device and memory layout); other times it is because the default is mutable, so you want to instantiate a new object every time the function is called (blog posts on "mutable default arguments" seem to be very common, too).
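A minimal pure-Python illustration of that pattern (hypothetical functions, not from PyTorch): the None default gets replaced by a fresh "effective default" on every call, which sidesteps the classic mutable-default pitfall.

```python
def append_item(item, items=None):
    # Effective default computed per call: a brand-new list.
    if items is None:
        items = []
    items.append(item)
    return items

# Each call with the default gets its own list.
assert append_item(1) == [1]
assert append_item(2) == [2]

def buggy_append(item, items=[]):
    # The default list is created once, at function definition,
    # and shared across every call that relies on it.
    items.append(item)
    return items

assert buggy_append(1) == [1]
assert buggy_append(2) == [1, 2]  # state leaks between calls
```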
I'm guessing "equivalent" here means "gives you the same result as", which is a bit weaker than "equal", which is more like "does literally the same thing".