It has to do with how the computation graph gets built. After the mean operation, the resulting tensor is basically just “remembering” the function that created it (that's what grad_fn holds), so autograd has a complete history of the computation to backpropagate through. You’re not gonna want to change grad_fn.
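Here's a small sketch of what that looks like in practice. The tensor names and shapes are made up for illustration, and the exact name printed for the grad_fn node can vary between PyTorch versions:

```python
import torch

# A leaf tensor created directly by the user (hypothetical example data).
x = torch.ones(2, 2, requires_grad=True)

# The result of an operation: autograd records which function produced it.
y = x.mean()

print(x.grad_fn)  # None, because x was not produced by an operation
print(y.grad_fn)  # a backward node for mean (exact name depends on version)

# backward() walks that recorded history to fill in x.grad.
y.backward()
print(x.grad)  # every entry is 0.25, since y is the mean of four elements
```

So grad_fn is just the link autograd follows when you call backward().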
It’s not a bad thing at all. It’s part of how gradients get computed, so you don’t really want to be messing with it. Think of it more like this: each operation gives you a new tensor, and that new tensor carries its own grad_fn.
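To see the "new tensor each time" part, here's a rough sketch (the variable names are just for illustration, and the node names printed may differ across PyTorch versions):

```python
import torch

x = torch.ones(3, requires_grad=True)

a = x * 2      # new tensor, produced by a multiplication
b = a.mean()   # another new tensor, produced by mean

# Each operation returned a different tensor object.
print(a is x, b is a)

# Each result remembers the function that created it.
print(a.grad_fn.name())
print(b.grad_fn.name())
```

Nothing gets overwritten here: x, a, and b are three separate tensors, and only a and b have a grad_fn because only they came out of an operation.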