Both are leaf tensors as seen by print(a.is_leaf).
The difference lies in how they are created. The first tensor is created directly on the GPU with the desired flags; the second one is sampled on the CPU, pushed to the GPU, and then its requires_grad attribute is changed in-place.
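A minimal sketch of the two creation paths (using a dtype conversion instead of a device move so it also runs without a GPU; with CUDA available you would pass `device='cuda'` and `.to('cuda')` instead):

```python
import torch

# Path 1: created directly with the desired flag -- a leaf tensor.
a = torch.randn(2, 3, requires_grad=True)

# Path 2: created first, then requires_grad set in-place afterwards
# -- still a leaf, since no differentiable op produced it.
b = torch.randn(2, 3)
b.requires_grad_(True)

print(a.is_leaf, b.is_leaf)  # True True

# By contrast, applying a differentiable op (here .double()) to a tensor
# that already requires grad records it in the graph, so the result is
# NOT a leaf anymore.
c = torch.randn(2, 3, requires_grad=True).double()
print(c.is_leaf)  # False
```

The key point is the order of operations: setting `requires_grad` in-place after the move keeps the tensor a leaf, while moving or transforming a tensor that already requires grad creates a non-leaf result.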

I don’t understand the question, as both tensors are equivalent apart from the chain of operations used to create them.

Thank you very much for your reply.
To explain my last question in more detail: I feed an image into the network, and after the forward pass the loss function is calculated. My question is whether the tensors used in that calculation correspond to the case described above, i.e. whether they are leaf tensors.

No, the model output will not be a leaf tensor, as it’s computed from the input as well as the parameters and will thus have a history (i.e. it’s attached to a computation graph).
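A small sketch to illustrate this (the model and shapes are arbitrary examples):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
x = torch.randn(8, 4)        # input: a leaf (requires_grad=False by default)

out = model(x)               # computed from x and the parameters
loss = out.pow(2).mean()

print(x.is_leaf)             # True:  created directly, not from an op
print(model.weight.is_leaf)  # True:  parameters are leaves
print(out.is_leaf)           # False: has a grad_fn / history
print(loss.is_leaf)          # False
print(out.grad_fn is not None)  # True: attached to the computation graph
```

Only the leaves (the parameters, and the input if it requires grad) will have their `.grad` populated after `loss.backward()`.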