What is different?

doojin · July 8, 2022, 4:52am

Hi,

What is the difference between the two cases below? Is it whether the tensor is a leaf tensor or not? Or is there some other difference?

I don’t know what the difference is, but I think gradients can be computed for both of them. (If not, let me know.)

case1)

a = torch.randn(3,4,4, device='cuda', requires_grad=True)

case2)

a = torch.randn(3,4,4).cuda().requires_grad_(True)

Actually, I’m more curious about what state a tensor is in after it has gone through network training.

Thank you for your time.

ptrblck · July 8, 2022, 5:00am

Both are leaf tensors, as you can see via print(a.is_leaf).
The difference is in how they are created. The first tensor is created directly on the GPU with the desired flags; the second is sampled on the CPU, pushed to the GPU, and then has its requires_grad attribute changed in-place.
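
A minimal sketch to check this for both cases. The CPU fallback and the names a1 and a2 are assumptions added so the snippet runs on machines without a GPU; they are not part of the original question.

```python
import torch

# Assumption: fall back to the CPU when no GPU is available so the sketch still runs.
device = "cuda" if torch.cuda.is_available() else "cpu"

# case 1: created directly on the target device with requires_grad=True
a1 = torch.randn(3, 4, 4, device=device, requires_grad=True)

# case 2: sampled on the CPU, moved to the device, then flagged in-place
a2 = torch.randn(3, 4, 4).to(device).requires_grad_(True)

# Both should report is_leaf=True, requires_grad=True, and grad_fn=None.
for name, t in (("case1", a1), ("case2", a2)):
    print(name, t.is_leaf, t.requires_grad, t.grad_fn)

# Gradients can be computed for both, since both are leaves that require grad.
(a1.sum() + a2.sum()).backward()
print(a1.grad.shape, a2.grad.shape)
```

In case 2 the move to the device happens before requires_grad_ is called, so it is not recorded by autograd and the result is still a leaf.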

I don’t understand the question, as both tensors are equivalent apart from the chain of operations used to create them.

doojin · July 8, 2022, 5:06am

Thank you very much for your reply.
To explain my last question in more detail:
When I feed an image into the network, the input passes through the network and then the loss function is calculated. My question is whether the tensor used in that calculation corresponds to either of the cases above.

ptrblck · July 8, 2022, 5:32am

No, the model output will not be a leaf tensor, as it’s computed from the input as well as the parameters and will thus have a history (i.e. it’s attached to a computation graph).
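
A small sketch of this. The nn.Linear model, the random input standing in for an image batch, and the MSE loss are all hypothetical placeholders, not the setup from the question.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: a tiny linear model and random data instead of real images.
model = nn.Linear(16, 2)
x = torch.randn(8, 16)        # input: a leaf that does not require grad
target = torch.randn(8, 2)

output = model(x)                              # computed from input and parameters
loss = nn.functional.mse_loss(output, target)  # computed from the output

print("input  ", x.is_leaf, x.requires_grad)
print("weight ", model.weight.is_leaf, model.weight.requires_grad)
print("output ", output.is_leaf, output.grad_fn)
print("loss   ", loss.is_leaf, loss.grad_fn)

# backward() walks the recorded history and fills .grad on the leaf parameters.
loss.backward()
print(model.weight.grad.shape)
```

The input and the parameters are leaves, while the output and the loss are non-leaf tensors with a grad_fn. By default, backward() populates .grad only on leaf tensors that require gradients, here the model parameters.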

doojin · July 8, 2022, 6:52am

I really want to thank you for your help!

ptrblck · July 8, 2022, 6:53am

Sure, happy to help. Let me know if you have more questions or my answers weren’t detailed enough (or didn’t fit your use case).