Thank you for your reply. If I have to build a tensor in the forward function, how can I avoid the conflict between CPU and GPU tensors after calling model.cuda()?
In this thread, apaszke recommends using nn.Parameter. However, in the discussion above, you said:

> creates a new parameter (which won’t be optimized, as it’s depending on the input x, is recreated in each iteration, and is thus unknown to the optimizer), which will also detach the operation from the computation graph
For me, as a beginner in PyTorch, this is somewhat confusing. I can understand your comments on the effects of nn.Parameter in the forward function (the optimizer cannot work on it), but then what is the right way to build a tensor in the forward function when that tensor depends on both the weight parameters and the output at the same time?
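For concreteness, here is a minimal sketch of what I think might work, creating the tensor via `device=x.device` so it follows the input (the module and names are just an illustration, not my actual model):

```python
import torch
import torch.nn as nn

class MyModel(nn.Module):  # hypothetical example module
    def __init__(self, in_features=4, out_features=3):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x):
        # Build the helper tensor on the same device and dtype as the input,
        # so model.cuda() (or .to(device)) works without changing this code.
        scale = torch.arange(x.size(0), device=x.device, dtype=x.dtype).unsqueeze(1)
        # Alternatively, x.new_ones(...) also inherits device and dtype from x.
        return self.linear(x) * scale

model = MyModel()
out = model(torch.randn(2, 4))
print(out.shape)  # torch.Size([2, 3])
```

Would this be the recommended pattern, or should the tensor be registered differently?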
Thank you!