I am currently doing constrained optimization on an RNN whose weights are constrained to be non-negative in every entry. That is, at the end of each iteration, I need to call
self.weight = ReLU(self.weight)
but this raises an error:
TypeError: cannot assign 'torch.cuda.FloatTensor' as parameter 'weight' (torch.nn.Parameter or None expected)
There are two options now. The first is
self.weight = nn.Parameter(ReLU(self.weight))
which would make self.weight.grad None. The second is
self.weight.data = ReLU(self.weight)
which would make autograd ignore the gradient of the ReLU. Is there any way to make sure the last ReLU call in each iteration has its gradient passed through?
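For concreteness, here is a minimal sketch of the two options. It assumes a toy nn.Linear layer in place of the RNN and F.relu in place of ReLU; make_layer and the weight values are made up for illustration and are not my actual model. The last part shows what I believe is an equivalent way to write the second option with torch.no_grad(), which is an assumption on my side and not a fix for the gradient question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_layer():
    # Hypothetical stand-in for the RNN: one weight matrix with mixed signs.
    layer = nn.Linear(3, 2, bias=False)
    with torch.no_grad():
        layer.weight.copy_(torch.tensor([[-1.0, 2.0, -3.0], [4.0, -5.0, 6.0]]))
    # One backward pass so that weight.grad is populated.
    layer(torch.ones(1, 3)).sum().backward()
    return layer


# Option 1: wrap the projected values in a new Parameter.
# The new Parameter is a fresh leaf tensor, so its .grad starts out as None.
# (detach() only makes explicit that no graph is carried over.)
layer1 = make_layer()
layer1.weight = nn.Parameter(F.relu(layer1.weight).detach())
grad_after_option1 = layer1.weight.grad

# Option 2: overwrite the values through .data.
# The Parameter object and its .grad survive, but autograd never sees the ReLU.
layer2 = make_layer()
param_before = layer2.weight
layer2.weight.data = F.relu(layer2.weight.detach())
grad_after_option2 = layer2.weight.grad

# Variant of option 2 without .data: clamp in place with autograd disabled.
layer3 = make_layer()
with torch.no_grad():
    layer3.weight.clamp_(min=0)
```

In this sketch grad_after_option1 is None while grad_after_option2 still holds the gradient from the earlier backward pass, and in neither case is the ReLU part of the autograd graph.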
By the way, computing
w = ReLU(self.weight)
and using w for the subsequent computation does not work, because it takes the gradient of ReLU at the end of the backward pass rather than at the beginning, which could be drastically different if I do some non-multiplicative operations in between.
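To illustrate what I mean by the gradient of ReLU being applied at the end of the backward pass, here is a small sketch with a made-up two-entry weight and input (the values and the names weight, x, and w are only for illustration). The raw weight stays unconstrained, and the ReLU mask is the last factor applied when the gradient reaches it.

```python
import torch
import torch.nn.functional as F

# Hypothetical raw weight with one negative and one positive entry.
weight = torch.tensor([-1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0])

# Use the rectified copy in the forward pass instead of overwriting the weight.
w = F.relu(weight)
loss = (w * x).sum()
loss.backward()

# Backward reaches the ReLU last: d(loss)/d(weight) = x * (weight > 0),
# so the negative entry gets zero gradient.
grad = weight.grad  # tensor([0., 4.])
```

Note that weight itself still contains the negative entry after this step, so the constraint only holds for w, not for the stored parameter.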