Calculating gradients w.r.t. another defined parameter


I have a model whose weights are a.W1 + b.W2, where W1 and W2 are, let's say, two randomly generated weight tensors. I want to freeze W1 and W2 and update only a and b during training. How can I implement this in PyTorch?

Thanks a lot,

Hi Mahdi!

Let me assume that W1, W2, a, and b are some sort of pytorch
tensors (and that your “.” represents some sort of multiplication).

Simply set

W1.requires_grad = False
W2.requires_grad = False
a.requires_grad = True
b.requires_grad = True

When you backpropagate, pytorch will only calculate gradients
for a and b, treating W1 and W2 as fixed (not trained) parameters.
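
Here is a minimal sketch of how that might look in a training loop. The shapes, the input batch x, the targets, the SGD optimizer, and the MSE loss are all assumptions for illustration; only W1, W2, a, and b come from the question. Note that only a and b are handed to the optimizer, so W1 and W2 are never updated.

```python
import torch

torch.manual_seed(0)

# Frozen, randomly generated weights. Tensors created with torch.randn
# have requires_grad=False by default, so no gradients are tracked for them.
W1 = torch.randn(3, 4)
W2 = torch.randn(3, 4)

# Trainable scalar coefficients (leaf tensors with requires_grad=True).
a = torch.ones(1, requires_grad=True)
b = torch.ones(1, requires_grad=True)

# Hypothetical data, for illustration only.
x = torch.randn(8, 4)
target = torch.randn(8, 3)

# Only a and b are passed to the optimizer.
optimizer = torch.optim.SGD([a, b], lr=0.1)

# Snapshots used to check what changed after training.
W1_before = W1.clone()
a_before = a.detach().clone()

for _ in range(5):
    optimizer.zero_grad()
    W = a * W1 + b * W2          # rebuild the effective weight each step
    pred = x @ W.t()             # simple linear model, shape (8, 3)
    loss = torch.nn.functional.mse_loss(pred, target)
    loss.backward()              # fills a.grad and b.grad only
    optimizer.step()
```

The effective weight W has to be recomputed from a and b inside the loop on every iteration, otherwise the computation graph would not connect the loss back to a and b.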


K. Frank