I have a model whose weights are a·W1 + b·W2, where W1 and W2 are, let’s say, two randomly generated weights. I want to freeze W1 and W2 and only update a and b during training. I’m wondering how I can implement this in PyTorch?
Thanks a lot,
Let me assume that a, b, W1, and W2 are some sort of pytorch tensors (and that your “.” represents some sort of multiplication). Then you can set:
W1.requires_grad = False
W2.requires_grad = False
a.requires_grad = True
b.requires_grad = True
When you backpropagate, pytorch will only calculate gradients for a and b, treating W1 and W2 as fixed (not trained) parameters.
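Here is a minimal, self-contained sketch of the whole setup. The shapes, learning rate, and dummy data below are placeholders I've made up for illustration; only a and b are handed to the optimizer, so only they get updated:

```python
import torch

# Frozen random weights (requires_grad is False by default for plain tensors;
# set explicitly here for clarity).
W1 = torch.randn(4, 4)
W2 = torch.randn(4, 4)
W1.requires_grad = False
W2.requires_grad = False

# Trainable scalar coefficients.
a = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

# Only a and b are given to the optimizer.
opt = torch.optim.SGD([a, b], lr=0.1)

# Dummy data just to drive the example.
x = torch.randn(8, 4)
target = torch.randn(8, 4)

for _ in range(5):
    W = a * W1 + b * W2              # effective weight
    loss = ((x @ W - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()                  # gradients flow only to a and b
    opt.step()

print(W1.grad, W2.grad)              # None None -- frozen, no grads accumulated
print(a.grad is not None)            # True -- a received a gradient
```

Note that if your model wraps W1 and W2 in `torch.nn.Parameter`, you would instead call `W1.requires_grad_(False)` (or set `requires_grad=False` at construction), but the principle is the same: pass only a and b to the optimizer.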