# Model parameters as sum of variable and constant weight

Hello everyone,
I am trying to train a model whose parameters are a sum of two sets of weights, and I want to train only one of the sets. That is, every weight is w = w_1 + w_2, where w_2 is a constant and w_1 is to be optimized. Thus, in the forward step w is used to compute the loss, but in the backward step only w_1 is adjusted. Is this possible?

It is possible.

Just call `w1.requires_grad_(True)` and `w2.requires_grad_(False)` (or, equivalently, pass `requires_grad` when constructing the parameters).

Look at the example below.

```python
import torch

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # w1 is trainable; w2 is frozen
        self.w1 = torch.nn.Parameter(torch.tensor(1.), requires_grad=True)
        self.w2 = torch.nn.Parameter(torch.tensor(2.), requires_grad=False)

    def forward(self, x):
        w = self.w1 + self.w2
        return w * x

def mae(y, y_pred):
    """mean absolute error"""
    return (y - y_pred).abs().mean()

# data
x = torch.tensor([1., 3., 3.])
y = 2 * x
# model
model = Model()
```
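A complete training step might then look like the sketch below (the choice of plain SGD and `lr=0.1` is mine, not from the original post). It repeats the model definition so it runs on its own, and checks afterwards that `w2` never moved:

```python
import torch

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # w1 is trainable; w2 is frozen
        self.w1 = torch.nn.Parameter(torch.tensor(1.), requires_grad=True)
        self.w2 = torch.nn.Parameter(torch.tensor(2.), requires_grad=False)

    def forward(self, x):
        return (self.w1 + self.w2) * x

x = torch.tensor([1., 3., 3.])
y = 2 * x
model = Model()

# hypothetical optimizer choice; any torch.optim optimizer works the same way
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

w2_before = model.w2.item()
for _ in range(50):
    optimizer.zero_grad()
    loss = (y - model(x)).abs().mean()  # MAE loss
    loss.backward()   # gradient reaches only w1; w2.grad stays None
    optimizer.step()  # only w1 is updated

assert model.w2.item() == w2_before  # w2 unchanged
assert model.w2.grad is None         # autograd never computed a gradient for w2
```

Passing `model.parameters()` directly is fine even though it includes `w2`: the optimizer simply skips any parameter whose `.grad` is `None`.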

In this case, `optimizer.step()` will update only `w1`. Conceptually, SGD computes `w1 = w1 - learning_rate * w1.grad`, which changes `w1`. For `w2`, autograd treats the parameter as a constant in the computation graph, so no gradient is ever computed for it: `w2.grad` stays `None` and the optimizer skips it entirely.