How to put the old_weight back

What I am trying to do is to put back the old parameters after the update.
Here is a demo code:

        # take a step: subtract lr * grad from every parameter
        for param in net.parameters():
            d_p = param.grad.data
            param.data.sub_(d_p * lr)
        # evaluate with the updated parameters
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        # try to undo the step by adding lr * grad back
        for param in net.parameters():
            d_p = param.grad.data
            param.data.add_(d_p * lr)

Hi Pedro!

First, don’t use .data – it is deprecated and can break things in subtle ways.

You may modify param in place by wrapping the operation in a
with torch.no_grad(): block:

        with torch.no_grad():
            for param in net.parameters():
                d_p = param.grad
                param.sub_(d_p * lr)          # take the step
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        with torch.no_grad():
            for param in net.parameters():
                d_p = param.grad
                param.add_(d_p * lr)          # undo the step (up to round-off)

(If you don’t want to backpropagate through your outputs and loss
computations, you could wrap the whole thing in a single no_grad()
block.)
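
For example, a minimal sketch of that variant, using the same names as
above (here loss is just a number to look at – you could not call
loss.backward() on it, because no graph is built inside no_grad()):

        with torch.no_grad():
            # take the step
            for param in net.parameters():
                param.sub_(param.grad * lr)
            # evaluate with the perturbed parameters (no graph is built)
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            # undo the step (up to round-off)
            for param in net.parameters():
                param.add_(param.grad * lr)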

Note that due to round-off error, sub_() followed by add_() won’t
necessarily take you back to exactly where you started. If you need
to restore param exactly, save a copy of param and then write it back
after the modification, e.g.

        with torch.no_grad():
            param.copy_ (copy_of_original_param)

Best

K. Frank

Thanks a lot, KFrank!
But I am confused about how to copy the parameters into copy_of_original_param in the first place. Can you illustrate it further, please?

Hi Pedro!

Note that param is different for each iteration of your for loop, so you
will need to store a list of the parameter values:

        param_list = []
        with torch.no_grad():
            for param in net.parameters():
                param_list.append (param.clone())   # save an exact copy of the original value
                d_p = param.grad
                param.sub_(d_p * lr)

(If for some reason param.clone() isn’t in the no_grad() block, you
would want param_list.append (param.detach().clone()).)
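
Putting the two pieces together, the whole save / step / restore cycle
would look something like the sketch below (the restore loop relies on
net.parameters() yielding the parameters in the same order each time,
which it does):

        # save copies of the original parameters and take the step
        param_list = []
        with torch.no_grad():
            for param in net.parameters():
                param_list.append (param.clone())
                param.sub_(param.grad * lr)

        # evaluate with the perturbed parameters
        outputs = net(inputs)
        loss = criterion(outputs, labels)

        # restore the original parameters exactly
        with torch.no_grad():
            for param, old_param in zip (net.parameters(), param_list):
                param.copy_ (old_param)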

Best.

K. Frank
