Proper way to assign tensor to a Parameter

A parameter can be set to an arbitrary tensor by using its .data attribute. I have read that this is discouraged, so what would be the proper way of doing this?

This issue arose for me in the context of using a parametrization (torch.nn.utils.parametrize): once one is registered, a tensor can be assigned to the parameter directly, without going through .data. Furthermore, once a parametrization is registered, assigning through .data no longer modifies the parameter.

import torch
from torch import nn
from torch.nn.utils.parametrize import register_parametrization

class Identity(nn.Module):
    def forward(self, X):
        return X
    def right_inverse(self, Z):
        return Z

linear = nn.Linear(1, 2)
# this raises an error: a plain tensor cannot be assigned over an nn.Parameter
linear.weight = torch.ones_like(linear.weight)
# this works: with the parametrization registered, `weight` becomes a property
# whose setter passes the value through right_inverse() to the underlying parameter
register_parametrization(linear, 'weight', Identity())
linear.weight = torch.ones_like(linear.weight)
# this does nothing: `linear.weight` is recomputed on every access, so assigning
# to the .data of the returned tensor never reaches the underlying parameter
linear.weight.data = torch.ones_like(linear.weight)

This means that any code in my model that assigns to the parameter has to depend on whether the parameter is parametrized or not. Is there a “proper” way of assigning an arbitrary tensor to a parameter that works in both cases?

Hi Gabriel!

You may assign another Parameter to the Linear’s weight property
(although I am not sure that I understand your use case).

Consider:

>>> import torch
>>> torch.__version__
'1.10.2'
>>> linear = torch.nn.Linear (1, 2)
>>> linear.weight
Parameter containing:
tensor([[-0.9916],
        [ 0.0040]], requires_grad=True)
>>> linear.weight = torch.nn.Parameter (torch.ones_like (linear.weight))
>>> linear.weight
Parameter containing:
tensor([[1.],
        [1.]], requires_grad=True)
>>> linear (torch.tensor ([0.123])).sum().backward()
>>> linear.weight.grad
tensor([[0.1230],
        [0.1230]])

Best.

K. Frank

Yup, not sure of the actual use case, but there’s a similar thread here.

Thank you both for your replies!

My use case is a parameter that represents a physical value in a simulation, so essentially it’s a modeling problem. I can change the values of the simulation, and I then want to either retrain that parameter or, as in this case, set it directly so I can focus on something else.

@KFrank If I wrap the tensor in a Parameter when assigning, it won’t work once the attribute has a parametrization. This means:

linear = nn.Linear(1, 2)
w = ...  # some tensor with the same shape as linear.weight
# This works
linear.weight = nn.Parameter(w)
register_parametrization(linear, 'weight', Identity())
# This doesn't work
linear.weight = nn.Parameter(w)

The second assignment throws KeyError: "attribute 'weight' already exists".

@thatgeeman In the thread you link, @ptrblck emphasizes the use of no_grad (I think) to avoid messing up the grad calculation, but he doesn’t address what a proper assignment looks like. What I mean is that, for example, wrapping layer.weight = w in no_grad won’t work if there is no parametrization.

On a side note related to the use of no_grad: if a tensor is assigned to a parameter outside of a with torch.no_grad() block, how exactly could this mess up the grad calculation?

From the docs:

The first time that a module registers a parametrization, this function will add an attribute parametrizations

When parametrizing weight, layer.weight is turned into a Python property. This property computes parametrization(weight) every time we request layer.weight

This parametrizations attribute is an nn.ModuleDict

The list of parametrizations on the tensor weight will be accessible under module.parametrizations.weight
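
As a small sketch of what that looks like in practice (reusing the Identity parametrization from your first post):

import torch
from torch import nn
from torch.nn.utils.parametrize import register_parametrization

linear = nn.Linear(1, 2)
register_parametrization(linear, 'weight', Identity())
print(type(linear.parametrizations))            # nn.ModuleDict
print(linear.parametrizations.weight.original)  # the underlying nn.Parameter
linear.weight = torch.zeros_like(linear.weight) # property setter -> right_inverse -> original
print(linear.parametrizations.weight.original)  # now all zeros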

After parametrization, directly assigning linear.weight = w (w being an arbitrary tensor) should work. But re-registering a new parametrization under the same name weight may be erroneous. Maybe you want to remove the existing parametrization (see docs) first?
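
For example (a sketch; remove_parametrizations also lives in torch.nn.utils.parametrize):

from torch.nn.utils.parametrize import remove_parametrizations

# drop the parametrization; with leave_parametrized=True the current (parametrized)
# value is kept as a plain nn.Parameter named 'weight'
remove_parametrizations(linear, 'weight', leave_parametrized=True)
# 'weight' is a regular parameter again, so a new parametrization can be registered
register_parametrization(linear, 'weight', Identity())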

In his answer, the usage of the no_grad context manager is demonstrated where the weights are additionally being manipulated in place, I believe.
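
Roughly, the no_grad guard matters when the existing parameter is modified in place rather than replaced; a minimal sketch:

import torch
from torch import nn

linear = nn.Linear(1, 2)
# without no_grad, the in-place copy raises
# "RuntimeError: a leaf Variable that requires grad is being used in an in-place operation"
# linear.weight.copy_(torch.ones_like(linear.weight))
with torch.no_grad():
    linear.weight.copy_(torch.ones_like(linear.weight))  # same Parameter object, new values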

Yes, you are right! I guess I should reformulate my question: how can I assign to a parameter in a way that works both for a vanilla parameter and for a parametrized one? Because right now I don’t see a way that works in both cases.

Edit: I could add an if-else clause, but this would make the code cumbersome. Something like this:

    if hasattr(linear, 'parametrizations'):
        linear.weight = w                    # parametrized: the property setter handles it
    else:
        linear.weight = nn.Parameter(w)      # vanilla: replace the Parameter
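
Or, to keep the call sites clean, this could be wrapped in a small helper (just a sketch; set_param is a made-up name, and is_parametrized comes from torch.nn.utils.parametrize):

    from torch.nn.utils import parametrize

    def set_param(module, name, value):
        # assign `value` (a plain tensor) whether or not module.<name> is parametrized
        if parametrize.is_parametrized(module, name):
            setattr(module, name, value)                # property setter -> right_inverse
        else:
            setattr(module, name, nn.Parameter(value))  # replace the Parameter

    set_param(linear, 'weight', torch.ones_like(linear.weight))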