Model parameters to tensor and back to model

avish · June 3, 2021, 2:18pm

Hi, I have a model (an nn.Module with multiple nested nn.Module attributes). I want to get all of its parameters as a 1D vector, perform some operations on that vector without changing its length, and put the result back into the model as the new parameters.

To get the parameters, I am thinking of something like

all_param = []
for param in model.parameters():
    all_param.append(param.view(-1))
vec = torch.cat(all_param, dim=0)
# do some operations on vec
# ? put vec back into model.

I am looking into the state dictionary to put the values back, but the model can contain nested modules. Since parameters() follows an ordered traversal over the nesting, can I rely on that same order to update the Module, or is there a shorter way to do so?

pascal_notsawo · June 3, 2021, 6:20pm

Can you give an example of “some operation”? Because what you are trying to do can be very inefficient on large models, especially if you do it repeatedly.

But one way to do what you ask is to record, in the first loop, the original shape of each parameter and its length after view(-1). Once your operations are finished, you split the concatenated vector according to the stored lengths, then reshape each piece back to its original shape and assign it to the corresponding parameter.
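The steps described above can be sketched roughly as follows. This is only an illustration under some assumptions: the helper names `flatten_params` and `load_flat_params` and the toy nested model are made up for the example, and the "operation" is just a scaling by 2.

```python
import torch
from torch import nn

def flatten_params(model):
    """Return a detached 1D copy of all parameters, plus their original shapes."""
    shapes = [p.shape for p in model.parameters()]
    vec = torch.cat([p.detach().reshape(-1) for p in model.parameters()])
    return vec, shapes

def load_flat_params(model, vec, shapes):
    """Split vec by each parameter's length and copy the pieces back in place."""
    sizes = [p.numel() for p in model.parameters()]
    chunks = torch.split(vec, sizes)
    with torch.no_grad():
        for p, chunk, shape in zip(model.parameters(), chunks, shapes):
            p.copy_(chunk.reshape(shape))

# Hypothetical toy model with a nested module.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Sequential(nn.Linear(3, 2)))

vec, shapes = flatten_params(model)
new_vec = vec * 2.0  # placeholder for "some operations" that keep the length
load_flat_params(model, new_vec, shapes)
```

Both helpers iterate over `model.parameters()`, so the order of the pieces matches as long as the model's structure is not changed between the two calls.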

avish · June 3, 2021, 7:17pm

Thanks for the reply.
These are small toy models; the use case is an experiment I am running to generate models.
Do you know how I can initialise a model with given weights, if I have them in a vector with the correct shape?

soulitzer · June 3, 2021, 11:10pm

You can use copy_ under no_grad mode -

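import torch

model = torch.nn.Linear(10, 10)

def f(t):
  # Example operation; any function that keeps the shape works.
  return t * t

# Compute the new values, one tensor per parameter, in parameters() order.
new_params = [f(p) for p in model.parameters()]

# Copy the new values into the existing parameters in place.
with torch.no_grad():
  for p, q in zip(model.parameters(), new_params):
    p.copy_(q)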
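For the original question of going through a single 1D vector, PyTorch also ships `parameters_to_vector` and `vector_to_parameters` in `torch.nn.utils`. A minimal sketch, assuming a toy nested model and an element-wise square as the operation (both chosen only for illustration):

```python
import torch
from torch import nn
from torch.nn.utils import parameters_to_vector, vector_to_parameters

# Hypothetical toy model with a nested module.
model = nn.Sequential(nn.Linear(10, 10), nn.Sequential(nn.Linear(10, 2)))

with torch.no_grad():
    # Flatten all parameters into one 1D tensor, in parameters() order.
    vec = parameters_to_vector(model.parameters())

    # Placeholder operation; it must not change the length of the vector.
    new_vec = vec * vec

    # Write the slices of the vector back into the parameters.
    vector_to_parameters(new_vec, model.parameters())

    # Flatten again to check the round trip.
    roundtrip = parameters_to_vector(model.parameters())
```

Depending on the PyTorch version, the parameters may end up sharing memory with the vector that was passed in, so it seems safest not to modify `new_vec` in place afterwards (or to pass `new_vec.clone()` instead).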