I’m trying to compute some metrics across all parameters of my model. While these metrics are simple (e.g. topk()) and available as methods on the Tensor class, I can’t use them directly because model.parameters() gives me a list of tensors with different sizes.
So I have found it’s easier to flatten everything in the model.parameters() list, concatenate the results into a single tensor, and then compute the metric.
This works fine, but I also need to modify my original parameters based on this computed metric. For example, imagine I want to keep the largest 10% of parameters in my model and set everything else to zero. What’s the most elegant/Pythonic way to do this?
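For reference, this is roughly what I’m doing so far to get the metric (a minimal sketch; model here is just a stand-in nn.Module):

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the real model

# flatten every parameter tensor and concatenate into one 1-D tensor
flat = torch.cat([p.detach().flatten() for p in model.parameters()])

# e.g. find the magnitude of the k-th largest weight, with k = 10% of all parameters
k = max(1, int(0.1 * flat.numel()))
threshold = flat.abs().topk(k).values.min()

The part I’m missing is how to push the modified values back into the original parameter tensors.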
I tried to put something together; let me know if this helps you:
import torch

def flatten_params(parameters):
    """
    Flattens all parameters into a single column vector and returns a dictionary to recover them.
    :param parameters: a generator or list of all the parameters
    :return: a dictionary: {"params": [#params, 1],
                            "indices": [(start index, end index) for each param]}
             Note: the end index is exclusive.
    """
    l = [torch.flatten(p) for p in parameters]
    indices = []
    s = 0
    for p in l:
        size = p.shape[0]
        indices.append((s, s + size))
        s += size
    flat = torch.cat(l).view(-1, 1)
    return {"params": flat, "indices": indices}
def recover_flattened(flat_params, indices, model):
    """
    Gives a list of recovered parameters from their flattened form.
    :param flat_params: [#params, 1]
    :param indices: a list detailing the start and end index of each param [(start, end) for param]
    :param model: the model that gives the params with correct shapes
    :return: the params, reshaped to the ones in the model, in the same order as those in the model
    """
    l = [flat_params[s:e] for (s, e) in indices]
    for i, p in enumerate(model.parameters()):
        l[i] = l[i].view(*p.shape)
    return l
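A rough usage sketch (assuming model is your nn.Module): it keeps the largest 10% of weights by magnitude, zeroes the rest, and copies the reshaped tensors back into the model in-place.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the real model

flat_dict = flatten_params(model.parameters())
flat, indices = flat_dict["params"], flat_dict["indices"]

# zero everything below the k-th largest magnitude
k = max(1, int(0.1 * flat.numel()))
threshold = flat.abs().flatten().topk(k).values.min()
flat = torch.where(flat.abs() >= threshold, flat, torch.zeros_like(flat))

# reshape back and copy into the model's parameters
new_params = recover_flattened(flat, indices, model)
with torch.no_grad():
    for p, new_p in zip(model.parameters(), new_params):
        p.copy_(new_p)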
@Evan_Vogelbaum How can I get all the flattened model parameters as a numpy 1-D array or column vector through the flatten_params() function? As @milad mentioned, after some computations on the weights, I want to unflatten the modified weights or replace them in the original model. Kindly help. Thank you.
If you inspect the grad_fn of the output of torch.cat(), it does show that backprop will still connect to the original tensors (I believe), but yeah, aliasing won’t happen.
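A quick sanity check of that (minimal sketch):

import torch

a = torch.randn(3, requires_grad=True)
b = torch.randn(2, requires_grad=True)
flat = torch.cat([a, b])

print(flat.grad_fn)     # a CatBackward node, so gradients flow back to a and b
flat.sum().backward()
print(a.grad, b.grad)   # both populated

# but flat is a copy, not a view: writing to it does not change a or b
with torch.no_grad():
    flat.zero_()
print(a)                # unchanged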
I’m wondering if there’s a way to stack tensors such that the result still “raggedly” aliases/views the originals, similar in nature to how functions like Tensor.expand() work. (My naive guess is no, but I’m still curious.)
EDIT: Nah, I’m not seeing anything in the torch.Tensor docs; I’ll just use flattening/unflattening.