I’m trying to compute some metrics across all parameters of my model. While these metrics are simple (e.g. topk()) and available as methods on the Tensor class, I can’t use them directly because model.parameters() gives me a list of tensors with different sizes.
So I have found it’s easier to flatten everything in the model.parameters() list, concatenate the results into a single tensor, and then compute the metric.
This works fine, but I also need to modify my original parameters based on this computed metric. For example, imagine I want to keep the largest 10% of parameters in my model and set everything else to zero. What’s the most elegant/Pythonic way to do this?
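For reference, this is roughly what I’m doing so far to get the metric (a minimal sketch; model here is just a stand-in nn.Module):

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the real model

# flatten every parameter tensor and concatenate into one 1-D tensor
flat = torch.cat([p.detach().flatten() for p in model.parameters()])

# e.g. find the magnitude of the k-th largest weight, with k = 10% of all parameters
k = max(1, int(0.1 * flat.numel()))
threshold = flat.abs().topk(k).values.min()

The part I’m missing is how to push the modified values back into the original parameter tensors.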
I tried to put something together; let me know if this helps you:
import torch

def flatten_params(parameters):
    """
    Flattens all parameters into a single column vector and returns a dictionary to recover them.
    :param parameters: a generator or list of all the parameters
    :return: a dictionary: {"params": [#params, 1],
                            "indices": [(start index, end index) for each param]}
             Note: the end index is exclusive.
    """
    l = [torch.flatten(p) for p in parameters]
    indices = []
    s = 0
    for p in l:
        size = p.shape[0]
        indices.append((s, s + size))
        s += size
    flat = torch.cat(l).view(-1, 1)
    return {"params": flat, "indices": indices}
def recover_flattened(flat_params, indices, model):
    """
    Gives a list of recovered parameters from their flattened form.
    :param flat_params: [#params, 1]
    :param indices: a list detailing the start and end index of each param [(start, end) for param]
    :param model: the model that gives the params with correct shapes
    :return: the params, reshaped to the ones in the model, in the same order as those in the model
    """
    l = [flat_params[s:e] for (s, e) in indices]
    for i, p in enumerate(model.parameters()):
        l[i] = l[i].view(*p.shape)
    return l
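A rough usage sketch (assuming model is your nn.Module): it keeps the largest 10% of weights by magnitude, zeroes the rest, and copies the reshaped tensors back into the model in-place.

import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the real model

flat_dict = flatten_params(model.parameters())
flat, indices = flat_dict["params"], flat_dict["indices"]

# zero everything below the k-th largest magnitude
k = max(1, int(0.1 * flat.numel()))
threshold = flat.abs().flatten().topk(k).values.min()
flat = torch.where(flat.abs() >= threshold, flat, torch.zeros_like(flat))

# reshape back and copy into the model's parameters
new_params = recover_flattened(flat, indices, model)
with torch.no_grad():
    for p, new_p in zip(model.parameters(), new_params):
        p.copy_(new_p)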
@Evan_Vogelbaum How can I get all the flattened model parameters as a numpy 1-D array or column vector through the flatten_params() function? As @milad mentioned, after some computations on the weights, I want to unflatten the modified weights or replace them in the original model. Kindly help. Thank you.
If you inspect the grad_fn of the output of torch.cat(), it does show that backprop will still connect to the original tensors (I believe), but yeah, aliasing won’t happen.
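A quick sanity check of that (minimal sketch):

import torch

a = torch.randn(3, requires_grad=True)
b = torch.randn(2, requires_grad=True)
flat = torch.cat([a, b])

print(flat.grad_fn)     # a CatBackward node, so gradients flow back to a and b
flat.sum().backward()
print(a.grad, b.grad)   # both populated

# but flat is a copy, not a view: writing to it does not change a or b
with torch.no_grad():
    flat.zero_()
print(a)                # unchanged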
I’m wondering if there’s a way to stack tensors such that the result still “raggedly” aliases/views the originals, similar in nature to how functions like Tensor.expand() work. (My naive guess is no, but I’m still curious.)
EDIT: Nah, I’m not seeing anything in the torch.Tensor docs; I’ll just use flattening/unflattening.