Any way to treat a collection of Variables as a single Variable?

Hi all,

Often in papers, notation is simplified by pooling all parameters into a single variable “theta” and describing various operations in terms of that theta.

For example, say for some reason you’re manually implementing momentum SGD (yes, I know this is usually done internally, but just for the sake of the example…). Momentum SGD is described by the pseudocode:

v = gamma*v + eta*grad(L, theta)
theta = theta - v

In reality, if theta is a bunch of separate parameter tensors, you end up writing:

dL_dtheta = grad(L, theta)
v = [gamma*v_ + eta*dL_dtheta_ for v_, dL_dtheta_ in zip(v, dL_dtheta)]
theta = [theta_ - v_ for v_, theta_ in zip(v, theta)]
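
For concreteness, here is a self-contained runnable version of that loop, with toy parameter shapes, a dummy quadratic loss, and made-up values for gamma and eta:

import torch

theta = [torch.randn(3, 2, requires_grad=True), torch.randn(2, requires_grad=True)]
v = [torch.zeros_like(p) for p in theta]
gamma, eta = 0.9, 0.01

for step in range(100):
    L = sum((p ** 2).sum() for p in theta)  # dummy loss standing in for the real one
    dL_dtheta = torch.autograd.grad(L, theta)
    with torch.no_grad():
        v = [gamma*v_ + eta*dL_dtheta_ for v_, dL_dtheta_ in zip(v, dL_dtheta)]
        for theta_, v_ in zip(theta, v):
            theta_ -= v_  # in-place update, so the leaf tensors stay our parameters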

But this is quite messy, as you end up with list comprehensions everywhere. It would be really nice to have a “CollectionVariable” or something: a variable that represents a whole collection and on which you could apply all kinds of elementary operations (+, -, *, …). Then you could write your code in the clean form above and still handle situations where your parameters are not all in one array. A rough sketch of what I mean follows.
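To make the idea concrete, here is a hypothetical sketch of such a wrapper (the class name and everything in it are made up, not an existing PyTorch feature):

import torch

class TensorCollection:
    # Hypothetical wrapper that broadcasts elementary ops over a list of tensors.
    def __init__(self, tensors):
        self.tensors = list(tensors)

    def _zip_op(self, other, op):
        # other is either another TensorCollection (pairwise over members)
        # or a plain scalar broadcast against every member
        if isinstance(other, TensorCollection):
            return TensorCollection(op(a, b) for a, b in zip(self.tensors, other.tensors))
        return TensorCollection(op(a, other) for a in self.tensors)

    def __add__(self, other):
        return self._zip_op(other, lambda a, b: a + b)

    def __sub__(self, other):
        return self._zip_op(other, lambda a, b: a - b)

    def __rmul__(self, scalar):
        return TensorCollection(scalar * a for a in self.tensors)

With theta, v, and the gradients each wrapped in a TensorCollection, the update would read exactly like the pseudocode: v = gamma*v + eta*grads, then theta = theta - v.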

Does PyTorch have anything like this, or if not, do you anticipate any difficulties in implementing such a thing?

PyTorch doesn’t have such functionality by default.
Lua-Torch used to have such functionality (getParameters() would return a view of all the parameters concatenated into a single huge tensor), but it turned out there were many corner cases that made a proper implementation difficult.
But you can always approximate it using simple elementary functions. For example, to concatenate all the parameters into a single tensor in PyTorch, you could do something like

param_list = [p.view(-1) for p in model.parameters()]  # flatten each parameter to 1-D
concat_param = torch.cat(param_list)  # one big 1-D tensor holding all parameters
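
One caveat: torch.cat copies its inputs, so concat_param is a snapshot of the parameters, not a view into them; writing into it will not update the model. The same pattern works for gradients, assuming backward() has already been called so that every p.grad is populated:

grad_list = [p.grad.view(-1) for p in model.parameters()]
concat_grad = torch.cat(grad_list)

Depending on your PyTorch version, torch.nn.utils.parameters_to_vector and torch.nn.utils.vector_to_parameters implement essentially this flatten/scatter-back pattern for you.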