Papers often simplify notation by pooling all parameters into a single variable “theta” and describing the various operations in terms of that theta.
For example, say for some reason you’re manually implementing momentum SGD (yes, I know this is usually done internally, but just for the sake of the example…). Momentum SGD is described by the pseudocode:
    v = gamma*v + eta*grad(L, theta)
    theta = theta - v
In reality, if theta is a bunch of parameters, you go:
    dL_dtheta = grad(L, theta)
    v = [gamma*v_ + eta*dL_dtheta_ for v_, dL_dtheta_ in zip(v, dL_dtheta)]
    theta = [theta_ - v_ for v_, theta_ in zip(v, theta)]
But this is quite messy, as you end up with list comprehensions everywhere. It would be really nice to have a “CollectionVariable” or something: a variable that represents a whole collection, and to which you could apply all kinds of elementary operations (+, -, *, …). Then you could write your code in the clean form above and still handle situations where your parameters are not all in one array.
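To make the idea concrete, here is a rough sketch of what I have in mind, in plain Python. The names `Collection` and `momentum_step` are made up for illustration and are not part of PyTorch. The sketch assumes the items support +, - and * with each other and with plain Python scalars (as floats and tensors do), and it only handles those three operators.

```python
import operator


class Collection:
    """Hypothetical container that applies arithmetic item by item."""

    def __init__(self, items):
        self.items = list(items)

    def _apply(self, other, op):
        if isinstance(other, Collection):
            if len(other.items) != len(self.items):
                raise ValueError("collections must have the same length")
            # Pair up items from both collections.
            return Collection(op(a, b) for a, b in zip(self.items, other.items))
        # Anything else is treated as a scalar applied to every item.
        return Collection(op(a, other) for a in self.items)

    def __add__(self, other):
        return self._apply(other, operator.add)

    def __sub__(self, other):
        return self._apply(other, operator.sub)

    def __mul__(self, other):
        return self._apply(other, operator.mul)

    # Lets `scalar * collection` and `scalar + collection` work;
    # assumes + and * commute for the item type.
    __rmul__ = __mul__
    __radd__ = __add__

    def __iter__(self):
        return iter(self.items)


def momentum_step(theta, v, grads, gamma=0.9, eta=0.1):
    """One momentum SGD step, written exactly like the pseudocode."""
    v = gamma * v + eta * grads
    theta = theta - v
    return theta, v


# Usage with plain floats standing in for parameter arrays.
theta = Collection([1.0, 2.0])
v = Collection([0.0, 0.0])
grads = Collection([1.0, -2.0])  # made-up gradient values
theta, v = momentum_step(theta, v, grads, gamma=0.5, eta=0.1)
```

This keeps the update in the clean two-line form, but it is only a toy: it does nothing about in-place updates, autograd bookkeeping, or operators beyond the three shown.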
Does PyTorch have anything like this? If not, do you anticipate any difficulties in implementing such a thing?