ParameterList and list of buffers in the same module

I want to have a Module that is parametrized by a list of tensors. When this Module is a "leaf" module in the pipeline (i.e. its parameters are trainable), the case is simple: I just use ParameterList to store its parameters.
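
For the "leaf" case, this is roughly what I mean (a minimal sketch; LeafModule and the shapes are just illustrative names):

import torch as t
import torch.nn as nn

class LeafModule(nn.Module):
    def __init__(self, n, m):
        super().__init__()
        # nn.ParameterList registers each tensor as a trainable parameter,
        # so they show up in .parameters() and are moved by .to()/.cuda()
        self.vectors = nn.ParameterList([
            nn.Parameter(t.randn(n)),
            nn.Parameter(t.randn(m)),
        ])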

The question is: what should I do if it's not a "leaf" module, i.e. when I first construct its parameter tensors somehow and only then instantiate the module?

Consider this toy example: I want to encapsulate a rank-1 matrix decomposition:

import torch as t
import torch.nn as nn

class Rang1Matrix(nn.Module):
    def __init__(self, vectors):
        super().__init__()
        self.vectors = vectors

    def forward(self):
        # outer product of the two stored vectors yields a rank-1 matrix
        return t.outer(self.vectors[0], self.vectors[1])

In this block of code, if I want to create a module with trainable parameters, I just pass a ParameterList as the vectors argument. But what should I do if I already have the vector tensors through which I want to autodifferentiate?
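
Concretely, the two ways of instantiating it (a sketch; the randn tensors are stand-ins for vectors produced elsewhere):

import torch as t
import torch.nn as nn

# "leaf" case: trainable parameters via ParameterList
m_leaf = Rang1Matrix(nn.ParameterList([
    nn.Parameter(t.randn(3)),
    nn.Parameter(t.randn(4)),
]))

# non-"leaf" case: the vectors already exist and carry autograd history
u = t.randn(3, requires_grad=True)  # stand-in for an upstream computation
v = t.randn(4, requires_grad=True)
m_nonleaf = Rang1Matrix([u, v])
m_nonleaf().sum().backward()        # gradients reach u and v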

In particular, I want Rang1Matrix to fully support functions like .cuda(), .cpu(), .to(), etc.
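
As it stands, calling .to() on the module does nothing to the stored tensors, because a plain Python list attribute is not registered with the module:

m = Rang1Matrix([t.randn(3), t.randn(4)])
m.to(t.float64)             # only registered parameters/buffers are converted
print(m.vectors[0].dtype)   # still torch.float32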

Hi,

You want gradients to flow back all the way to the vectors that were given as input to the __init__ function?
If so, your code will work fine.
But if you want it to work with .to()-like functions, it is harder: you would need the move to happen in a differentiable manner, which these functions don't really support. Such a move would also create a part of the graph that is re-used between forward passes, and that can cause your backward to fail because some intermediate buffers have already been freed.
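
To see the failure mode concretely, here is a sketch using the Rang1Matrix above; a dtype change stands in for a device move (tensor.to() records an autograd node either way when its input requires grad):

import torch as t

u = t.randn(3, requires_grad=True)
v = t.randn(4, requires_grad=True)

# a differentiable "move": these .to() calls become nodes in the graph
# that every subsequent forward pass will share
u2 = u.to(t.float64)
v2 = v.to(t.float64)

m = Rang1Matrix([u2, v2])

m().sum().backward()  # frees the shared graph through u2/v2
m().sum().backward()  # RuntimeError: trying to backward through the graph
                      # a second time (the shared part was freed)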