Is it possible to use a linear layer (with the same input and output size) in-place? I don’t care about the gradients (torch.no_grad() is enabled). I want to use as little memory as possible because I’m querying the network many thousands of times per batch item (working with 3D point clouds).
I wrote a helper based on the implementation of `torch.nn.functional.linear`:
```python
import torch

def linear_inplace(layer, v):
    # Fused bias add + matmul, writing the result back into v via out=
    return torch.addmm(layer.bias, v, layer.weight.t(), out=v)
```
However, for some reason my model's layers behave as if they were never moved to the GPU when I use this helper instead of the standard `__call__` of `nn.Linear`. I get the following error:
```
RuntimeError: Tensor for argument #3 'mat2' is on CPU, but expected it to be on GPU (while checking arguments for addmm)
```
I know my CUDA device is selected correctly because `layer(v)` works, but `linear_inplace(layer, v)` does not. Can someone help me understand what's going on here?
I've tracked the error down to an interaction with `weight_norm` and opened an issue.
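For reference, the staleness can be reproduced on CPU without a GPU. This is a minimal sketch (the `layer` here is a fresh `nn.Linear`, not my actual model): `weight_norm` removes `weight` as a `Parameter`, re-parameterizes it as `weight_g`/`weight_v`, and only recomputes the plain `weight` tensor inside a forward pre-hook. My understanding is that `.cuda()` moves the registered parameters but not the cached `weight` attribute, so bypassing `__call__` reads the stale CPU copy.

```python
import torch
import torch.nn as nn
from torch.nn.utils import weight_norm

layer = weight_norm(nn.Linear(4, 4))

# weight_norm removes `weight` as a Parameter and recomputes it from
# `weight_g` and `weight_v` in a forward pre-hook on every __call__.
assert "weight" not in dict(layer.named_parameters())
assert "weight_g" in dict(layer.named_parameters())

stale = layer.weight  # plain tensor, computed once at wrap time

# Change weight_g; the cached `weight` attribute stays stale until the
# module is actually called and the pre-hook fires.
with torch.no_grad():
    layer.weight_g.mul_(2.0)
assert torch.equal(layer.weight, stale)  # still the old tensor

layer(torch.randn(1, 4))  # __call__ runs the hook, refreshing `weight`
assert not torch.equal(layer.weight, stale)
```

The same mechanism would explain the device error: `layer(v)` refreshes `weight` on the right device before computing, while `linear_inplace(layer, v)` reads `layer.weight` directly and gets the pre-`.cuda()` tensor.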