I use a `torch.eye()` during the computation of a function. Since it doesn't change shape, I decided to precompute it once in the constructor instead of generating it on the fly during every forward call. Dummy example:
```python
import torch
import torch.nn as nn

class Foo(nn.Module):
    def __init__(self, device='cuda'):
        super(Foo, self).__init__()
        self.weights = nn.Parameter(torch.Tensor(4, 4))
        # Precomputed constant; not a parameter, and explicitly no grad.
        self.eye = torch.eye(4, device=device, requires_grad=False)
        torch.nn.init.uniform_(self.weights)

    def forward(self):
        return self.weights - self.eye
```
I made only the weights an `nn.Parameter`, so `model.parameters()` won't return `self.eye`, and I added `requires_grad=False` to make sure. Should I be worried about any unintended behavior this could have on my computational graph?
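For what it's worth, here is a minimal sanity check I would run to confirm the constant stays out of the graph (using `device='cpu'` only so it runs without a GPU):

```python
model = Foo(device='cpu')  # CPU here just so the check runs anywhere
out = model()
out.sum().backward()

# The weights receive a gradient; the precomputed eye does not,
# and it is not reported among the parameters.
assert model.weights.grad is not None
assert model.eye.grad is None
assert all(p is not model.eye for p in model.parameters())
```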
This is in contrast to generating it on the fly:
```python
class Foo(nn.Module):
    def __init__(self, device='cuda'):
        super(Foo, self).__init__()
        self.weights = nn.Parameter(torch.Tensor(4, 4))
        self.device = device
        torch.nn.init.uniform_(self.weights)

    def forward(self):
        # Rebuilt on every call instead of precomputed.
        return self.weights - torch.eye(4, device=self.device)
```
Another question: when moving my model to a device, e.g.,

```python
model = Foo()
model.to('cuda')
```

it correctly moves `self.weights` to the GPU, but not `self.eye` (PyTorch version 1.1.0), so I had to pass a `device` argument manually to make sure it ends up on the right device. Is that because it is not an `nn.Parameter`?
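In case it helps frame the question: I suspect `register_buffer` is the intended way to get a constant tensor that follows the module across devices. A sketch of what I mean (buffers are moved by `.to()` and saved in the `state_dict`, but are not returned by `parameters()`):

```python
class Foo(nn.Module):
    def __init__(self):
        super(Foo, self).__init__()
        self.weights = nn.Parameter(torch.Tensor(4, 4))
        # Registered as a buffer: no gradient, not a parameter,
        # but model.to(device) will move it along with the weights.
        self.register_buffer('eye', torch.eye(4))
        torch.nn.init.uniform_(self.weights)

    def forward(self):
        return self.weights - self.eye
```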