Hi. I am trying to define a module that accepts a tensor of parameters but lets me control which slice of that tensor will actually be learned, for example:
```python
import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self, m, learn_slice):
        super().__init__()
        self.m = m
        # Only the given slice is registered as a learnable parameter
        self.m_params = nn.Parameter(m[learn_slice])
        self.m[learn_slice] = self.m_params

    def forward(self, x):
        return torch.dot(self.m, x)
```
The first time I evaluate the gradient it works fine, but on the second `backward()` call I get the following error:
```python
m = torch.rand(100)
x = torch.rand(100)
model = MyModule(m, slice(10, 20))

y = model(x)
y.backward()
y = model(x)
y.backward()
```

```
RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.
```
If I always call `y.backward(retain_graph=True)` it does work, but I am not sure whether that is the right way to do this.
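For what it's worth, the only alternative I can think of is to rebuild the full tensor inside `forward()` on every call, so that a fresh graph through the parameter is created for each `backward()`. Here is a minimal, untested sketch of that idea (the class name `MySlicedModule` is just a placeholder):

```python
import torch
import torch.nn as nn

class MySlicedModule(nn.Module):
    def __init__(self, m, learn_slice):
        super().__init__()
        # Keep the non-learnable values as a buffer (moves with the module,
        # saved in state_dict, but receives no gradient)
        self.register_buffer("m_frozen", m.clone())
        self.learn_slice = learn_slice
        self.m_params = nn.Parameter(m[learn_slice])

    def forward(self, x):
        # Recompose the full vector on every call so each backward()
        # gets its own freshly built graph through m_params
        m = self.m_frozen.clone()
        m[self.learn_slice] = self.m_params
        return torch.dot(m, x)
```

As far as I can tell this avoids the error, since nothing from a previous graph is reused between calls, but I am not sure it is idiomatic either.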
What am I missing here? Thanks!