I was having trouble with the memory usage of my traced model in C++, and I discovered that .eval() doesn’t change requires_grad for the parameters in my ScriptModule. Is this intended behaviour? As a user it was not what I expected; I would like it to either work as I expect, warn, or raise. Given that I can set requires_grad manually, it seems like my expected behaviour should be possible?
I think the underlying cause is that my_script_module.layer is a RecursiveScriptModule and has no .children().
PyTorch 1.5
import torch

class MyScriptModule(torch.jit.ScriptModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(1, 1, bias=False)

my_script_module = MyScriptModule()

# [True]
print([p.requires_grad for p in my_script_module.parameters()])

my_script_module.eval()

# [True] :(
print([p.requires_grad for p in my_script_module.parameters()])

for p in my_script_module.parameters():
    p.requires_grad = False

# [False] :)
print([p.requires_grad for p in my_script_module.parameters()])
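That hypothesis is easy to inspect by hand, continuing from the snippet above (these checks are only illustrative; the exact types may vary across versions):

# .eval() works by calling .train(False) recursively via .children(),
# so check how the wrapped Linear is exposed after scripting.
print(type(my_script_module.layer))
print(hasattr(my_script_module.layer, "children"))
print(list(my_script_module.children()))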
I know this isn’t a totally normal thing to be doing. For what it’s worth, I am subclassing ScriptModule in this way so that I can do the following. Maybe I should do something differently?
class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.inner = MyScriptModule()

    def forward(self, x):
        # stuff
        x = self.inner(x)
        # more stuff
        return x

class MyScriptModule(torch.jit.ScriptModule):
    """Pseudo-code"""
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(1, 1, bias=False)

    @torch.jit.script_method
    def forward(self, x):
        out = torch.zeros_like(x)
        for i in range(x.size()[0]):
            out = self.layer(x)
        return out
my_module = MyModule()
my_module.eval()

# sample input with the right shape for Linear(1, 1)
sample = torch.randn(4, 1)
torch.jit.trace(my_module, sample)
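Since the original problem was memory usage at inference time, here is a minimal sketch of a workaround, continuing from the snippet above and assuming only the standard torch.jit APIs. Running under torch.no_grad() avoids recording the autograd graph regardless of requires_grad, which is what matters for memory:

traced = torch.jit.trace(my_module, sample)

# Freeze explicitly, since .eval() does not touch requires_grad.
for p in traced.parameters():
    p.requires_grad = False

# No autograd graph is recorded inside no_grad(), so the intermediate
# buffers needed for backward are never allocated.
with torch.no_grad():
    out = traced(sample)

traced.save("my_module.pt")  # e.g. for loading from C++ via torch::jit::load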