Hi everyone,
I would like to implement a module that performs some operations on intermediate layers such that:
(1) requires_grad is effectively False for the operations a and b;
(2) requires_grad is True for out;
(3) PyTorch builds the autograd graph as if the order were: input -> out (conv1) -> out (relu1).
In other words, PyTorch should not compute gradients for function_a or function_b in the backward pass. The reason is that function_a and function_b are not provided by PyTorch, and it is very hard to compute their derivatives.
I am not sure whether this is feasible. Do you have any ideas on how to proceed?
Thank you so much for reading.
A simple network looks as follows:
import torch.nn as nn

class CustomModule(nn.Module):
    def __init__(self, module):
        super(CustomModule, self).__init__()
        self.module = module

    def forward(self, input):
        a = function_a(input)     # no gradient needed in backward
        out = self.module(input)  # gradient needed in backward
        b = function_b(out)       # no gradient needed in backward
        return b

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = CustomModule(nn.Conv3d(1, 100, kernel_size=1))
        self.relu1 = CustomModule(nn.ReLU())

    def forward(self, input):
        conv1 = self.conv1(input)
        relu1 = self.relu1(conv1)
        return relu1
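One way to get this behaviour, assuming function_b should be treated as the identity in the backward pass (the incoming gradient passes straight through it, and a carries no gradient at all), is a custom torch.autograd.Function whose backward simply forwards the incoming gradient. This is only a sketch of that idea, not a definitive answer; the function_a/function_b definitions below are hypothetical stand-ins so the example runs:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the external, hard-to-differentiate functions;
# replace these with the real function_a / function_b.
def function_a(x):
    return x.sign()

def function_b(x):
    return x.round()

class StraightThrough(torch.autograd.Function):
    """Applies fn in forward; passes the gradient through unchanged in backward."""
    @staticmethod
    def forward(ctx, x, fn):
        # Autograd is already disabled inside Function.forward, so fn's
        # internals never enter the graph.
        return fn(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Identity gradient for x; None for the non-tensor argument fn.
        return grad_output, None

class CustomModule(nn.Module):
    def __init__(self, module):
        super(CustomModule, self).__init__()
        self.module = module

    def forward(self, input):
        a = StraightThrough.apply(input, function_a)  # no gradient through function_a
        out = self.module(input)                      # normal autograd
        b = StraightThrough.apply(out, function_b)    # backward skips function_b
        return b
```

If a custom Function feels too heavy, the detach trick b = out + (function_b(out) - out).detach() gives the same identity-in-backward effect for function_b alone, since everything inside .detach() is cut out of the graph.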