Suppose I have a CNN with 3 layers: conv1 - conv2 - fc1.
If I set requires_grad=False on the parameters of conv2, will the gradient for conv2 still be calculated but just not updated? And will the parameters in conv1 still be updated properly?
If you set requires_grad=False on conv2's parameters, no gradient is computed for them at all — conv2.weight.grad stays None. But gradients still flow *through* conv2 during backprop, so conv1's gradient is computed and its parameters can be updated normally. A simpler example with three variables (note: Variable is the old autograd API; in current PyTorch you can pass requires_grad directly to torch.rand):
import torch
from torch.autograd import Variable

x = Variable(torch.rand(5), requires_grad=True)
y = Variable(torch.rand(5), requires_grad=True)
z = Variable(torch.rand(5), requires_grad=False)  # plays the role of conv2's frozen parameters
a = x * y          # conv1
b = a * z          # conv2
c = torch.sum(b)   # fc1
c.backward()
# dc/dy = x*z, so this difference is exactly zero:
print(torch.sum(y.grad - x*z)**2)
Variable containing:
0
[torch.FloatTensor of size 1]
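To make this concrete for the original conv1 - conv2 - fc1 question, here is a minimal sketch (the layer sizes and input shape are made up for illustration) that freezes conv2's parameters and then checks which layers received gradients:

```python
import torch
import torch.nn as nn

# Hypothetical 3-layer network matching the question: conv1 -> conv2 -> fc1
net = nn.Sequential(
    nn.Conv2d(1, 4, 3),       # conv1
    nn.Conv2d(4, 4, 3),       # conv2 (to be frozen)
    nn.Flatten(),
    nn.Linear(4 * 4 * 4, 2),  # fc1
)

# Freeze conv2: its parameters get no .grad at all
for p in net[1].parameters():
    p.requires_grad_(False)

x = torch.rand(1, 1, 8, 8)
loss = net(x).sum()
loss.backward()

print(net[0].weight.grad is not None)  # conv1 still receives gradients
print(net[1].weight.grad is None)      # frozen conv2 has .grad == None
```

Gradients flow through conv2's operation (its weights are still multiplied into the backward pass) even though no gradient is accumulated *for* those weights, which is why conv1 trains normally.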