Question about requires_grad

SKYHOWIE25 · December 7, 2017, 7:26am

Hi

Suppose I have a CNN that has 3 layers: conv1 - conv2 - fc1.

If I set requires_grad = False on the parameters in conv2, then the gradient for conv2 will still be calculated but its parameters will not be updated, right? And will the parameters in conv1 still be updated properly?

Thanks.

alexis-jacq · December 7, 2017, 8:49am

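If you set requires_grad=False for conv2's parameters, no gradient is computed for them, so their .grad stays None (e.g. conv2.weight.grad is None). The gradient still flows back through conv2, so you can still compute the gradients of conv1 and update its parameters. A simpler example with 3 variables: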

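import torch
from torch.autograd import Variable

x = Variable(torch.rand(5), requires_grad=True)
y = Variable(torch.rand(5), requires_grad=True)
z = Variable(torch.rand(5), requires_grad=False)  # frozen, like conv2's parameters

a = x*y # conv1
b = a*z # conv2
c = torch.sum(b) # fc1

c.backward()
# dc/dy = x*z, so this sum of squared differences should be 0
print(torch.sum((y.grad - x*z)**2))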

Variable containing:
0
[torch.FloatTensor of size 1]
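The same behaviour can be checked on an actual conv1 - conv2 - fc1 network. The snippet below is only a sketch in current PyTorch style: the SmallCNN class, the layer sizes, the 8x8 random input, and the SGD learning rate are all made up for illustration and are not from the original question. It freezes conv2, runs one optimizer step, and records what happened to each layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical 3-layer network matching the conv1 - conv2 - fc1 layout in the question.
class SmallCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 4, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(4, 4, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(4 * 8 * 8, 2)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        return self.fc1(x.flatten(1))

model = SmallCNN()

# Freeze conv2: autograd will not compute gradients for its weight and bias.
for p in model.conv2.parameters():
    p.requires_grad = False

# Keep copies of the weights so we can compare after the update.
conv1_before = model.conv1.weight.detach().clone()
conv2_before = model.conv2.weight.detach().clone()

# Hand only the trainable parameters to the optimizer (assumed choice: plain SGD).
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

loss = model(torch.rand(2, 1, 8, 8)).sum()
loss.backward()
optimizer.step()

conv2_grad_is_none = model.conv2.weight.grad is None      # no gradient stored for frozen params
conv1_has_grad = model.conv1.weight.grad is not None      # gradient flowed back through conv2
conv1_changed = not torch.equal(conv1_before, model.conv1.weight)
conv2_unchanged = torch.equal(conv2_before, model.conv2.weight)

print(conv2_grad_is_none, conv1_has_grad, conv1_changed, conv2_unchanged)
```

The gradient with respect to conv2's input is still computed, because the chain rule needs it to reach conv1. What is skipped is the gradient with respect to conv2's own weight and bias.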

SKYHOWIE25 · December 7, 2017, 9:06am

Oh, I thought the gradient for conv2 will only be computed for the chain rule but the parameters will not be updated. Thanks!