Hi, I have some problems with fine-tuning the last layer of a neural network.

The code is as follows:

```
for layer in self.layers[0:-1]:
    for param in self.model.fc_layers[layer].parameters():
        param.requires_grad = False

y_prime = self.model(X)
loss = self.model.criterion(y_prime, y)
self.model.optimize.zero_grad()
self.model.zero_grad()
loss.backward(retain_graph=True)
self.model.optimize.step()
```

The model is a plain neural network.

It didn't work: when I fine-tune (re-train) only the last layer, I get the same result as when I don't freeze the parameters of any layers.

Something is wrong here, but I can't figure it out.

Could you check if the parameters of the frozen layers get valid gradients after the `backward` call via:

```
loss.backward()
print(self.model.frozen_layer.weight.grad)
print(self.model.last_layer.weight.grad)
```

If the frozen layers yield `None`, while the last linear layer yields a valid gradient, your code is working as expected.
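To illustrate the check above, here is a minimal sketch (using a hypothetical two-layer `nn.Sequential` model, not the poster's actual code): a parameter frozen *before* the first backward pass never receives a gradient, so its `.grad` stays `None`.

```python
import torch
import torch.nn as nn

# Hypothetical small model standing in for the poster's MLP.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Freeze the first linear layer before any backward pass.
for param in model[0].parameters():
    param.requires_grad = False

out = model(torch.randn(3, 4))
loss = out.sum()
loss.backward()

print(model[0].weight.grad)              # None: frozen before backward
print(model[2].weight.grad is not None)  # True: last layer gets a gradient
```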

Hi, thanks.

It raises an error: the model has no `frozen_layer` module.

So I changed it to

```
print(self.model.fc_layers[-2].weight.grad)
print(self.model.last_layer.weight.grad)
```

since the second-to-last layer is frozen, as in the code in my question.

But the output is not `None`; it is a gradient matrix.

So it is not working as expected, I think.

Yes, `frozen_layer` and `last_layer` are just placeholder names for your layer names.

Did you train the "frozen" layers before and forget to zero out the gradients?
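That pitfall can be sketched as follows (a hypothetical minimal model, not the thread's code): if a layer was trained before being frozen and its old gradients are never zeroed, `optimizer.step()` will still apply those stale gradients and update the "frozen" parameters.

```python
import torch
import torch.nn as nn

# Hypothetical model standing in for the poster's network.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# First training step: every parameter receives a gradient.
model(torch.randn(2, 4)).sum().backward()
opt.step()

# Freeze the first layer, but "forget" to call opt.zero_grad().
for p in model[0].parameters():
    p.requires_grad = False

frozen_before = model[0].weight.clone()

# Second backward pass: no new grads for the frozen layer,
# but its stale .grad from the first step is still populated.
model(torch.randn(2, 4)).sum().backward()
opt.step()

# False: the "frozen" layer was still moved by the stale gradient.
print(torch.equal(frozen_before, model[0].weight))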

If not, could you post a code snippet which represents your training routine and how and where you are freezing the layers?

Your initial code snippet looks like it has some indentation issues, as your "layer loop" would be applied in each training iteration.

Is this a copy-paste error or are you using this code in your script as shown here?

Oh, sorry, yes, it's a copy-paste mistake.

I trained the frozen layers before; the function is as follows:

```
def step_mlp(self, X, y):
    y_prime = self(X)
    loss = self.criterion(y_prime, y)
    self.optimize.zero_grad()
    self.zero_grad()
    loss.backward(retain_graph=True)
    self.optimize.step()
```

Then I re-train the model with every layer except the last one frozen, as in the code in the question description.

Freezing the layers:

```
for layer in self.layers[0:-1]:
    for param in self.model.fc_layers[layer].parameters():
        param.requires_grad = False
```

Then re-train:

```
y_prime = self.model(X)
loss = self.model.criterion(y_prime, y)
self.model.optimize.zero_grad()
self.model.zero_grad()
loss.backward(retain_graph=True)
self.model.optimize.step()
```

Your general workflow should work as shown in this small example:

```
import torch
import torch.nn as nn
from torchvision import models

# Setup
model = models.resnet18()
data = torch.randn(1, 3, 224, 224)
target = torch.randint(0, 1000, (1,))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Train
optimizer.zero_grad()
out = model(data)
loss = criterion(out, target)
loss.backward()
optimizer.step()

# Freeze all but the last layer
for name, param in model.named_parameters():
    if 'fc' not in name:
        param.requires_grad = False

optimizer.zero_grad()
out = model(data)
loss = criterion(out, target)
loss.backward()

# Check grads
grad_frozen = model.conv1.weight.grad
grad_fc = model.fc.weight.grad
print(grad_frozen.abs().sum())
print(grad_fc.abs().sum())
optimizer.step()
```

As you can see, after freezing all parameters but `fc.weight` and `fc.bias`, I get a zero gradient for a previous layer, while I get valid gradients for the last linear layer.

Which layers are you checking, and are you sure the checked frozen layer is in `model.fc_layers`?
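As a side note (a sketch under my own assumptions, not part of the example above): a more robust alternative is to build the optimizer over only the trainable parameters after freezing, so that stale gradients on frozen layers can never be applied by `optimizer.step()`.

```python
import torch
import torch.nn as nn

# Hypothetical minimal model; freeze the first layer.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
for p in model[0].parameters():
    p.requires_grad = False

# The optimizer only ever sees the trainable parameters.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

frozen_before = model[0].weight.clone()
model(torch.randn(2, 4)).sum().backward()
optimizer.step()

# True: the frozen layer is untouched, regardless of any stale state.
print(torch.equal(frozen_before, model[0].weight))
```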


Thanks for your detailed example, ptrblck!

I found the error. It turns out that I didn't have any frozen layers in `model.fc_layers`, because of a mistake in my code.

Your reply helped me find it. Thanks a lot!
