I have some confusion regarding the correct way to freeze layers.

Suppose I have the following NN: layer1, layer2, layer3

I want to freeze the weights of layer2, and only update layer1 and layer3.

Based on other threads, I am aware of the following ways to achieve this.

Method 1:

- optim = {layer1, layer3}
- compute loss
- loss.backward()
- optim.step()
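In PyTorch this would look something like the sketch below (the three `nn.Linear` layers, sizes, and plain SGD are my assumptions, just to make it concrete):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for layer1 / layer2 / layer3 (assumed shapes).
layer1 = nn.Linear(4, 4)
layer2 = nn.Linear(4, 4)
layer3 = nn.Linear(4, 1)
net = nn.Sequential(layer1, layer2, layer3)

# Method 1: only hand layer1's and layer3's parameters to the optimizer.
optim = torch.optim.SGD(
    list(layer1.parameters()) + list(layer3.parameters()), lr=0.1
)

x = torch.randn(8, 4)
before = layer2.weight.clone()

loss = net(x).pow(2).mean()
loss.backward()   # gradients are still computed for layer2...
optim.step()      # ...but the optimizer never touches its parameters

assert torch.equal(layer2.weight, before)  # layer2 unchanged
assert layer2.weight.grad is not None      # grad was computed anyway
```

Note that layer2 still gets gradients here; they are simply never applied.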

Method 2:

- set requires_grad = False on layer2's parameters
- optim = {all layers with requires_grad = True}
- compute loss
- loss.backward()
- optim.step()
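A sketch of this method under the same assumed setup (three `nn.Linear` layers, plain SGD):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer1 = nn.Linear(4, 4)
layer2 = nn.Linear(4, 4)
layer3 = nn.Linear(4, 1)
net = nn.Sequential(layer1, layer2, layer3)

# Method 2: turn off gradient tracking for layer2, then build the
# optimizer from only the parameters that still require grad.
for p in layer2.parameters():
    p.requires_grad = False

optim = torch.optim.SGD(
    [p for p in net.parameters() if p.requires_grad], lr=0.1
)

x = torch.randn(8, 4)
before = layer2.weight.clone()

loss = net(x).pow(2).mean()
loss.backward()   # gradients still flow THROUGH layer2 to layer1,
optim.step()      # but none are accumulated FOR layer2's parameters

assert torch.equal(layer2.weight, before)
assert layer2.weight.grad is None  # no grad was ever computed for it
```

Unlike Method 1, here layer2's parameter gradients are never computed at all, which should also save a bit of memory and compute.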

Method 3:

- optim = {layer1, layer2, layer3}
- layer2_old_weights = layer2.weight (this saves layer2 weights to a variable)
- compute loss
- loss.backward()
- optim.step()
- layer2.weight = layer2_old_weights (this sets layer2 weights to old weights)
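A sketch of the save-and-restore approach (same assumed toy setup). One pitfall worth flagging: the old weights have to be `clone()`d, otherwise we only save a reference and the "saved" tensor changes along with the update:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer1 = nn.Linear(4, 4)
layer2 = nn.Linear(4, 4)
layer3 = nn.Linear(4, 1)
net = nn.Sequential(layer1, layer2, layer3)

# Method 3: let the optimizer update everything, then roll layer2 back.
optim = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(8, 4)
old_w = layer2.weight.detach().clone()  # clone(), not a bare reference
old_b = layer2.bias.detach().clone()

loss = net(x).pow(2).mean()
loss.backward()
optim.step()              # this DOES move layer2...

with torch.no_grad():     # ...so we copy the old values back in
    layer2.weight.copy_(old_w)
    layer2.bias.copy_(old_b)

assert torch.equal(layer2.weight, old_w)
assert torch.equal(layer2.bias, old_b)
```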

Method 4:

- optim = {layer1, layer2, layer3}
- compute loss
- loss.backward()
- set layer2 gradients to 0
- optim.step()
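A sketch of this method with the same assumed setup. I used plain SGD here; my understanding is that with momentum or weight decay a zeroed gradient would not necessarily leave the weights untouched, which is part of what I'm asking about below:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer1 = nn.Linear(4, 4)
layer2 = nn.Linear(4, 4)
layer3 = nn.Linear(4, 1)
net = nn.Sequential(layer1, layer2, layer3)

optim = torch.optim.SGD(net.parameters(), lr=0.1)

x = torch.randn(8, 4)
before = layer2.weight.clone()

loss = net(x).pow(2).mean()
loss.backward()

# Method 4: clear layer2's gradients between backward() and step().
for p in layer2.parameters():
    p.grad = None   # (or p.grad.zero_())

optim.step()        # plain SGD with no gradient leaves layer2 unchanged

assert torch.equal(layer2.weight, before)
```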

My questions:

- Should the four methods give different results?
- Are any of these methods wrong?
- Is there a preferred method?