Why not use detach when freezing?

def set_parameter_requires_grad(model, feature_extracting):
    # When feature extracting, freeze every parameter so autograd
    # skips them entirely during the backward pass.
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False
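
For context, a minimal usage sketch; the torchvision resnet18 backbone and the 10-class head are assumptions for illustration, not part of the original post:

import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
set_parameter_requires_grad(model, feature_extracting=True)  # freeze the backbone

# A newly created head defaults to requires_grad=True, so only it will train
model.fc = nn.Linear(model.fc.in_features, 10)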

Why use only requires_grad = False?

  1. Detach requires hard-coding inside the model's forward pass.
  2. Detach requires knowledge of how the layers are connected.
    Setting requires_grad = False, on the other hand, is external code and requires no knowledge of the model graph.
    You could also simply not pass the frozen layers to the optimizer, but that has drawbacks too (gradients keep accumulating in their .grad buffers); see the sketch below.
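
A minimal sketch of the optimizer side, assuming model is the frozen backbone from above: after setting requires_grad = False, pass only the trainable parameters to the optimizer, so no .grad buffers are ever computed or accumulated for the frozen ones.

import torch

# Only parameters with requires_grad=True reach the optimizer
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-3,
)

# By contrast, merely omitting parameters that still have requires_grad=True
# from the optimizer does not stop autograd from computing and accumulating
# .grad for them on every backward pass.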

Does detach automatically set requires_grad = False?

detach() does not change requires_grad on the original tensor. It is a tensor method that returns a new tensor cut off from the computation graph; the returned tensor itself has requires_grad = False, while the original is left unchanged.
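
A quick demonstration (the tensor names are just for illustration):

import torch

x = torch.ones(3, requires_grad=True)
y = x.detach()  # new tensor sharing storage, cut off from the graph

print(x.requires_grad)  # True  -- the original tensor is unchanged
print(y.requires_grad)  # False -- the detached result tracks no gradients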


Thank you very much!