It's necessary. If you do not set requires_grad = False and only pass former_model.parameters() to the optimizer so that the other parameters are not updated,
latter_model's gradients are still computed on every backward pass, which takes up a lot of GPU memory.
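
A minimal sketch of the point above, assuming former_model feeds into latter_model and only former_model should be trained. The layer sizes, the SGD optimizer, and the learning rate are made up for illustration and are not from the original question:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two sub-models (sizes are arbitrary).
former_model = nn.Linear(8, 4)   # the part that should be trained
latter_model = nn.Linear(4, 1)   # the part that should stay frozen

# Freeze latter_model so autograd neither computes nor stores
# gradients for its parameters.
for p in latter_model.parameters():
    p.requires_grad = False

# Hand the optimizer only the parameters that should be updated.
optimizer = torch.optim.SGD(former_model.parameters(), lr=0.1)

x = torch.randn(16, 8)
loss = latter_model(former_model(x)).sum()

optimizer.zero_grad()
loss.backward()   # gradients still flow *through* latter_model to former_model
optimizer.step()

# Collect the .grad fields so the effect of freezing can be inspected.
frozen_grads = [p.grad for p in latter_model.parameters()]
trained_grads = [p.grad for p in former_model.parameters()]
```

With the freeze in place, every entry of frozen_grads stays None while former_model still receives gradients. If you drop the requires_grad = False loop, latter_model's .grad tensors get allocated and filled on each backward pass even though the optimizer never uses them.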