they might be receiving gradients, since they may be part of the computation graph that is created dynamically during the forward()
call. But the parameters won’t be updated, because the optimizer is not acting on those gradients.
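
Here is a minimal sketch of that situation, assuming a plain SGD optimizer that was only given one module's parameters. The names `trained` and `untracked` are hypothetical and only for illustration; your setup may differ.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two hypothetical layers; only `trained` is handed to the optimizer.
trained = nn.Linear(4, 4)
untracked = nn.Linear(4, 1)

optimizer = torch.optim.SGD(trained.parameters(), lr=0.1)

# Snapshot the weights so we can compare after the step.
before_trained = trained.weight.detach().clone()
before_untracked = untracked.weight.detach().clone()

# Both layers take part in the forward pass, so both are in the graph.
x = torch.randn(8, 4)
loss = untracked(trained(x)).pow(2).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()

# `untracked` received a gradient from backward() ...
got_grad = untracked.weight.grad is not None
# ... but step() only touches parameters registered with the optimizer.
untracked_changed = not torch.equal(untracked.weight, before_untracked)
trained_changed = not torch.equal(trained.weight, before_trained)

print("untracked has grad:", got_grad)
print("untracked weights changed:", untracked_changed)
print("trained weights changed:", trained_changed)
```

In this sketch the unregistered layer ends up with a populated `.grad`, yet its weights stay the same after `optimizer.step()`. Note that `optimizer.zero_grad()` also skips it, so its gradients would keep accumulating across iterations unless you clear them yourself.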
Is there any specific behavior you are seeing that is inconsistent with this? It would be good to know.