Hi, I am trying to manually set the gradients of the entire network during training. Specifically, I want to do the following:
optimizer.zero_grad()
# set gradients for all components of the network
optimizer.step()
I know that for a single module we can assign to each parameter's .grad attribute directly.
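A minimal sketch of that per-module approach, assuming a toy nn.Linear module and hand-picked gradient tensors (all names here are illustrative, not from my actual network):

```python
import torch
import torch.nn as nn

# Single module: assign gradients directly to each parameter's .grad
# attribute, then let the optimizer consume them as usual.
linear = nn.Linear(4, 2)
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)

optimizer.zero_grad()
linear.weight.grad = torch.ones_like(linear.weight)   # manually chosen gradient
linear.bias.grad = torch.zeros_like(linear.bias)      # manually chosen gradient

before = linear.weight.detach().clone()
optimizer.step()  # SGD applies: weight -= lr * grad
```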
But setting gradients module by module like that would be tedious, since the network may contain dozens of different modules. Is there a better way? I am thinking of something similar to state_dict, which lets you copy weights for the entire network in one go; is there an equivalent for gradients?
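For reference, the closest thing I can think of is iterating over named_parameters() and assigning each gradient by hand. A sketch, under the assumption that my_grads is a hypothetical dict of precomputed gradient tensors keyed by parameter name (mirroring how state_dict keys weights):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Hypothetical precomputed gradients, keyed like state_dict() keys weights.
my_grads = {name: torch.ones_like(p) for name, p in model.named_parameters()}

optimizer.zero_grad()
# Assign every parameter's gradient in one loop instead of per module.
for name, p in model.named_parameters():
    p.grad = my_grads[name]
optimizer.step()
```

I am not sure whether this is the intended approach, or whether there is a dedicated API for it.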
Thank you!