Hi,
I am implementing a model with the following behavior:
1- Set requires_grad = False under two conditions:
ONLY for the first two layers of the model, and
only after feeding the FIRST batch of data, so that from the second batch onward those
layers are frozen and their weights are not updated.
2- After the whole training process finishes, I collect all the parameters' grads, including the ones in the first two layers that were obtained only once, from the first data batch.
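For context, here is a minimal sketch of the two steps above on a toy model (the layer sizes, optimizer, and data are placeholders, not my real setup):

```python
import torch
import torch.nn as nn

# Toy stand-in for my model; the real one is larger, but the freezing
# logic is the same. Layers 0 and 2 play the "first two layers" here.
model = nn.Sequential(
    nn.Linear(10, 32),  # index 0 -> frozen after the first batch
    nn.ReLU(),
    nn.Linear(32, 32),  # index 2 -> frozen after the first batch
    nn.ReLU(),
    nn.Linear(32, 1),
)
opt = torch.optim.SGD(model.parameters(), lr=0.01)

saved_grads = {}  # grads of the frozen layers, from the first batch only

for step in range(5):  # stand-in for iterating over my real DataLoader
    x, y = torch.randn(8, 10), torch.randn(8, 1)
    opt.zero_grad(set_to_none=True)
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    if step == 0:
        # First batch: stash the grads of the first two layers, then
        # freeze them so later batches no longer update them.
        for name, p in model.named_parameters():
            if name.startswith(("0.", "2.")):
                saved_grads[name] = p.grad.detach().clone()
                p.requires_grad_(False)
    opt.step()

# After training, collect every parameter's grad. The frozen params end
# up with p.grad == None (zero_grad set them to None and backward never
# refilled them) -- which is where a NoneType error can show up -- so I
# fall back to the grads saved from the first batch.
all_grads = {
    name: (p.grad if p.grad is not None else saved_grads[name])
    for name, p in model.named_parameters()
}
```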
I implemented this around two months ago and it worked, but I recently set up a new conda environment with the latest PyTorch version, and now I am getting an error about NoneType gradients.
Are there recent PyTorch releases that could cause this error? If so, any suggestions?
Thanks!