I am implementing a model with the following behavior:
1- I set requires_grad = False under two conditions: ONLY for the first two layers of the model, and only after feeding the FIRST batch of data. From the second batch onward, these two layers are frozen and their weights are not updated.
2- After the whole training process finishes, I collect all of the parameters' grads, including those of the first two layers, which were obtained only once, from the first data batch.
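To make the question concrete, here is a minimal sketch of the setup (the model, data, and optimizer below are placeholders; my real model is different):

```python
import torch
import torch.nn as nn

# Toy stand-in model: the first two layers are the ones to freeze.
model = nn.Sequential(
    nn.Linear(8, 8),   # layer 0: frozen after the first batch
    nn.Linear(8, 8),   # layer 1: frozen after the first batch
    nn.Linear(8, 1),   # remaining layers keep training
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

# Dummy data: a few batches of random inputs/targets.
batches = [(torch.randn(4, 8), torch.randn(4, 1)) for _ in range(3)]

for i, (x, y) in enumerate(batches):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    if i == 0:
        # Condition: after the FIRST batch, freeze the first two layers.
        for p in list(model[0].parameters()) + list(model[1].parameters()):
            p.requires_grad = False

# After training, collect every parameter's grad -- including the
# frozen layers', which I expect to still hold the first-batch grads.
grads = {name: p.grad for name, p in model.named_parameters()}
print({name: (g is None) for name, g in grads.items()})
```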
I implemented this around two months ago and it worked, but after setting up a new conda environment with the latest PyTorch version, I am now getting an error saying the gradients are NoneType.
Did a recent PyTorch release change something that could cause this error? If so, any suggestions?