requires_grad = False after the first batch

I am implementing a model such that:
1- I set requires_grad = False under two conditions:
ONLY for the first two layers of the model, and
only after feeding the FIRST batch of the data, i.e. from the second batch onward these two layers are frozen and their weights are no longer updated.
2- After finishing the whole training process, I collect all the parameters' grads, including the ones in the first two layers that were obtained only once, with the first data batch.
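
Roughly, my training loop looks like the following simplified sketch (a toy model and random data stand in for my real setup):

```python
import torch
import torch.nn as nn

# toy stand-in for the real model; the first two layers are the ones I freeze
model = nn.Sequential(
    nn.Linear(10, 10),
    nn.Linear(10, 10),
    nn.Linear(10, 1),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for batch_idx in range(5):
    x, y = torch.randn(4, 10), torch.randn(4, 1)
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

    if batch_idx == 0:
        # after the first batch: freeze the first two layers
        for layer in (model[0], model[1]):
            for p in layer.parameters():
                p.requires_grad = False

# after training: collect all grads, including the first two layers'
# grads that were computed only once, on the first batch
grads = {name: p.grad for name, p in model.named_parameters()}
# in the new environment, the first two layers' grads come back as None here
```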

I implemented it around 2 months ago and it worked, but I set up a new conda environment with the latest PyTorch version and now I am getting an error about NoneType gradients.
Are there new PyTorch releases causing this error? If yes, any suggestions?

The latest PyTorch 2.0.0 release sets the set_to_none argument to True by default in the zero_grad call, which will set the .grad attribute of the parameters to None instead of zeroing it in place and will thus save memory.
I don't know what exactly your code is doing, but maybe setting this argument back to False allows your code to run?
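
I.e. something like this, where the model and optimizer are placeholders for whatever you are using:

```python
import torch

model = torch.nn.Linear(10, 1)  # placeholder for your model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# keep the pre-2.0 behavior: zero out the .grad tensors in place
# instead of setting the .grad attributes to None
optimizer.zero_grad(set_to_none=False)
```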