Can I use loss.item() for backward?

No offense, but that is quite a lot of messy code to dig through :smile: To the question itself: no — `loss.item()` returns a plain Python float that is detached from the autograd graph, so you have to call `backward()` on the loss tensor and use `.item()` only for logging. As for the gradients: yes, you can call `zero_grad()` on every optimizer, but I suspect it will be easier to just call `zero_grad()` on the model — that way you only have to do it once.
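
A minimal sketch of what I mean (the model, data, and learning rate here are just placeholders for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical tiny setup, not your actual model.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
y = torch.randn(8, 1)

model.zero_grad()                         # one call zeroes all parameter grads
loss = nn.functional.mse_loss(model(x), y)
loss.backward()                           # backward() needs the tensor, not a float
optimizer.step()

log_value = loss.item()                   # .item() is only for logging/printing
```

`model.zero_grad()` iterates over all parameters of the module, so it covers every parameter group regardless of how many optimizers you created.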