The losses are accumulated into `loss` in the first loop, and each of them keeps its complete computation graph alive.
It’s thus expected that your memory usage would grow and your GPU might run out of memory.
You could have a look at this post and see if another approach would be suitable for your use case.
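As one possible alternative (a minimal sketch, not necessarily what the linked post suggests): if the summed loss is only needed for logging, you could accumulate `loss.item()` instead of the tensor, and call `backward()` in each iteration so that each graph is freed right away while the gradients still add up in `.grad`. The model, criterion, optimizer, and data below are hypothetical placeholders, since I don't know your actual setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model and data, used only for illustration.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
batches = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(5)]

running_loss = 0.0  # plain Python float, holds no computation graph
optimizer.zero_grad()
for x, y in batches:
    loss = criterion(model(x), y)
    # backward() frees this iteration's graph; gradients accumulate in .grad
    loss.backward()
    # .item() returns a Python number, so no graph is kept alive
    running_loss += loss.item()
optimizer.step()  # single update using the accumulated gradients

print(f"accumulated loss: {running_loss:.4f}")
```

The accumulated gradients match what you would get from calling `backward()` once on the summed loss, but only one graph is held in memory at a time.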
PS: you can post code snippets by wrapping them into three backticks ```, which would make debugging easier.