PyTorch memory leak reference cycle in for loop

I cannot reproduce the issue on a 3090; after adding print(torch.cuda.memory_allocated() / 1024**2) to the for loop I get a constant allocation:

4.78515625
4.78515625
4.78515625
...
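For reference, this is the kind of minimal loop I used to check the allocation. The model, input shapes, and optimizer are placeholders and not the code from your post, so treat it only as a sketch of the check itself:

import torch
import torch.nn as nn

device = "cuda"
# placeholder model and optimizer, not the ones from the original post
model = nn.Linear(1024, 1024).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

for step in range(100):
    x = torch.randn(64, 1024, device=device)
    out = model(x)
    loss = out.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # report allocated memory in MB; a leak would show this value growing every iteration
    print(torch.cuda.memory_allocated() / 1024**2)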

Is this post also related to the one you created yesterday?
If so, could you create a GitHub issue for the potential memory leak on MPS?
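In case it helps for the issue, the MPS backend exposes its own allocator queries, so you could log the equivalent values there (assuming a recent PyTorch build that ships the torch.mps module):

import torch

# on an Apple Silicon machine with the MPS backend available
if torch.backends.mps.is_available():
    # current memory occupied by tensors on the MPS device, in MB
    print(torch.mps.current_allocated_memory() / 1024**2)
    # total memory requested from the driver by the MPS allocator, in MB
    print(torch.mps.driver_allocated_memory() / 1024**2)

Printing these inside the loop, the same way as the CUDA snippet above, would show whether the allocation grows per iteration and would be useful output to attach to the GitHub issue.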