Garbage collection for MPS

I was implementing auto-regressive speculative decoding in PyTorch with a Hugging Face Transformers model, and I found that MPS device memory keeps building up unless I explicitly free every new tensor allocation in my code with `del`. Is this expected behaviour? What is a clean way to avoid calling `del` on each variable? Are there any flags I need to enable for more aggressive garbage collection?
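For context on the `del`-per-variable pattern I'm trying to avoid: as far as I understand, CPython frees objects by reference counting, so allocations made inside a function are released as soon as the function returns, without any explicit `del`. A minimal sketch of that idea (using a hypothetical `FakeTensor` stand-in rather than a real device tensor, so this runs anywhere):

```python
import weakref


class FakeTensor:
    """Hypothetical stand-in for a device tensor; only refcounting matters here."""
    pass


def generate_step():
    # Allocations made inside a function go out of scope on return,
    # so CPython's refcounting frees them without an explicit `del`.
    t = FakeTensor()
    return weakref.ref(t)


ref = generate_step()
# The object was collected when the function returned:
assert ref() is None
print("freed without del:", ref() is None)
```

So in principle, moving each decoding step into its own function should let tensors be reclaimed automatically, which makes me wonder why memory still accumulates in my case (perhaps the autograd graph or the MPS caching allocator is keeping references alive?).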