How to minimize the reserved GPU memory?

Thank you for your reply. Yes, I tried torch.cuda.empty_cache(), but according to another thread where someone asked about it, calling it can slow down training (please see here). Where would you recommend inserting it?
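
To make the question concrete, here is a minimal sketch of one possible placement: calling torch.cuda.empty_cache() once per epoch rather than every iteration, so that the cost of re-allocating from the driver is paid rarely. The names train, train_step, cuda_ready and reserved_mib are hypothetical helpers made up for this illustration, not part of PyTorch, and whether once per epoch is the right frequency is an assumption that would need to be measured. Note that empty_cache() only releases cached blocks that no tensor is using; it does not shrink the memory held by live tensors.

```python
try:
    import torch
except ImportError:  # lets the sketch run on a machine without PyTorch
    torch = None


def cuda_ready():
    """True only if PyTorch is installed and a CUDA device is usable."""
    return torch is not None and torch.cuda.is_available()


def reserved_mib():
    """Memory held by PyTorch's caching allocator, in MiB (0.0 without CUDA)."""
    return torch.cuda.memory_reserved() / 2**20 if cuda_ready() else 0.0


def train(num_epochs, steps_per_epoch, train_step):
    """Hypothetical loop: run train_step every iteration and reach one
    cache-release point at the end of each epoch."""
    release_points = 0
    for epoch in range(num_epochs):
        for step in range(steps_per_epoch):
            train_step(epoch, step)  # forward/backward/optimizer step goes here
        if cuda_ready():
            # Assumed placement: once per epoch, not inside the inner loop.
            torch.cuda.empty_cache()  # hand unused cached blocks back to the driver
        release_points += 1
        print(f"epoch {epoch}: reserved {reserved_mib():.1f} MiB")
    return release_points
```

Comparing the printed reserved value with and without the empty_cache() call, together with the time per epoch, should show whether the call is worth its cost in a given setup.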