How to more aggressively release GPU memory?

TBH, I think loading the model iteratively should be worse, since each iteration creates another model, and holding several models at once should run your GPU out of memory and crash…
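
For what it's worth, here is a minimal sketch of loading models one at a time while dropping each one before the next is created. It assumes PyTorch, which the thread does not actually name, and `DummyModel`, `release_gpu_memory`, and `run_iteratively` are hypothetical names made up for illustration. The weak references are only there so you can check that each model really became unreachable.

```python
import gc
import weakref

try:
    import torch  # assumption: PyTorch; the thread does not name a framework
except ImportError:
    torch = None


class DummyModel:
    """Hypothetical stand-in for a real model that would hold GPU tensors."""

    def __init__(self, name):
        self.name = name


def release_gpu_memory():
    """Collect unreachable Python objects, then release cached CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        # Frees only unused cached memory; tensors that are still
        # referenced somewhere stay allocated.
        torch.cuda.empty_cache()


def run_iteratively(names, load_model=DummyModel):
    """Load one model at a time, dropping each before loading the next."""
    refs = []
    for name in names:
        model = load_model(name)
        refs.append(weakref.ref(model))  # lets us verify the model was freed
        # ... use the model here ...
        del model  # drop the only strong reference
        release_gpu_memory()
    return refs


if __name__ == "__main__":
    refs = run_iteratively(["a", "b", "c"])
    print(all(r() is None for r in refs))
```

If any other variable (a list of outputs, an optimizer, a stored exception traceback) still points at the model or its tensors, `del` plus `empty_cache()` will not free that memory, which is probably the usual reason this pattern seems not to work.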