I am trying to optimize the memory consumption of a model and profiled it using memory_profiler. It looks like calling module.to(cuda_device)
copies the parameters to GPU RAM but does not release the CPU RAM they occupied.
Is there a way to reclaim some or most of the CPU RAM that was originally allocated for loading/initialization once my modules have been moved to the GPU?
Some more info:
Line 214 uses about 2 GB to initialize my model.
From line 221 onwards I no longer need that CPU memory and I am trying to free it, but even a forced gc.collect() on line 224 didn't help. (A standalone sketch of the same pattern follows the profile below.)
Line #    Mem usage    Increment   Line Contents
================================================
   209     88.7 MiB     88.7 MiB   @profile
   210                             def __init__(self, exp: Experiment, model=None, lr=0.0001):
   211     88.7 MiB      0.0 MiB       self.exp = exp
   212     88.7 MiB      0.0 MiB       self.start_epoch = 0
   213     88.7 MiB      0.0 MiB       if model is None:
   214   2159.7 MiB   2071.0 MiB           model = Seq2Seq(**exp.get_model_args()).to(device)
   215   2159.7 MiB      0.0 MiB       last_check_pt, last_epoch = self.exp.get_last_saved_model()
   216   2159.7 MiB      0.0 MiB       if last_check_pt:
   217                                     log.info(f"Resuming training from epoch:{self.start_epoch}, model={last_check_pt}")
   218                                     self.start_epoch = last_epoch + 1
   219                                     model.load_state_dict(torch.load(last_check_pt))
   220   2159.7 MiB      0.0 MiB       log.info(f"Moving model to device = {device}")
   221   2159.7 MiB      0.0 MiB       self.model = model.to(device=device)
   222   2159.7 MiB      0.0 MiB       self.model.train()
   223   2159.7 MiB      0.0 MiB       del model  # this was on CPU, free that memory
   224   2159.7 MiB      0.0 MiB       gc.collect()  # should the GC clean up CPU buffers after moving to GPU?
   225   2159.7 MiB      0.0 MiB       self.optimizer = optim.Adam(self.model.parameters(), lr=lr)
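To show the pattern outside my codebase, here is a minimal self-contained sketch of what I mean. The model is a toy stand-in (a stack of Linear layers, sizes arbitrary), and rss_mb/psutil are only there to read the process RSS, which is the same figure memory_profiler tracks; none of these names are from my actual code.

import gc

import psutil
import torch
import torch.nn as nn

def rss_mb():
    # resident set size of this process, in MiB
    return psutil.Process().memory_info().rss / (1024 ** 2)

device = torch.device("cuda")
print(f"before init:       {rss_mb():.1f} MiB")

# toy stand-in for the real model: large Linear layers allocated on CPU first
model = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(10)])
print(f"after CPU init:    {rss_mb():.1f} MiB")

# move the parameters to the GPU, keep only the returned module, then force a GC pass
model = model.to(device)
gc.collect()
print(f"after .to(device): {rss_mb():.1f} MiB")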
Edit:
PyTorch 0.4.0 on linux-64