I'm running an off-policy RL algorithm with DeepMind's PySC2, and I'm finding I quickly run out of GPU memory. My PC only has 4 GB of VRAM, so if this is a bad plan from the start, just let me know.
Essentially, the run loop of the program goes like this (a toy sketch follows the list):
Actor and critic are initialised on the GPU, then each step:
- observe the environment
- process the observations into CUDA tensors (e.g. minimap_features, a 1 x 4 x 64 x 64 tensor)
- actor.forward(processed observations)
- store a bunch of information in the replay buffer (for example, minimap_features.cpu())
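For concreteness, the structure looks roughly like this. Everything here is a toy stand-in (the actor, the fake observation, the buffer); the real code is more involved:

```python
import torch
import torch.nn as nn

# toy stand-ins for the real actor and replay buffer
actor = nn.Sequential(
    nn.Flatten(),
    nn.Linear(4 * 64 * 64, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
).cuda()
replay_buffer = []

for step in range(1000):
    # observe environment (random data standing in for the minimap obs)
    raw_minimap = torch.rand(4, 64, 64)

    # process observations into CUDA tensors
    minimap_features = raw_minimap.unsqueeze(0).cuda()  # 1 x 4 x 64 x 64

    # forward pass through the actor
    policy_logits = actor(minimap_features)

    # store experience in the replay buffer, moved back to the CPU
    # (the real buffer also holds actions, rewards, etc.)
    replay_buffer.append(minimap_features.cpu())

    # this is how I'm watching the memory climb
    if step % 100 == 0:
        print(torch.cuda.memory_allocated(), torch.cuda.max_memory_allocated())
```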
The problem I'm having appears to be that GPU memory is never deallocated: torch.cuda.memory_allocated() and torch.cuda.max_memory_allocated() stay (almost) equal to each other and increase linearly, step after step. My guess is that something I'm doing is holding references to previously computed tensors that should have been freed (a toy sketch of the kind of pattern I mean is below). Any advice? (If you want to know more about a specific part of the code, I can show it.)
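Here's a toy example of the pattern I suspect (not my actual code): if a tensor that ends up in the buffer is an output of actor.forward(), calling .cpu() on it without .detach() returns a tensor that still carries a grad_fn, and that reference keeps the whole graph of CUDA activations behind it alive:

```python
import torch
import torch.nn as nn

actor = nn.Linear(16, 4).cuda()
x = torch.rand(1, 16, device="cuda")
out = actor(x)

leaky = out.cpu()          # still attached to the graph via grad_fn
safe = out.detach().cpu()  # detached copy; the CUDA graph can be freed

print(leaky.grad_fn is None, safe.grad_fn is None)  # False True
```

Is something like that (or a missing torch.no_grad() around the acting forward pass) a plausible cause of the linear growth, or should I be looking elsewhere?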