Memory leak in DrQ-v2

Hi!

I’m new to reinforcement learning and am trying to reproduce the results of DrQ-v2.

However, I ran into a memory leak when training DrQ-v2 on tasks that need 2,000,000+ training frames, i.e. the ones DrQ-v2 categorizes as medium or hard.

After ruling out other factors (the environment and its wrappers), I found that the leak comes from the way DrQ-v2 saves and loads data. DrQ-v2 writes the replay buffer to disk, loads it through an IterableDataset, and then samples batches through a DataLoader. I’m not sure whether this frequent loading by multiple workers is what causes the memory leak.
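To make the suspicion concrete, here is a minimal stdlib-only sketch of how one could check whether a loader keeps more episodes in memory than intended. This is not DrQ-v2's code: `EpisodeCache`, `traced_growth`, and the episode sizes are hypothetical names and numbers I made up for illustration. It only compares a cache that evicts old episodes with one that effectively never does.

```python
import tracemalloc
from collections import OrderedDict


class EpisodeCache:
    """Hypothetical stand-in for a per-worker episode store (not DrQ-v2's class)."""

    def __init__(self, max_episodes):
        self.max_episodes = max_episodes
        self._episodes = OrderedDict()

    def add(self, name, episode):
        self._episodes[name] = episode
        # Evict the oldest episodes once the limit is exceeded.
        # Without this step the cache grows with every episode loaded.
        while len(self._episodes) > self.max_episodes:
            self._episodes.popitem(last=False)

    def __len__(self):
        return len(self._episodes)


def traced_growth(cache, num_episodes, episode_bytes=100_000):
    """Return the bytes of Python-level memory still held after loading episodes."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for i in range(num_episodes):
        # A bytearray stands in for the arrays of one stored episode.
        cache.add(f"episode_{i}", bytearray(episode_bytes))
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


bounded = traced_growth(EpisodeCache(max_episodes=10), num_episodes=200)
unbounded = traced_growth(EpisodeCache(max_episodes=10**9), num_episodes=200)
print(f"bounded cache:   {bounded / 1e6:.1f} MB retained")
print(f"unbounded cache: {unbounded / 1e6:.1f} MB retained")
```

Note that tracemalloc only sees Python-level allocations in the process where it is started, so with a multi-worker DataLoader a check like this would have to run inside each worker process rather than in the main training loop.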

Has anyone encountered the same problem? I hope this starts a discussion and that someone can help. I’d be glad to collaborate on tracking it down.