Hi!
I’m a rookie at reinforcement learning and am trying to reproduce the results of DrQ-v2. However, I ran into a memory leak when training DrQ-v2 for 2,000,000+ frames, i.e., on tasks categorized as medium or hard in DrQ-v2.
After ruling out other factors (e.g., the environment and wrappers), I found that the leak comes from the way DrQ-v2 saves and loads data: it writes the replay buffer to disk, loads it back through an IterableDataset, and then samples batches via a DataLoader. I’m not sure whether such frequent file I/O by multiple workers is what causes the memory leak.
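For context, the disk-backed sampling pattern looks roughly like this. This is a minimal sketch of the general idea, not the actual DrQ-v2 code; the file layout, array keys (`obs`, `action`, `reward`), and directory name are assumptions for illustration:

```python
import random
from pathlib import Path

import numpy as np
from torch.utils.data import DataLoader, IterableDataset


class ReplayBufferDataset(IterableDataset):
    """Streams transitions from episode files saved on disk.

    Sketch of the disk-backed replay pattern: episodes are stored as
    .npz files and re-read by each DataLoader worker. Key names and
    layout here are assumptions, not the real DrQ-v2 format.
    """

    def __init__(self, replay_dir):
        self.replay_dir = Path(replay_dir)

    def _load_episode(self, path):
        # Read the whole episode into memory; NpzFile is lazy, so
        # copy the arrays out while the file is still open.
        with path.open("rb") as f:
            data = np.load(f)
            return {k: data[k] for k in data.files}

    def __iter__(self):
        while True:
            files = sorted(self.replay_dir.glob("*.npz"))
            if not files:
                continue  # wait for the actor to write episodes
            episode = self._load_episode(random.choice(files))
            last = len(episode["obs"]) - 1
            idx = random.randint(0, last - 1)
            yield (
                episode["obs"][idx],
                episode["action"][idx + 1],
                episode["reward"][idx + 1],
                episode["obs"][idx + 1],
            )


# Usage: each worker opens and decodes files independently, so any
# per-worker caching of loaded episodes multiplies memory use.
# loader = DataLoader(ReplayBufferDataset("buffer/"),
#                     batch_size=256, num_workers=4)
```

One thing I noticed while isolating this: if each worker keeps loaded episodes in a cache (as the real implementation does for speed), that memory is duplicated per worker and only grows as training produces more episodes.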
Has anyone encountered the same problem? I hope this will start a discussion; any help would be appreciated, and I’d be glad to collaborate.