Hello, I’m implementing Deep Q-learning and my code is slow due to the creation of Tensors from the replay buffer. Here’s how it goes:
I maintain a deque with a maximum size of 10,000 and sample a batch from it every time I want to do a backward pass. The following line is really slow:
curr_graphs = torch.Tensor(list(state(*zip(*xp_samples.curr_state)).graph))
which I decomposed to see what is actually taking the time:
zipped = zip(*xp_samples.curr_state)
new_s = state(*zipped)
listt = list(new_s.graph)
curr_graphs = torch.Tensor(listt)
It turns out the last line, i.e. the tensor creation, accounts for essentially all of the time. For context, xp_samples and curr_state are named tuples; in this snippet I unpack, zip, and unpack again to regroup the data by field name from curr_state.
My guess is that it has to gather the data through many small pointer dereferences to build the Tensor, and loses time moving things around. What would be the fastest way to create a tensor from data sampled out of a buffer that I maintain myself? Should I preallocate storage for the contents of the deque so that it is contiguous in memory? My feeling is that this alone won't speed things up.
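For what it's worth, here is a minimal sketch of the usual fix, assuming each graph is a fixed-size float array (the data below is synthetic, and the variable names just mirror the snippet above): stack the list into one contiguous NumPy array first, then wrap it with torch.from_numpy, which avoids the element-by-element copy that torch.Tensor does on a Python list.

```python
import numpy as np
import torch

# Synthetic stand-in for list(new_s.graph): a list of fixed-shape
# float32 arrays, one per sampled experience.
listt = [np.random.rand(8, 8).astype(np.float32) for _ in range(64)]

# Slow path: torch.Tensor(listt) iterates over the Python list and
# copies element by element.
# Fast path: stack into one contiguous ndarray, then wrap it without
# an extra copy.
curr_graphs = torch.from_numpy(np.stack(listt))
```

This only works when the graphs all share the same shape, so that np.stack can produce a single rectangular array.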
Here are the details of the deque, if that's relevant:
from collections import deque
import random

class ReplayBuffer:
    def __init__(self, maxlen):
        self.buffer = deque(maxlen=maxlen)

    def add(self, new_xp):
        self.buffer.append(new_xp)

    def sample(self, batch_size):
        # random.choices samples with replacement
        xps = random.choices(self.buffer, k=batch_size)
        return xp(*zip(*xps))  # xp is a namedtuple defined elsewhere
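To make the contiguity question concrete, here is one way a preallocated buffer could look. This is only a sketch under the assumption that each experience's graph is a fixed-shape float32 array; the class name and layout are illustrative, not taken from the original code. The point is that fancy indexing into one big ndarray yields the batch as a single contiguous array, ready for torch.from_numpy without any per-element Python loop.

```python
import numpy as np

class ArrayReplayBuffer:
    """Illustrative replay buffer with preallocated, contiguous storage.

    Assumes every stored graph has the same fixed shape; a real buffer
    would hold the other experience fields (action, reward, ...) in
    parallel arrays indexed the same way.
    """
    def __init__(self, maxlen, graph_shape):
        self.graphs = np.empty((maxlen,) + graph_shape, dtype=np.float32)
        self.maxlen = maxlen
        self.idx = 0   # next write position (ring buffer)
        self.size = 0  # number of valid entries so far

    def add(self, graph):
        self.graphs[self.idx] = graph
        self.idx = (self.idx + 1) % self.maxlen
        self.size = min(self.size + 1, self.maxlen)

    def sample(self, batch_size):
        # Sampling with replacement, like random.choices above.
        # Fancy indexing copies the selected rows into one new
        # contiguous array of shape (batch_size, *graph_shape).
        ids = np.random.randint(0, self.size, size=batch_size)
        return self.graphs[ids]
```

Whether this beats the deque depends on where the time actually goes: the contiguous layout itself doesn't make sampling free, but it replaces the list-of-objects-to-tensor conversion with a single vectorized copy.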