I wanted to test multi-GPU training with a simple DQN algorithm for the CartPole environment, and it seems that either it is not possible or I am missing something. I run the code on a machine that has two GPUs. I have my code here:
The problem is that in each training step I need to obtain the target value, which is the output of a DataParallel-wrapped network, then multiply it by some elements of the batch (which is a tensor), and then use the result to compute the loss. When I multiply the target value by the tensor, I get this error:
RuntimeError: The size of tensor a (64) must match the size of tensor b (128) at non-singleton dimension 1
DataParallel has split the batch into two parts, so I have a target of shape 2*64 instead of 128. I can reshape the tensor to 2*64 and make it work, but that is not going to work if I have more GPUs, and reshaping would be hard and messy. I thought there should be a better way to do this.
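To make the shape issue concrete, here is a minimal sketch, not the code from the post. It assumes the 2*64 shape comes from the wrapped network returning a tensor whose batch dimension is not dim 0. DataParallel scatters inputs and concatenates replica outputs along dim 0, so a per-replica output of shape (1, 64) would be gathered as (2, 64). The class `QNet`, the layer sizes, and the names `rewards`, `not_done`, and `gamma` are hypothetical. If `forward` keeps the batch on dim 0 and the reduction over actions is done outside the wrapped module, the target should come back with 128 entries regardless of the number of GPUs.

```python
import torch
import torch.nn as nn


class QNet(nn.Module):
    """Hypothetical Q-network for CartPole: 4 state features, 2 actions."""

    def __init__(self, n_obs=4, n_actions=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_obs, 32), nn.ReLU(), nn.Linear(32, n_actions)
        )

    def forward(self, x):
        # Keep the batch on dim 0: output shape is (batch, n_actions).
        # DataParallel concatenates replica outputs along dim 0.
        return self.net(x)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Without any GPU, DataParallel simply calls the wrapped module directly.
target_net = nn.DataParallel(QNet().to(device))

batch_size, gamma = 128, 0.99  # assumed values for illustration
next_states = torch.randn(batch_size, 4, device=device)
rewards = torch.randn(batch_size, device=device)
not_done = torch.ones(batch_size, device=device)

with torch.no_grad():
    q_next = target_net(next_states)        # shape (128, 2)
    max_q_next = q_next.max(dim=1).values   # shape (128,)

# Element-wise TD target; every operand has shape (128,).
td_target = rewards + gamma * not_done * max_q_next
```

The reduction over actions is applied to the gathered output here, so no reshape that depends on the GPU count is needed.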
I appreciate any help or comment.