In TF, the Conv2D filter shape is [filter_height, filter_width, in_channels, out_channels], while in PyTorch it is (out_channels, in_channels, kernel_size[0], kernel_size[1]).
So I did the following in TF:
and I transferred the weights to PyTorch like this:
It turns out that the DQN in PyTorch does not work as well as it does in TF!
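In case someone else gets here with the same issue: I think the problem is using reshape before transpose. A reshape only reinterprets the existing element order under a new shape; it does not move the axes, so the values end up in the wrong filter positions.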
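To illustrate the difference, here is a minimal NumPy sketch. The filter sizes and the `w_tf` array are made up for the demonstration and are not taken from the code in the question. It shows that reshaping and transposing give arrays of the same shape but with different contents:

```python
import numpy as np

# Hypothetical filter dimensions, chosen only for illustration.
kh, kw, c_in, c_out = 3, 3, 2, 4

# Stand-in for a TF kernel laid out as
# [filter_height, filter_width, in_channels, out_channels].
w_tf = np.arange(kh * kw * c_in * c_out, dtype=np.float32).reshape(kh, kw, c_in, c_out)

# Transpose moves the axes: result is (out_channels, in_channels, kh, kw).
w_transposed = np.transpose(w_tf, (3, 2, 0, 1))

# Reshape only reinterprets the flat buffer under the target shape.
w_reshaped = w_tf.reshape(c_out, c_in, kh, kw)

same_shape = w_transposed.shape == w_reshaped.shape
same_values = np.array_equal(w_transposed, w_reshaped)
print("same shape:", same_shape)
print("same values:", same_values)
```

Because both results have the shape PyTorch expects, the reshaped weights load without any error, which is probably why the mistake is easy to miss.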
I have loaded TF weights into PyTorch by permuting the weight tensor, and it worked fine.
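A rough sketch of what the permute approach could look like follows. The random `w_tf` array stands in for a kernel exported from TF, and the layer sizes are assumptions for the example, not the poster's actual network:

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical filter dimensions, chosen only for illustration.
kh, kw, c_in, c_out = 3, 3, 2, 4

# Stand-in for an exported TF kernel:
# [filter_height, filter_width, in_channels, out_channels].
rng = np.random.default_rng(0)
w_tf = rng.standard_normal((kh, kw, c_in, c_out)).astype(np.float32)

conv = nn.Conv2d(c_in, c_out, kernel_size=(kh, kw), bias=False)

# Permute the axes to (out_channels, in_channels, kh, kw) and copy in place.
w_pt = torch.from_numpy(w_tf).permute(3, 2, 0, 1).contiguous()
with torch.no_grad():
    conv.weight.copy_(w_pt)

# Sanity check on a single patch: compare against a direct sum over the TF layout.
x = torch.randn(1, c_in, kh, kw)
out = conv(x).reshape(c_out)
expected = torch.einsum("chw,hwco->o", x[0], torch.from_numpy(w_tf))
print(torch.allclose(out, expected, atol=1e-5))
```

Comparing the layer output against a reference computed from the original layout, as in the last lines, is a cheap way to confirm that the axes ended up in the right place.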