CrossEntropyLoss - Expected object of type torch.LongTensor

Thanks for the code!
There are some small differences between your models. Your TF/Keras models use 32 kernels in the first conv layer, while your PyTorch model uses 64. Just change the second argument (out_channels) of your first conv layer to 32 and update the input channels of the following layer to match.
Also, your reference models use an additional dropout layer before the last linear layer.
Could you change this and try to train your model again?
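
In case it helps, here is a rough sketch of what both changes could look like. I don't know your exact architecture, so the class name SmallCNN, the layer sizes, the dropout rate, and the 1x28x28 input shape are all assumptions on my side, not your actual model:

```python
import torch
import torch.nn as nn


class SmallCNN(nn.Module):
    """Hypothetical stand-in for the model in this thread (sizes are assumed)."""

    def __init__(self, num_classes=10):
        super().__init__()
        # The second argument of nn.Conv2d is out_channels: 32 kernels here.
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        # in_channels of the next conv has to match the 32 outputs above.
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.pool = nn.MaxPool2d(2)
        # 28x28 input -> 26x26 -> 24x24 -> pooled to 12x12, with 64 channels.
        self.fc1 = nn.Linear(64 * 12 * 12, 128)
        # Additional dropout placed directly before the last linear layer.
        self.drop = nn.Dropout(0.5)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = self.pool(torch.relu(self.conv2(x)))
        x = torch.flatten(x, 1)
        x = torch.relu(self.fc1(x))
        x = self.drop(x)
        return self.fc2(x)  # raw logits, as nn.CrossEntropyLoss expects


model = SmallCNN()
logits = model(torch.randn(4, 1, 28, 28))

# torch.randint returns int64 (LongTensor) class indices by default,
# which is the target type nn.CrossEntropyLoss expects.
targets = torch.randint(0, 10, (4,))
loss = nn.CrossEntropyLoss()(logits, targets)
```

The last linear layer returns raw logits, since nn.CrossEntropyLoss applies log_softmax internally.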

Also, by default PyTorch initializes the conv and linear weights with kaiming_uniform_ and draws the biases from a uniform distribution whose bound depends on fan_in.
If you want to get closer to Keras’ default initialization (Glorot/Xavier uniform weights, zero biases), you could use the following code:

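import torch.nn as nn

def weight_init(m):
    # Re-initialize only conv and linear layers; other modules are left untouched.
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        # Xavier/Glorot uniform weights; the ReLU gain scales the bound by sqrt(2)
        # (gain=1.0 would give the plain Glorot bound).
        nn.init.xavier_uniform_(m.weight, gain=nn.init.calculate_gain('relu'))
        if m.bias is not None:  # layers created with bias=False have no bias
            nn.init.zeros_(m.bias)

model.apply(weight_init)  # apply() calls weight_init on every submodule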
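
If you want to check that an init function like this did what you expect, you can compare the weights against the Glorot bound sqrt(6 / (fan_in + fan_out)). The snippet below is only a sketch: glorot_init and the small nn.Sequential are made up for illustration, and it uses gain=1.0, which as far as I know corresponds to Keras' default glorot_uniform:

```python
import math

import torch
import torch.nn as nn


def glorot_init(m, gain=1.0):
    # Hypothetical helper: gain=1.0 yields the plain Glorot/Xavier bound
    # sqrt(6 / (fan_in + fan_out)).
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight, gain=gain)
        if m.bias is not None:
            nn.init.zeros_(m.bias)


torch.manual_seed(0)
layers = nn.Sequential(
    nn.Conv2d(1, 32, 3),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 26 * 26, 10),  # assumes 1x28x28 inputs
)
layers.apply(glorot_init)

conv, fc = layers[0], layers[3]

# For a conv layer, fan_in = in_channels * kH * kW and fan_out = out_channels * kH * kW.
conv_limit = math.sqrt(6.0 / (1 * 3 * 3 + 32 * 3 * 3))
fc_limit = math.sqrt(6.0 / (fc.in_features + fc.out_features))

# Largest absolute weight per layer, to compare against the bounds above.
conv_max = conv.weight.abs().max().item()
fc_max = fc.weight.abs().max().item()
print(f"conv: max |w| = {conv_max:.4f}, limit = {conv_limit:.4f}")
print(f"fc:   max |w| = {fc_max:.4f}, limit = {fc_limit:.4f}")
```

The largest absolute weight in each layer should stay within its limit, and both bias tensors should be all zeros.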