Thanks a lot. So is it wrong to pass a tensor of shape
torch.Size([256, 64, 1, 1])
to softmax(1)?
I am not getting any error, but my accuracy has dropped significantly and softmax does not seem to be working as expected. So I thought maybe I should change the argument inside softmax from 1 to something else, but I am not sure what it should be.
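To make the question concrete, here is a minimal sketch of what softmax over dim 1 does on a tensor of that shape. The random input and the variable names are only assumptions for illustration, not my actual model. It suggests that dim=1 normalizes across the 64 channels for each sample, and that flattening the trailing 1x1 dimensions first gives the same values.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the output of an adaptive average pool: [batch, channels, 1, 1]
x = torch.randn(256, 64, 1, 1)

# Softmax over dim=1 normalizes across the 64 channels of each sample
probs = F.softmax(x, dim=1)
channel_sums = probs.sum(dim=1)  # shape [256, 1, 1], each entry close to 1

# Flattening to [256, 64] first and applying softmax over dim=1 is equivalent
flat = torch.flatten(x, 1)
probs_flat = F.softmax(flat, dim=1)

# Softmax over a size-1 dimension (dim=2 or dim=3 here) returns all ones,
# so those dims would not be a useful replacement for dim=1
ones = F.softmax(x, dim=2)

print(probs.shape, probs_flat.shape)
print(torch.allclose(probs.flatten(1), probs_flat))
```

So dim=1 itself looks reasonable for this shape. If the loss further down is nn.CrossEntropyLoss (an assumption, since I have not shown my loss here), that loss already applies log-softmax internally, so an extra softmax before it could be what hurts the accuracy rather than the dim argument.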
There is a bit more explanation here: How to stack adaptiveavgpool2D and softmax?