Do I permute the dimensions correctly? (It doesn't throw an error, but on the test set it predicts the same value over and over again.)
Above, batch_size=64.
I am trying to input a 64x3x100x100 batch of pictures.
In that case, try to overfit a small data sample (e.g. just 10 samples) and play around with the hyperparameters.
Once your model is able to overfit the small data, you could try to scale up the experiment and use more data samples.
If the model isn’t able to overfit, there might be some other bugs in the code, e.g. forgetting to zero out the gradients.
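As a rough illustration of the overfitting check described above, here is a minimal sketch. The random data, the tiny model, and the hyperparameters are all placeholders I made up (they are not from your code); swap in 10 samples from your real dataset and your own model. Note the `optimizer.zero_grad()` call at the start of each iteration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in data: 10 random "images" in NCHW layout
# (batch, channels, height, width) with random binary labels.
x = torch.randn(10, 3, 100, 100)
y = torch.randint(0, 2, (10,))

# Tiny placeholder model, only meant to show the training loop.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=5, stride=5),  # output: 8 x 20 x 20
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 20 * 20, 2),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
for step in range(100):
    optimizer.zero_grad()          # clear gradients left over from the previous step
    loss = criterion(model(x), y)  # forward pass on the same 10 samples every time
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the loss on these few samples doesn't drop towards zero with your real model and data, I would look for a bug in the data loading, the dimension handling, or the training loop before touching the hyperparameters any further.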