I am trying to train an InceptionV3 model with a batch size of 256. The whole dataset does not exceed 5MB and the model is around 45MB, so I have no problem loading the model and batches of data.
The error I am getting looks like this:
The code crashes in the first epoch, on the first batch, during the first forward pass. Am I doing something wrong, or is this expected behavior? The forward pass takes about 12GB on its own.
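For context, here is a rough back-of-the-envelope sketch of how activation memory grows with batch size, independent of how small the dataset or the weights are. It assumes 299x299 inputs and float32 activations, neither of which is stated above, and it uses the output shape of one early InceptionV3 stem convolution (147x147x64 under that assumption) as an illustrative layer. The helper name `activation_bytes` is hypothetical, and the figure only counts one layer's output, not the whole network.

```python
def activation_bytes(batch, height, width, channels, bytes_per_value=4):
    """Bytes needed to hold one layer's output tensor (float32 by default)."""
    return batch * height * width * channels * bytes_per_value


# Assumed shape: output of an early InceptionV3 stem conv for 299x299 inputs.
BATCH = 256
one_layer = activation_bytes(BATCH, 147, 147, 64)

print(f"one layer at batch {BATCH}: {one_layer / 2**30:.2f} GiB")
print(f"same layer at batch 32:  {activation_bytes(32, 147, 147, 64) / 2**30:.2f} GiB")
```

Under these assumptions a single layer's output already takes more than 1 GiB at batch size 256, and the network keeps many such tensors alive for the backward pass, so a forward pass in the range of several GB would not be surprising.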
This is my code:
(Ignore the CPU memory usage; it is not reported correctly.)
Thanks!