GPU out of memory during forward pass in inceptionV3

I am trying to train an inceptionV3 model with a batch size of 256. The whole dataset does not exceed 5MB and the model is around 45MB, so I have no problem loading the model and the batches of data.
The error I am getting looks like this:

The code crashes in the first epoch, on the first batch, during the first forward pass. Am I doing something wrong, or is this expected behavior? The forward pass takes about 12GB on its own.

This is my code:

(Ignore the CPU memory usage; it is not reported correctly.)


The OOM might be expected, since the forward activations, which have to be kept around to compute the gradients, can take up the majority of the memory, as described here.
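To get a feel for the numbers, here is a rough back-of-the-envelope sketch in plain Python. It assumes float32 activations, a 299x299 input (the usual Inception v3 resolution, which your post does not state), and the usual layout of the first three stem convolutions; the helper names are made up for illustration.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def activation_bytes(batch, channels, height, width, bytes_per_element=4):
    """Bytes needed to store one activation tensor (float32 by default)."""
    return batch * channels * height * width * bytes_per_element


BATCH = 256        # batch size from the question
INPUT_SIZE = 299   # assumption: standard Inception v3 input resolution

# (out_channels, kernel, stride, padding) for the first three stem convolutions,
# assumed to follow the usual Inception v3 layout.
STEM = [(32, 3, 2, 0), (32, 3, 1, 0), (64, 3, 1, 1)]

size = INPUT_SIZE
stem_total = 0
for channels, kernel, stride, padding in STEM:
    size = conv_out_size(size, kernel, stride, padding)
    nbytes = activation_bytes(BATCH, channels, size, size)
    stem_total += nbytes
    print(f"{channels:>3} x {size} x {size}: {nbytes / 2**20:.0f} MiB")

print(f"stem conv outputs alone: {stem_total / 2**30:.2f} GiB")
```

This counts only the raw convolution outputs of three layers. The batch norm and ReLU outputs that follow each convolution, plus all the later blocks, add to it, so the small dataset and model size on disk say little about the peak memory of a forward pass at batch size 256.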

Thanks, I appreciate your help!