Fastest way to perform the validation step?

I am new to PyTorch.
I want to ask about the most efficient way to run the validation step. My training data is about 30 million samples and the batch size is 128. Every 2000 training steps I run a validation pass on about 15% of the data to check for early stopping. Training runs on the GPU, and currently so does validation. (I have to stick to this 15% fraction; I can't reduce it for an external reason.)

I notice that the validation step takes a lot of time, almost as much as the 2000 training steps: 2000 training steps take about 250 s, while validation takes about 160 s. Is this typical? It looks quite inefficient. I tried increasing the validation batch size, but I run out of CUDA memory. Should I do it on the CPU instead? Is there some general wisdom about this that I don't know about?
Thank you for your answers.

If you haven’t wrapped the validation step in a `with torch.no_grad()` guard to save memory, you should do it. :wink:
Besides that, there is no secret, and the timing might be expected depending on your model, data loading, etc.
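
For reference, here is a minimal sketch of what a validation pass under the `torch.no_grad()` guard could look like. The `validate` helper, the toy `nn.Linear` model, the random tensors, and the batch size of 512 are all placeholders made up for illustration, not anything from the original setup; swap in your own model, loss, and `DataLoader`.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def validate(model, loader, loss_fn, device):
    """Return the mean per-sample loss over `loader` without tracking gradients."""
    was_training = model.training
    model.eval()  # put dropout / batch norm layers into inference mode
    total_loss = torch.zeros((), device=device)
    n_samples = 0
    with torch.no_grad():  # no autograd graph is recorded inside this block
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            loss = loss_fn(model(inputs), targets)  # mean loss of this batch
            total_loss += loss * inputs.size(0)     # accumulate on the device
            n_samples += inputs.size(0)
    model.train(was_training)  # restore the previous train/eval mode
    return (total_loss / n_samples).item()


# Hypothetical stand-ins for the real model and validation set.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
model = nn.Linear(10, 1).to(device)
val_data = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))
val_loader = DataLoader(val_data, batch_size=512)

val_loss = validate(model, val_loader, nn.MSELoss(), device)
print(f"validation loss: {val_loss:.4f}")
```

Since no graph is stored for the backward pass inside the guard, a larger validation batch size than the training one will often fit in GPU memory, though how much larger depends on the model, so it is worth testing rather than assuming.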