Previously, I learned that when the input size is not fixed, we should set cudnn.benchmark=False for better speed.
My input size is not fixed. When I set cudnn.benchmark=True, it runs smoothly, but when I set it to False, it easily runs out of memory (OOM).
Why does this happen? When it is set to True, there are no OOM errors, which means that my data and my model fit into GPU memory. So what causes the OOMs when it is set to False?
I also guess that setting cudnn.benchmark to True does affect my program’s speed, so is there any way to avoid both the OOMs and benchmarking every time? (My batch size is 1.)
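
One idea I could try, sketched here only under assumptions: pad each input up to the next multiple of some size, so that only a handful of distinct shapes ever reach cuDNN. With cudnn.benchmark=True, benchmarking would then presumably happen once per bucket rather than once per new shape. The helper names `bucket_size` and `padding_needed` and the multiple of 32 are hypothetical, and whether padding is acceptable depends on the model.

```python
def bucket_size(height, width, multiple=32):
    """Round each spatial dimension up to the next multiple (hypothetical helper)."""
    padded_h = ((height + multiple - 1) // multiple) * multiple
    padded_w = ((width + multiple - 1) // multiple) * multiple
    return padded_h, padded_w


def padding_needed(height, width, multiple=32):
    """Return how many rows and columns of padding reach the bucketed size."""
    padded_h, padded_w = bucket_size(height, width, multiple)
    return padded_h - height, padded_w - width


# Made-up example shapes: several different raw sizes fall into one bucket.
raw_shapes = [(481, 640), (500, 633), (512, 640), (490, 610)]
bucketed = {bucket_size(h, w) for h, w in raw_shapes}
print(bucketed)
```

The padding amounts could then be passed to something like torch.nn.functional.pad before the forward pass. A larger multiple means fewer distinct shapes but more wasted computation per input.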