Hello!
As I understand it, `torch.backends.cudnn.benchmark` makes cuDNN benchmark multiple convolution algorithms the first time each convolution configuration is run (so typically during the first iterations of training) and then reuse the fastest one for the rest of the run. If I checkpoint my model and later resume it, cuDNN has to rerun the benchmark again at the start of the resumed run. Is there a way to save the benchmark results from the first run and load them when resuming training?
Theoretically, since training continues with the same model and the same input shapes, shouldn't the same convolution algorithms still be the fastest when resuming?
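
For context, here is a minimal sketch of the checkpoint/resume flow I mean (the toy conv model, shapes, and `checkpoint.pt` file name are just placeholders, assuming a CUDA GPU is available):

```python
import torch
import torch.nn as nn

torch.backends.cudnn.benchmark = True  # cuDNN times several conv algorithms on first use

# placeholder model/optimizer just to illustrate the flow
model = nn.Conv2d(3, 16, kernel_size=3).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# first forward pass triggers the algorithm benchmarking for this conv configuration
x = torch.randn(8, 3, 224, 224, device="cuda")
out = model(x)

# ... train for some epochs, then checkpoint ...
torch.save({"model": model.state_dict(),
            "optimizer": optimizer.state_dict()}, "checkpoint.pt")

# --- later, in a fresh process, resume training ---
torch.backends.cudnn.benchmark = True
state = torch.load("checkpoint.pt")
model.load_state_dict(state["model"])
optimizer.load_state_dict(state["optimizer"])

# the first forward pass of the resumed run re-runs the benchmark,
# even though the model and input shapes are identical to the earlier run
out = model(x)
```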