Is it possible to interact with the cuDNN API from PyTorch? The following function returns the type of algorithm to be used, as defined by CUDNN_CONVOLUTION_FWD_PREFER_FASTEST:
If your input always has the same size, you should enable it all the time.
But it only influences which algorithm cuDNN uses while the flag is enabled, so setting it during training does not influence inference in any way.
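For context, here is a minimal sketch of where the flag is typically set (the layer and input shape are just placeholders, not from the original post):

```python
import torch
import torch.nn as nn

# Ask cuDNN to benchmark all available convolution algorithms for each
# new input shape once, then reuse the fastest one from its cache.
torch.backends.cudnn.benchmark = True

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
x = torch.randn(32, 3, 224, 224, device="cuda")

out = conv(x)  # first call with this shape pays the benchmarking cost
out = conv(x)  # same shape: the cached algorithm is reused
```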
@fmassa @ptrblck
Hello. I would like to ask a few questions about the behavior of torch.backends.cudnn.benchmark = True.
Does the mini-batch size matter? Many people say that benchmarking reuses the same cache if the image input size is the same, but I have not found a clear explanation of whether changing the batch size is OK.
How many caches can it manage? For example, I might have two types of input: 224x224 and 320x320. Would constantly switching between the two input sizes require additional benchmarking, or would there be two separate caches?
Yes, the batch size matters, as it is part of the ConvolutionParams that are stored here.
It's using a std::unordered_map keyed by the mentioned ConvolutionParams, so no additional benchmarking would be required for these two shapes once they have already been profiled.
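You can see the cache in action with a rough timing sketch (the shapes and layer are illustrative; exact timings depend on your hardware):

```python
import time
import torch
import torch.nn as nn

torch.backends.cudnn.benchmark = True
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()

for size in (224, 320, 224, 320):
    x = torch.randn(8, 3, size, size, device="cuda")
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    conv(x)
    torch.cuda.synchronize()
    # The first call per shape includes the cudnnFind profiling overhead;
    # the repeated 224 and 320 inputs should hit the cache and run fast.
    print(f"{size}x{size}: {time.perf_counter() - t0:.4f} s")
```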
Hello guys, I have a quick question about torch.backends.cudnn.benchmark = True.
When you say the input_size cannot change, does that apply to each convolution layer?
I have a UNet design using dense blocks. Since the input to each layer within a block is different, does that mean I cannot use torch.backends.cudnn.benchmark = True?
Is there any workaround for dense blocks so that I can use torch.backends.cudnn.benchmark = True?
The input shape can change, but each new input shape will rerun cudnnFind to find the fastest kernel for this shape (for all layers with a new input shape) and will add these kernels to a cache.
No, you can use it, but each new input shape will cause a one-time slowdown.
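If you know the set of input shapes in advance, one way to pay that one-time cost up front is a warm-up pass per shape before training. A sketch, where the warmup helper and the example shapes are hypothetical:

```python
import torch

torch.backends.cudnn.benchmark = True

def warmup(model, input_shapes, device="cuda"):
    # Hypothetical helper: run one forward pass per expected input shape
    # so the cudnnFind profiling happens before the actual training loop.
    with torch.no_grad():
        for shape in input_shapes:
            model(torch.randn(*shape, device=device))

# e.g. warmup(unet, [(8, 3, 224, 224), (8, 3, 320, 320)])
```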