Enabling cuDNN slows down training process

Hi! I am trying to train a model with PyTorch, and I read that enabling cuDNN should speed up training. However, training became significantly slower instead. Can anyone help me with this issue? I ran some small tests and got this result:

Enable Cudnn
Conv:  0.5449938774108887
Conv backward:  0.1148219108581543
---------------------------
Disable Cudnn
Conv:  0.01213216781616211
Conv backward:  0.13735365867614746
---------------------------

Here is the code snippet that I used for testing:

import time

import torch


def run():
    in_c = 10
    out_c = 15
    kernel = 3
    padding = 1
    # Create the input directly on the GPU so backward does not also
    # time a gradient copy back to a CPU leaf tensor.
    inp = torch.rand(512, in_c, 128, 128, device='cuda', requires_grad=True)
    conv = torch.nn.Conv2d(in_c, out_c, kernel, padding=padding, bias=False).cuda()

    # Wait for pending CUDA work before starting each timer.
    torch.cuda.synchronize()
    start = time.time()
    out = conv(inp)
    torch.cuda.synchronize()
    print("Conv: ", time.time() - start)

    start = time.time()
    out.sum().backward()
    torch.cuda.synchronize()
    print("Conv backward: ", time.time() - start)
    print('---------------------------')


if __name__ == '__main__':
    torch.backends.cudnn.enabled = True
    print('Enable Cudnn')
    run()

    torch.backends.cudnn.enabled = False
    print('Disable Cudnn')
    run()

I cannot reproduce the issue when using warmup iterations and averaging over multiple profiling iterations: without a warmup, the first call also measures startup costs such as cuDNN's algorithm selection, so the single-iteration numbers are misleading.
I would also recommend using torch.utils.benchmark to profile CUDA workloads, since it adds warmup iterations, adds the needed synchronizations, and executes the workload until a time threshold is met.
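
Something like this minimal sketch would work for the manual approach (the iteration counts here are arbitrary):

import time

import torch

conv = torch.nn.Conv2d(10, 15, 3, padding=1, bias=False).cuda()
inp = torch.rand(512, 10, 128, 128, device='cuda')

# Warmup: let cuDNN select its algorithms and CUDA finish lazy init.
for _ in range(10):
    out = conv(inp)
torch.cuda.synchronize()

# Report the mean over several timed iterations.
n_iters = 50
start = time.time()
for _ in range(n_iters):
    out = conv(inp)
torch.cuda.synchronize()
print("Conv: ", (time.time() - start) / n_iters)
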
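
A minimal sketch using torch.utils.benchmark (the min_run_time value is an arbitrary choice here):

import torch
import torch.utils.benchmark as benchmark

conv = torch.nn.Conv2d(10, 15, 3, padding=1, bias=False).cuda()
inp = torch.rand(512, 10, 128, 128, device='cuda')

# Timer inserts CUDA synchronizations and warmup for you.
timer = benchmark.Timer(
    stmt="conv(inp)",
    globals={"conv": conv, "inp": inp},
)
# Run the workload repeatedly until the minimum runtime is reached
# and print the resulting measurement summary.
print(timer.blocked_autorange(min_run_time=1.0))
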

Results on a 3090 using the latest nightly:

Enable Cudnn
Conv:  0.35583066940307617
Conv backward:  14.73327088356018
---------------------------
Disable Cudnn
Conv:  1.5492074489593506
Conv backward:  16.762986421585083
---------------------------