What algorithm does pytorch use without the use of backends.cudnn?

Shubhankar · January 16, 2020, 5:06pm

What algorithm does PyTorch use if torch.backends.cudnn.benchmark = False is set?

albanD · January 16, 2020, 5:11pm

Hi,

It uses our own implementation.
You can find the CUDA one here, for example.

Shubhankar · January 16, 2020, 5:29pm

@albanD By the way, when I set torch.backends.cudnn.benchmark = False, I still get 3 calls to the FFT algorithm (1 of the 4 calls), as shown in the following snippet from the cuDNN logs. That is far fewer than the 39 I get with torch.backends.cudnn.benchmark = True. What could be causing it?

I! CuDNN (v7601) function cudnnConvolutionForward() called:
i!     handle: type=cudnnHandle_t; streamId=(nil) (defaultStream);
i!     alpha: type=CUDNN_DATA_FLOAT; val=1.000000;
i!     xDesc: type=cudnnTensorDescriptor_t:
i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!         nbDims: type=int; val=4;
i!         dimA: type=int; val=[18432,6,14,14];
i!         strideA: type=int; val=[1176,196,14,1];
i!     xData: location=dev; addr=0x200154c00000;
i!     wDesc: type=cudnnFilterDescriptor_t:
i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!         vect: type=int; val=0;
i!         nbDims: type=int; val=4;
i!         dimA: type=int; val=[16,6,5,5];
i!         format: type=cudnnTensorFormat_t; val=CUDNN_TENSOR_NCHW (0);
i!     wData: location=dev; addr=0x200118400a00;
i!     convDesc: type=cudnnConvolutionDescriptor_t:
i!         mode: type=cudnnConvolutionMode_t; val=CUDNN_CROSS_CORRELATION (1);
i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!         mathType: type=cudnnMathType_t; val=CUDNN_DEFAULT_MATH (0);
i!         reorderType: type=int; val=0;
i!         arrayLength: type=int; val=2;
i!         padA: type=int; val=[0,0];
i!         strideA: type=int; val=[1,1];
i!         dilationA: type=int; val=[1,1];
i!         groupCount: type=int; val=1;
i!     algo: type=cudnnConvolutionFwdAlgo_t; val=CUDNN_CONVOLUTION_FWD_ALGO_FFT (4);
i!     workSpace: location=dev; addr=0x2001c0000000;
i!     workSpaceSizeInBytes: type=size_t; val=679486848;
i!     beta: type=CUDNN_DATA_FLOAT; val=0.000000;
i!     yDesc: type=cudnnTensorDescriptor_t:
i!         dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i!         nbDims: type=int; val=4;
i!         dimA: type=int; val=[18432,16,10,10];
i!         strideA: type=int; val=[1600,100,10,1];
i!     yData: location=dev; addr=0x2000a3800000;
i! Time: 2020-01-16T12:23:07.839722 (0d+0h+0m+25s since start)
i! Process=86375; Thread=86375; GPU=0; Handle=0x13e4302b0; StreamId=(nil) (defaultStream).

albanD · January 16, 2020, 5:31pm

Do you have a small code sample that reproduces this?

cc @ptrblck

Shubhankar · January 16, 2020, 5:55pm

I am using the PyTorch tutorial at https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#training-an-image-classifier with and without torch.backends.cudnn.benchmark, and with these environment variables set:

export CUDNN_LOGINFO_DBG=1
export CUDNN_LOGDEST_DBG=log.txt
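
With logging directed to log.txt as above, the per-algorithm call counts (such as the 3 versus 39 FFT calls mentioned earlier) could be tallied with a short script. This is only a sketch: the helper name count_fwd_algos is hypothetical, and it assumes every forward convolution call logs an "algo:" line in the same format as the snippet posted in this thread.

```python
import re
from collections import Counter

# Matches the "algo:" line of a cudnnConvolutionForward() log entry,
# assuming the format shown in the log snippet in this thread.
ALGO_RE = re.compile(r"algo: type=cudnnConvolutionFwdAlgo_t; val=(\w+)")


def count_fwd_algos(log_text):
    """Return a Counter mapping forward algorithm name -> number of logged calls."""
    return Counter(ALGO_RE.findall(log_text))


# Tiny stand-in for the real log so the example runs without a GPU.
sample = """\
I! CuDNN (v7601) function cudnnConvolutionForward() called:
i!     algo: type=cudnnConvolutionFwdAlgo_t; val=CUDNN_CONVOLUTION_FWD_ALGO_FFT (4);
I! CuDNN (v7601) function cudnnConvolutionForward() called:
i!     algo: type=cudnnConvolutionFwdAlgo_t; val=CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM (0);
"""

counts = count_fwd_algos(sample)
for name, n in counts.most_common():
    print(name, n)
```

For a real run, the same function can be applied to the file contents, for example count_fwd_algos(open("log.txt").read()).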

ptrblck · January 16, 2020, 7:33pm

cuDNN can still be used if torch.backends.cudnn.enabled = True.
If you don't want to use cuDNN, set this flag to False to use the native PyTorch methods.

When cudnn.benchmark is set to True, the first iterations will get a slowdown, as some internal benchmarking is done to get the fastest kernels for your current workload, which would explain the additional function calls you are seeing.
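
To make the difference between the two flags concrete, here is a minimal sketch. The layer and input sizes are illustrative (the channel and kernel sizes mirror the logged convolution, the batch size of 4 is arbitrary), and the script falls back to the CPU when no GPU is available. With enabled = False no cuDNN convolution calls should appear in the cuDNN log; setting only benchmark = False leaves cuDNN active, just without the autotuning step.

```python
import torch
import torch.nn as nn

# Turn cuDNN off entirely so convolutions use PyTorch's native kernels.
torch.backends.cudnn.enabled = False
# benchmark only controls autotuning; on its own it does not disable cuDNN.
torch.backends.cudnn.benchmark = False

device = "cuda" if torch.cuda.is_available() else "cpu"

# Same channel and kernel sizes as the convolution in the log above.
conv = nn.Conv2d(6, 16, kernel_size=5).to(device)
x = torch.randn(4, 6, 14, 14, device=device)

y = conv(x)
print(y.shape)  # torch.Size([4, 16, 10, 10])
```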

Shubhankar · January 16, 2020, 9:35pm

I get that, but what I observed was that the cuDNN logs still had some cuDNN algorithm calls even though I had set torch.backends.cudnn.benchmark = False, as shown in the log snippet above.