Leaky ReLU in cuDNN

Hi,

The pre-trained PyTorch DNN that I am following uses leaky ReLU as the activation function in its layers. I am building the inference network on my local machine using cuDNN, which doesn’t seem to support leaky ReLU as an activation function.

Please share any ideas or examples of how to use leaky ReLU with cuDNN.

Thanks!

You could either write a custom kernel (or take a look at what PyTorch is using internally) or create a feature request for cuDNN on their support board.
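
For reference, a minimal sketch of what such a custom kernel could look like is below (the kernel name, launch configuration, and the 0.01 slope are only illustrative; use whatever negative slope your model was trained with):

#include <cuda_runtime.h>

// Elementwise leaky ReLU: out[i] = in[i] if in[i] > 0, else negative_slope * in[i].
__global__ void leaky_relu_kernel(const float* in, float* out, int n, float negative_slope)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        float x = in[idx];
        out[idx] = x > 0.0f ? x : x * negative_slope;
    }
}

// Launch helper: applies leaky ReLU to a device buffer of n floats.
void launch_leaky_relu(const float* d_in, float* d_out, int n, cudaStream_t stream)
{
    const float negative_slope = 0.01f;  // illustrative default; match your model
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    leaky_relu_kernel<<<blocks, threads, 0, stream>>>(d_in, d_out, n, negative_slope);
}

If you launch it on the same CUDA stream that your cuDNN handle uses (set via cudnnSetStream), it is simply enqueued between the surrounding cuDNN calls and keeps their execution order.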

Thanks @ptrblck

I collected the log for the network to see which activation function PyTorch uses internally, but it doesn’t provide any information about the activation layer. However, I also verified that for another network that uses only ReLU activations, the activation details were available. So I’m not sure why the leaky ReLU activation details are hidden. Also, I am using Google Colab, so I’m not sure if that is causing any issue.

Regarding writing my own kernel to implement leaky ReLU: does it have any performance impact? Does cuDNN create any internal pipelines or buffers to speed up the process that introducing my own kernel would interfere with?

I’m not sure what kind of logging you are looking at, or where it shows the activation and where it doesn’t. Could you explain it a bit more?

I was thinking about reusing PyTorch’s internal kernel.
Since there is no leaky ReLU kernel in cudnn, there wouldn’t be any performance impact, as there wouldn’t be a cudnn baseline to compare against.

@ptrblck I am using cuDNN API logging to collect the log files. Previously, I implemented LeNet for handwritten digit classification in PyTorch, which uses ReLU in all its layers, and for that network I can see the activation function in the log.
Here is a snippet of the log file showing the ReLU activation function:

I! CuDNN (v7605) function cudnnActivationForward() called:
i! handle: type=cudnnHandle_t; streamId=0000000000000000 (defaultStream);
i! activationDesc: type=cudnnActivationDescriptor_t:
i! coef: type=double; val=0.000000;
i! mode: type=cudnnActivationMode_t; val=CUDNN_ACTIVATION_RELU (1);
i! reluNanOpt: type=cudnnNanPropagation_t; val=CUDNN_PROPAGATE_NAN (1);
i! alpha: type=CUDNN_DATA_FLOAT; val=1.000000;
i! srcDesc: type=cudnnTensorDescriptor_t:
i! dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i! nbDims: type=int; val=4;
i! dimA: type=int; val=[1,500,1,1];
i! strideA: type=int; val=[500,1,1,1];
i! srcData: location=dev; addr=000000070D060E00;
i! beta: type=CUDNN_DATA_FLOAT; val=0.000000;
i! destDesc: type=cudnnTensorDescriptor_t:
i! dataType: type=cudnnDataType_t; val=CUDNN_DATA_FLOAT (0);
i! nbDims: type=int; val=4;
i! dimA: type=int; val=[1,500,1,1];
i! strideA: type=int; val=[500,1,1,1];
i! destData: location=dev; addr=000000070D061600;
i! Time: 2021-03-30T10:41:47.544918 (0d+0h+0m+4s since start)
i! Process=40584; Thread=38684; GPU=0; Handle=0000022AB3F7E7E0; StreamId=0000000000000000 (defaultStream).
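
For reference, the host-side cuDNN calls behind an entry like this look roughly as follows (a sketch with error checking omitted; the [1,500,1,1] shape is taken from the log above):

#include <cudnn.h>

// Rough sketch: forward ReLU on a [1,500,1,1] float tensor via cuDNN.
void relu_forward_example(cudnnHandle_t handle, const float* d_in, float* d_out)
{
    cudnnTensorDescriptor_t desc;
    cudnnCreateTensorDescriptor(&desc);
    cudnnSetTensor4dDescriptor(desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT, 1, 500, 1, 1);

    cudnnActivationDescriptor_t act;
    cudnnCreateActivationDescriptor(&act);
    // The mode argument is the only place the activation type is selected;
    // cudnnActivationMode_t has no leaky-ReLU value, which is why leaky ReLU
    // cannot be expressed through cudnnActivationForward().
    cudnnSetActivationDescriptor(act, CUDNN_ACTIVATION_RELU, CUDNN_PROPAGATE_NAN, 0.0);

    const float alpha = 1.0f, beta = 0.0f;
    cudnnActivationForward(handle, act, &alpha, desc, d_in, &beta, desc, d_out);

    cudnnDestroyActivationDescriptor(act);
    cudnnDestroyTensorDescriptor(desc);
}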

However, I am currently working on a UNet architecture, which uses leaky ReLU in all its layers. Hence, to find out how PyTorch implements leaky ReLU given that cuDNN doesn’t support it, I collected the log files again using cuDNN API logging, but this time the log files don’t contain any information about the activation function.


Given that a leaky ReLU kernel implementation is simple, I was thinking about writing my own kernel. But you mentioned reusing PyTorch’s internal kernel. Can you please suggest or point to how to get access to PyTorch’s internal kernels?

Thanks!

It’s called from here using the TensorIterator.
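
In simplified form, the pattern in PyTorch’s CUDA activation code looks roughly like this (names and macros are approximate, so please check the actual source under aten/src/ATen/native/cuda/ rather than relying on this sketch):

#include <ATen/ATen.h>
#include <ATen/Dispatch.h>
#include <ATen/native/TensorIterator.h>
#include <ATen/native/cuda/Loops.cuh>

// Sketch of the TensorIterator + gpu_kernel pattern: the iterator handles
// dtype, broadcasting, and launch details, and the lambda is the per-element math.
void leaky_relu_kernel_sketch(at::TensorIterator& iter, double negval)
{
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(iter.dtype(), "leaky_relu_cuda", [&]() {
    const scalar_t slope = static_cast<scalar_t>(negval);
    at::native::gpu_kernel(iter, [slope] GPU_LAMBDA (scalar_t x) -> scalar_t {
      return x > scalar_t(0) ? x : x * slope;
    });
  });
}

If you just want to reuse the kernel from your own C++ inference code rather than reimplement it, linking against libtorch and calling at::leaky_relu(tensor, negative_slope) reaches the same implementation.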