I have prototyped a convolutional autoencoder with two distinct sets of weights for the encoder (with parameters w_f) and for the decoder (w_b). I have naturally used nn.Conv2d and nn.ConvTranspose2d to build the encoder and the decoder, respectively. The context of the study is, on the one hand, to learn w_f so that it minimizes a loss function defined at the output, and on the other hand, to learn w_b so that it matches w_f^T. The forward pass therefore reads as a convolution followed by a transposed convolution. So far, everything works well.
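For concreteness, here is a minimal sketch of the prototype (channel counts, kernel size, and padding are placeholders, not the actual values I use):

```python
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self, in_ch=3, hid_ch=16, k=3):
        super().__init__()
        # Encoder (weights w_f) and decoder (weights w_b) are separate modules
        self.enc = nn.Conv2d(in_ch, hid_ch, k, padding=1, bias=False)
        self.dec = nn.ConvTranspose2d(hid_ch, in_ch, k, padding=1, bias=False)

    def forward(self, x):
        # Convolution followed by a transposed convolution
        return self.dec(self.enc(x))
```

Note that self.enc.weight and self.dec.weight have the same shape here, so the auxiliary objective simply drives self.dec.weight (w_b) towards self.enc.weight (w_f); the transpose pairing is implicit in ConvTranspose2d.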
Now I want to accelerate this code using the cuDNN-accelerated C++ functions cudnn_convolution and cudnn_convolution_backward_input (GitHub - jordan-g/PyTorch-cuDNN-Convolution: PyTorch extension enabling direct access to cuDNN-accelerated C++ convolution functions). I am writing my own conv layer (inheriting from torch.autograd.Function) and defining its forward and backward methods by hand. In the forward method, we have a cudnn_convolution operation (parametrized by w_f) followed by a cudnn_convolution_backward_input operation (parametrized by w_b).
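Here is a sketch of the structure I have in mind. So that the snippet runs without the extension, I use torch.nn.grad.conv2d_input / torch.nn.grad.conv2d_weight as mathematically equivalent stand-ins for the extension's cudnn_convolution_backward_input / cudnn_convolution_backward_weight; the real code calls the extension's wrappers instead, and all shapes are illustrative:

```python
import torch
import torch.nn.functional as F

class ConvAutoencoderFn(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, w_f, w_b, stride, padding):
        # Encoder: plain convolution with w_f
        # (cudnn_convolution in the accelerated version)
        y = F.conv2d(x, w_f, stride=stride, padding=padding)
        # Decoder: the "backward_input" of a convolution with w_b,
        # i.e. a transposed convolution
        # (cudnn_convolution_backward_input in the accelerated version)
        z = torch.nn.grad.conv2d_input(x.shape, w_b, y, stride=stride, padding=padding)
        ctx.save_for_backward(x, y, w_f, w_b)
        ctx.stride, ctx.padding = stride, padding
        return z

    @staticmethod
    def backward(ctx, grad_z):
        x, y, w_f, w_b = ctx.saved_tensors
        s, p = ctx.stride, ctx.padding
        # Backprop through the decoder w.r.t. its input y:
        # the adjoint of a transposed convolution is a plain convolution
        grad_y = F.conv2d(grad_z, w_b, stride=s, padding=p)
        # Standard convolution gradients for the encoder
        grad_x = torch.nn.grad.conv2d_input(x.shape, w_f, grad_y, stride=s, padding=p)
        grad_w_f = torch.nn.grad.conv2d_weight(x, w_f.shape, grad_y, stride=s, padding=p)
        # Placeholder: this gradient is precisely my question
        # (my attempt with cudnn_convolution_backward_weight did not work)
        grad_w_b = None
        return grad_x, grad_w_f, grad_w_b, None, None
```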
In the backward method, in order to compute the gradient of the loss with respect to w_b, I therefore need to backpropagate through cudnn_convolution_backward_input. I thought I could simply use cudnn_convolution_backward_weight for this, but it does not work. What is the right way to compute the gradient with respect to w_b? I can provide the code of a concrete toy problem reproducing this issue if necessary. Many thanks for reading!
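In the meantime, here is a sketch of the toy check I use: the reference gradient for w_b is obtained by plain autograd through F.conv_transpose2d, and the custom Function sketched above should reproduce it (tensor sizes are illustrative; with the placeholder backward it currently returns no gradient for w_b):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 3, 8, 8, dtype=torch.double)
w_f = torch.randn(16, 3, 3, 3, dtype=torch.double, requires_grad=True)
w_b = torch.randn(16, 3, 3, 3, dtype=torch.double, requires_grad=True)

# Reference: plain autograd through conv2d + conv_transpose2d
y = F.conv2d(x, w_f, stride=1, padding=1)
z = F.conv_transpose2d(y, w_b, stride=1, padding=1)
z.sum().backward()
print("reference grad w_b norm:", w_b.grad.norm().item())

# Custom path: should yield the same w_b gradient, but does not on my side
w_f2 = w_f.detach().clone().requires_grad_(True)
w_b2 = w_b.detach().clone().requires_grad_(True)
z2 = ConvAutoencoderFn.apply(x, w_f2, w_b2, 1, 1)
z2.sum().backward()
print("custom grad w_b:", None if w_b2.grad is None else w_b2.grad.norm().item())
```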