RuntimeError: Function CudnnConvolutionBackward returned an invalid gradient at index 0 - got [1, 3, 134, 134] but expected shape compatible with [1, 3, 112, 112]

I am trying to visualize a CNN (VGG-16) by optimizing a random input image so as to maximize the activation of a given channel in a given layer. After every certain number of iterations, I also want to upsample the image. The first time the upsampling happens, this error is raised:

RuntimeError: Function CudnnConvolutionBackward returned an invalid gradient at index 0 - got [1, 3, 67, 67] but expected shape compatible with [1, 3, 56, 56]

Here 56x56 is the original image size and 67x67 is the size after one upscaling step (56 × 1.2, floored, is 67).

Here is my code.
Note:

  1. The forward method of the model returns a dictionary with a layer's index as the key and that layer's output as the value. The layers in the output are the ReLU layers.
  2. n_channels is the number of channels from each layer we wish to visualize.
layers = [29]
learning_rate = [.01]
n_channels = 1
iterations = 300
afterEvery = 50
start = afterEvery
upto = iterations
img_size = 56
random_low = 150
random_high = 180
upscaleFactor = 1.2
upscale = nn.Upsample(scale_factor=upscaleFactor,mode='bicubic')

for rate in learning_rate:
    for layer in layers:
        for ch_i in range(n_channels):
            image = (((random_high - random_low)*torch.rand(1,3,img_size,img_size) + random_low)/255).cuda()
            img = (image - mean)/std
            img.requires_grad = True
            optimizer = optim.Adam([img],lr=rate)
            out = model(img)
            channel = torch.randint(out[layer].size()[1],(1,))

            for i in range(iterations):
                if i != 0:
                    out = model(img)
                kernel = out[layer][0][channel]

                loss = -kernel.mean()
                if loss == 0 and i==0:
                    print('Finding non-zero loss kernel')
                    while loss == 0:
                        channel = torch.randint(out[layer].size()[1],(1,))
                        kernel = out[layer][0][channel]
                        loss = -kernel.mean()
                        print('Tried channel index - ',channel.item(),' Loss - ',loss.item())


                if i % afterEvery == 0 or i == iterations-1:
                    print('Learning rate : ',rate,'\tLayer : ',layer,'\tChannel : ',channel.item(),'\titeration : ',i,' \tLoss : ',-loss.item())
                    plt.imshow(transforms.ToPILImage()((img*std + mean).cpu()[0]))
                    if i >= start and loss != 0 and i <= upto:
                        img = upscale(img).data   # upscaling step; this is where the error appears

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

I don’t know why it is still expecting the gradient to be of shape [1, 3, 56, 56]. Any help would be appreciated.
Thanks in advance :wink:

You shouldn’t use the .data attribute, as it might yield these kinds of unwanted side effects.
Try to remove the .data usage and check if Autograd raises any errors.

In that case, if I do img = upscale(img), the loss doesn’t change after the image is upscaled. img no longer remains a leaf node; I checked this with img.is_leaf and it returned False. I also checked img.grad and it returned None. I used img = upscale(img).data to overcome this hurdle.
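For reference, the non-leaf behaviour can be reproduced in isolation (a minimal sketch, independent of the VGG code above):

```python
import torch
import torch.nn as nn

upscale = nn.Upsample(scale_factor=1.2, mode='bicubic')

img = torch.rand(1, 3, 56, 56, requires_grad=True)
print(img.is_leaf)   # True: created directly by the user

img = upscale(img)   # result of an operation -> part of the autograd graph
print(img.is_leaf)   # False: no longer a leaf
print(img.grad)      # None (gradients are not retained on non-leaf tensors)
```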

Can you please suggest how I may implement it?


As a general rule, .data is never the solution you’re looking for :smiley: Even though it can fix your issue, it is very likely to break many other things :wink:

Here you just want the Python variable img to point to a new leaf that is built from the upscaled image:

img = upscale(img).detach() # Detach from history to make it a leaf
img.requires_grad_() # Make it require gradients again
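Putting the fix together as a runnable sketch (standalone, with a dummy loss in place of the real activation loss): since img is rebound to a new tensor, the optimizer must also be recreated so it tracks the new leaf rather than the old 56x56 tensor.

```python
import torch
import torch.nn as nn
import torch.optim as optim

upscale = nn.Upsample(scale_factor=1.2, mode='bicubic')

img = torch.rand(1, 3, 56, 56, requires_grad=True)
optimizer = optim.Adam([img], lr=0.01)

# A few optimization steps with a stand-in loss
for _ in range(3):
    optimizer.zero_grad()
    loss = -img.mean()        # dummy loss; the real code maximizes a channel activation
    loss.backward()
    optimizer.step()

# Upscale and turn the result back into a fresh leaf
img = upscale(img).detach()   # detach from history to make it a leaf
img.requires_grad_()          # make it require gradients again
optimizer = optim.Adam([img], lr=0.01)  # rebuild the optimizer for the new tensor

print(img.is_leaf, img.shape)  # True torch.Size([1, 3, 67, 67])
```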

Thank you @albanD. That worked :smile: