RuntimeError: Function CudnnConvolutionBackward returned an invalid gradient at index 0 - got [1, 3, 134, 134] but expected shape compatible with [1, 3, 112, 112]

r_khanna · May 21, 2020, 8:20am

I am trying to visualize CNN(Vgg-16) by optimizing a random input image so as to maximize the activation of a given channel in a given layer. But after every certain number of iterations, i want to upsample the image as well. After the first time upsampling happens, this error is presented -

RuntimeError: Function CudnnConvolutionBackward returned an invalid gradient at index 0 - got [1, 3, 67, 67] but expected shape compatible with [1, 3, 56, 56]

Here 56x56 is the original image size and 67x67 is size after upscaling.

Here is my code :
Note :

the forward method of model returns a dictionary with a layer’s index as key and layer’s output as the value. The layers in output are the relu layers.
n_channels is the number of channels from each layer we wish to vizualize

layers = [29]
learning_rate = [.01]
n_channels = 1
iterations = 300
afterEvery = 50
start = afterEvery
upto = iterations
img_size = 56
random_low = 150
random_high = 180
upscaleFactor = 1.2
upscale = nn.Upsample(scale_factor=upscaleFactor,mode='bicubic')

for rate in learning_rate:
    for layer in layers:
        for ch_i in range(n_channels):
            image = (((random_high - random_low)*torch.rand(1,3,img_size,img_size) + random_low)/255).cuda()
            img = (image - mean)/std
            img.requires_grad = True
            optimizer = optim.Adam([img],lr=rate)
            out = model(img)
            channel = torch.randint(out[layer].size()[1],(1,))
            plt.figure(figsize=(30,70))

            for i in range(iterations):
                if(i!=0) :
                    out = model(img)
                kernel = out[layer][0][channel]

                loss = -kernel.mean()
                
                if loss == 0 and i==0:
                    print('Finding non-zero loss kernel')
                    while loss == 0:
                        channel = torch.randint(out[layer].size()[1],(1,))
                        kernel = out[layer][0][channel]
                        loss = -kernel.mean()
                        print('Tried channel index - ',channel.item(),' Loss - ',loss.item())

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

                if i%afterEvery == 0 or i==iterations-1:
                    print('Learning rate : ',rate,'\tLayer : ',layer,'\tChannel : ',channel.item(),'\titeration : ',i,' \tLoss : ',-loss.item())
                    print(img.size()[2])
                    plt.subplot(7,3,math.ceil(i/afterEvery)+1)
                    plt.imshow(transforms.ToPILImage()((img*std + mean).cpu()[0]))
                    if i >= start and loss != 0 and i <= upto:
                        img.data=upscale(img).data
                        print(img.size())
                        
                kernel=kernel.detach()
                for j in out:
                    out[j]=out[j].detach()
                    
            plt.show()

I don’t know why is it still expecting gradient to be of shape [1, 3, 56, 56]. Any help would be appreciated.
Thanks in advance

ptrblck · May 21, 2020, 9:45am

You shouldn’t use the .data attribute as it might yield these kind of unwanted side effects.
Try to remove the .data usage and check, if Autograd raises any errors.

r_khanna · May 21, 2020, 11:13am

In that case, if I do img = upscale(img) , then the loss doesn’t change after the image is upscaled. img doesn’t remain a leaf node anymore, I check this with img.is_leaf and it returned False. I also checked img.grad and it returned None. I used img.data = upscale(img).data to overcome this hurdle.

Can you please suggest how I may implement it?

albanD · May 21, 2020, 2:59pm

Hi,

As a general rule, .data is never the solution you’re looking for Even though it can fix your issue, it is very likely to break many other things

Here you just want the python variable “img” to point to a new leaf that is built based on the upscaled image:

img = upscale(img).detach() # Detach from history to make it a leaf
img.requires_grad_() # Make it require gradients again

r_khanna · May 23, 2020, 11:34am

Thank you @albanD. That worked