I am trying to visualize a CNN (VGG-16) by optimizing a random input image so as to maximize the activation of a given channel in a given layer. Every certain number of iterations, I also want to upsample the image. The first time the upsampling happens, this error is raised:
```
RuntimeError: Function CudnnConvolutionBackward returned an invalid gradient at index 0 - got [1, 3, 67, 67] but expected shape compatible with [1, 3, 56, 56]
```
Here 56x56 is the original image size and 67x67 is the size after upscaling.
Here is my code:
- the `forward` method of `model` returns a dictionary with a layer's index as the key and that layer's output as the value; the layers in the output are the ReLU layers (see the sketch after this list)
- `n_channels` is the number of channels from each layer we wish to visualize
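For reference, the wrapper looks roughly like this (a minimal sketch; the class name `VGGActivations` and the exact bookkeeping are simplified stand-ins for my actual code):

```python
import torch.nn as nn
from torchvision import models

class VGGActivations(nn.Module):
    """Wraps VGG-16 and returns {layer_index: activation} for every ReLU."""
    def __init__(self):
        super().__init__()
        self.features = models.vgg16(pretrained=True).features.eval().cuda()

    def forward(self, x):
        out = {}
        for idx, layer in enumerate(self.features):
            x = layer(x)
            if isinstance(layer, nn.ReLU):
                out[idx] = x  # key = layer index, value = that ReLU's output
        return out

model = VGGActivations()
```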
```python
import math

import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms

# ImageNet normalization statistics (assumed), shaped for broadcasting
mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1).cuda()
std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1).cuda()

layers = [...]  # indices of the ReLU layers to visualize
learning_rate = [.01]
n_channels = 1
iterations = 300
afterEvery = 50
start = afterEvery
upto = iterations
img_size = 56
random_low = 150
random_high = 180
upscaleFactor = 1.2
upscale = nn.Upsample(scale_factor=upscaleFactor, mode='bicubic')

for rate in learning_rate:
    for layer in layers:
        for ch_i in range(n_channels):
            # random image with pixel values in [random_low, random_high]
            image = (((random_high - random_low) * torch.rand(1, 3, img_size, img_size)
                      + random_low) / 255).cuda()
            img = (image - mean) / std
            img.requires_grad = True
            optimizer = optim.Adam([img], lr=rate)

            out = model(img)  # model is the VGG-16 wrapper described above
            # pick a random channel of this layer's activation map
            channel = torch.randint(out[layer].size(1), (1,))
            plt.figure(figsize=(30, 70))

            for i in range(iterations):
                if i != 0:
                    out = model(img)
                kernel = out[layer][0, channel]
                loss = -kernel.mean()

                # if the chosen channel is dead, keep sampling until loss != 0
                if loss == 0 and i == 0:
                    print('Finding non-zero loss kernel')
                    while loss == 0:
                        channel = torch.randint(out[layer].size(1), (1,))
                        kernel = out[layer][0, channel]
                        loss = -kernel.mean()
                        print('Tried channel index - ', channel.item(), ' Loss - ', loss.item())

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

                if i % afterEvery == 0 or i == iterations - 1:
                    print('Learning rate : ', rate, '\tLayer : ', layer,
                          '\tChannel : ', channel.item(), '\titeration : ', i,
                          ' \tLoss : ', -loss.item())
                    print(img.size())
                    plt.subplot(7, 3, math.ceil(i / afterEvery) + 1)
                    plt.imshow(transforms.ToPILImage()(
                        (img.detach() * std + mean).squeeze(0).cpu()))

                    if i >= start and loss != 0 and i <= upto:
                        # replace the image data with the upsampled version
                        img.data = upscale(img).data
                        print(img.size())
                        kernel = kernel.detach()
                        for j in out:
                            out[j] = out[j].detach()

            plt.show()
```
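The failure does not seem specific to VGG-16. The following minimal sketch (my own reduction, with a single `Conv2d` standing in for the network) appears to hit the same error on the second `backward()`; on CPU the backward function name differs, but the shapes match my error:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.rand(1, 3, 56, 56, requires_grad=True)
conv = nn.Conv2d(3, 8, 3)
optimizer = torch.optim.Adam([x], lr=0.01)

# one optimization step at 56x56
loss = conv(x).mean()
loss.backward()
optimizer.step()

# swap in upsampled data behind autograd's back, as in the loop above
x.data = F.interpolate(x.data, scale_factor=1.2, mode='bicubic')

# second step at 67x67 -> RuntimeError: ... got [1, 3, 67, 67]
# but expected shape compatible with [1, 3, 56, 56]
optimizer.zero_grad()
loss = conv(x).mean()
loss.backward()
```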
I don’t know why it is still expecting the gradient to be of shape [1, 3, 56, 56]. Any help would be appreciated.
Thanks in advance