Grad-CAM for CNN+LSTM

Zrufy · July 10, 2020, 11:52am

I’m trying to create a Grad-CAM for my CNN+LSTM model.
I split the model into a first part and a second part and pass both to this function:

def GradCAM(img, c, features_fn, classifier_fn):
    feats = features_fn(img.cuda())  # [1, 2048, 7, 7]
    _, N, H, W = feats.size()
    out = features_fn(feats)
    c_score = out[0, c]  # output value of class c
    grads = torch.autograd.grad(c_score, feats)  # get gradient map (grads[0][0]): [2048, 7, 7]
    w = grads[0][0].mean(-1).mean(-1)  # GAP of grads: [2048]
    sal = torch.matmul(w, feats.view(N, H*W))  # feats.view(N, H*W) -> [2048, 49]
    sal = sal.view(H, W).cpu().detach().numpy()
    sal = np.maximum(sal, 0)
    return sal

But when I run the model I get this error:

RuntimeError: Given groups=1, weight of size 24 1 5 5, expected input[1, 512, 1, 26] to have 1 channels, but got 512 channels instead

ptrblck · July 12, 2020, 2:41am

Is your model working generally without splitting it?
The error points towards a shape mismatch in a conv layer. If the model is working in the original form, make sure to apply all functional API calls, such as flattening the activation tensors etc.
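To illustrate the point about functional API calls, here is a minimal sketch of splitting a model into two parts. `TinyNet` is a hypothetical stand-in, not the model from this thread; the names `features_fn` and `classifier_fn` are borrowed from the function posted above. The flatten lives only in `forward()`, so it has to be repeated by hand in the second part, otherwise the linear layer receives a 4D tensor and fails with a shape mismatch.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model (an assumption, not the thread's CNN+LSTM)."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 24, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.fc = nn.Linear(24 * 4 * 4, 3)

    def forward(self, x):
        x = self.conv(x)
        x = torch.flatten(x, 1)  # functional call, not a registered layer
        return self.fc(x)


torch.manual_seed(0)
model = TinyNet().eval()


def features_fn(x):
    # First part: the convolutional block only
    return model.conv(x)


def classifier_fn(feats):
    # Second part: repeat the flatten from forward() before the linear layer
    return model.fc(torch.flatten(feats, 1))


x = torch.randn(1, 1, 28, 28)
full_out = model(x)
split_out = classifier_fn(features_fn(x))

# The split pipeline should reproduce the unsplit model's output
same = torch.allclose(full_out, split_out)
print(same, tuple(split_out.shape))
```

Comparing the split output against the unsplit model like this is a cheap way to confirm nothing from `forward()` was dropped before debugging the Grad-CAM code itself.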

Zrufy · July 16, 2020, 12:14pm

For inference I don’t need to split the model. I wanted to use Grad-CAM to see where the convolutional part was focusing, so I wanted to take the first part of the model and look at the results.
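A minimal sketch of the Grad-CAM computation described above, run on a hypothetical toy CNN (`conv_part` and `head` are assumptions, not the CNN+LSTM from this thread). It follows the posted function with one change: the features are passed to `classifier_fn` instead of being fed back into `features_fn`, which is presumably what was intended. Feeding feature maps back into the first conv layer produces the same kind of channel mismatch as the reported error, so that may be the cause, although the thread does not confirm it.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model split into two parts (assumption for illustration)
conv_part = nn.Sequential(
    nn.Conv2d(1, 24, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.MaxPool2d(2),
)
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(24, 3))


def grad_cam(img, c, features_fn, classifier_fn):
    feats = features_fn(img)                        # [1, N, H, W]
    _, N, H, W = feats.size()
    out = classifier_fn(feats)                      # [1, num_classes]
    c_score = out[0, c]                             # score of class c
    grads = torch.autograd.grad(c_score, feats)[0]  # [1, N, H, W]
    w = grads[0].mean(dim=(-2, -1))                 # GAP of grads: [N]
    sal = torch.matmul(w, feats[0].reshape(N, H * W))  # weighted sum: [H*W]
    sal = sal.view(H, W).detach().cpu().numpy()
    return np.maximum(sal, 0)                       # ReLU on the map


img = torch.randn(1, 1, 28, 28)
sal = grad_cam(img, c=1, features_fn=conv_part, classifier_fn=head)
print(sal.shape)

# Passing the features through the conv part a second time fails, because
# the first conv layer expects 1 input channel but receives 24.
try:
    conv_part(conv_part(img))
    reused_features_fn_failed = False
except RuntimeError:
    reused_features_fn_failed = True
print(reused_features_fn_failed)
```

For a CNN+LSTM, `classifier_fn` would also need to reproduce whatever reshaping happens between the convolutional output and the LSTM input in the model's `forward()`.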