Gradient of activation arrays

I would like to compute the directional derivative of a concept using the PyTorch TCAV implementation (tcav/ in rakhimovv/tcav on GitHub).

In order to do this, I wrote the following code:

    tcav = {}
    for ind, (img, label) in enumerate(loader):
        img =, dtype=torch.float)
        img.requires_grad_()  # autograd needs the input in the graph
        output = model(img)
        layer_activation = activation[L]  # activation array of layer L
        # layer_activation is not a scalar, so grad_outputs must be supplied
        gradients = torch.autograd.grad(outputs=layer_activation, inputs=img,
                                        grad_outputs=torch.ones_like(layer_activation),
                                        create_graph=True, retain_graph=True,
                                        only_inputs=True)[0]
        grads = normalize(gradients.cpu().detach().numpy().ravel())
    return grads
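The `activation[L]` lookup above presumably comes from a forward hook that stores layer outputs in a dict. A minimal sketch of that assumed setup (the model, layer name, and shapes here are made up for illustration):

    import torch
    import torch.nn as nn

    # Hypothetical setup: store each hooked layer's output under its name,
    # keeping the graph attached so autograd.grad can later reach it.
    activation = {}

    def save_activation(name):
        def hook(module, inp, out):
            activation[name] = out  # do NOT detach here
        return hook

    model = nn.Sequential(nn.Conv2d(3, 4, 3), nn.ReLU(), nn.Conv2d(4, 8, 3))
    model[1].register_forward_hook(save_activation("L"))

    img = torch.randn(1, 3, 16, 16, requires_grad=True)
    _ = model(img)
    layer_activation = activation["L"]
    print(layer_activation.shape)  # activation-space shape, not image shape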

Afterwards I need to take the dot product of the two vectors, as in the code from Kim's TCAV implementation:

    def get_direction_dir_sign(mymodel, act, cav, concept, class_id):
        """Get the sign of directional derivative.

        Args:
            mymodel: a model class instance
            act: activations of one bottleneck to get gradient with respect to.
            cav: an instance of cav
            concept: one concept
            class_id: index of the class of interest (target) in logit layer.

        Returns:
            sign of the directional derivative
        """
        # Grad points in the direction which DECREASES probability of class
        grad = np.reshape(mymodel.get_gradient(act, [class_id], cav.bottleneck), -1)
        dot_prod =, cav.get_direction(concept))
        return dot_prod < 0

However, when I take the dot product, the two vectors are not the same size: one has the size of my input image, while the other has the size of the activation array for that specific layer. What am I doing wrong?
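For reference, `` requires both 1-D vectors to have the same length; with shapes like the ones described above (hypothetical sizes here: a flattened 3×16×16 image gradient versus a flattened 8×14×14 activation-space CAV) it raises a `ValueError`, which is exactly the symptom:

    import numpy as np

    # Hypothetical sizes illustrating the mismatch: a gradient taken w.r.t.
    # the input image vs. a CAV direction living in activation space.
    grad_wrt_image = np.zeros(3 * 16 * 16)  # length 768
    cav_direction = np.zeros(8 * 14 * 14)   # length 1568

    try:, cav_direction)
    except ValueError as e:
        print("shape mismatch:", e)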