# Activation Maximization for TCAV in PyTorch

I am trying to calculate the TCAV vectors for my model, for which I need to do something like the following:

```python
def compute_tcav():
    losses = [
        (ActivationMaximization(model.layers[layer_idx], filter_indices), -1)
    ]
    opt = Optimizer(input_tensor, losses, wrt_tensor=wrt_tensor, norm_grads=False)
```

I am using the [Activation-Maximization](https://github.com/Nguyen-Hoa/Activation-Maximization) package, a Python implementation of activation maximization with PyTorch.

In this code, I need to get the gradient of the activation tensor of a particular layer with respect to the target unit, using the code below:

```python
# Propagate the image through the network,
# then access the activation of the target layer
network(input)
layer_out = layer_activation[layer_name]

# Compute gradients w.r.t. the target unit,
# then access the gradient of the input (image) w.r.t. the target unit (neuron)
layer_out[unit].backward(retain_graph=True)
```
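For context, this snippet assumes that `layer_activation` is a dictionary populated by forward hooks during the forward pass. A minimal sketch of how that setup might look (the `save_activation` helper is my own illustrative name, not the repository's API; `network` and `layer_name` are the names from the snippet above):

```python
layer_activation = {}

def save_activation(name):
    # Forward hook that stores the layer's output under the given name
    def hook(module, inputs, output):
        layer_activation[name] = output
    return hook

# Register the hook on the target layer, looked up by its module name
dict(network.named_modules())[layer_name].register_forward_hook(
    save_activation(layer_name)
)
```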

However, my `layer_out` has a size of `torch.Size([1, 2, 97, 97, 97])`, so `layer_out[unit]` is not a scalar and I cannot call `layer_out[unit].backward(retain_graph=True)`.

How can I solve this?

`tensor.backward()` will implicitly use a gradient of a scalar `1`, if `tensor` is a scalar tensor.
If `tensor` contains more than a single element, you would either need to pass the gradient explicitly to `backward` (e.g. via `tensor.backward(gradient=torch.ones_like(tensor))`), or reduce the tensor first, e.g. via `tensor.mean().backward()`.
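A minimal, self-contained example of both options (the shapes are arbitrary, chosen only for illustration):

```python
import torch

x = torch.randn(1, 2, 4, 4, requires_grad=True)
y = x * 2  # non-scalar tensor, so y.backward() alone would fail

# Option 1: pass an explicit gradient of the same shape as y
y.backward(gradient=torch.ones_like(y), retain_graph=True)
print(x.grad.shape)  # torch.Size([1, 2, 4, 4])

# Option 2: reduce to a scalar first, then call backward()
x.grad = None  # clear the accumulated gradient
y.mean().backward()
print(x.grad.shape)  # torch.Size([1, 2, 4, 4])
```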

Okay! That makes sense! I have now changed the code to the following:

```python
tcav = {}
for ind, (img, label) in enumerate(loader):
    img = img.to(device, dtype=torch.float)

    output = model(img)

    layer_activation = activation[layer_names].cpu()

    loss = torch.mean(layer_activation)
    loss.backward(retain_graph=True)

    tcav[ind] = {}
```

This, I believe, still follows the activation maximisation code from the same repository (https://github.com/Nguyen-Hoa/Activation-Maximization).

However, when I run this, `img.grad` is `None`. I understand that it’s not a leaf tensor, but I’m not quite sure how to then compute the image gradient.

The `to()` operation is differentiable and thus creates a non-leaf tensor:

```python
img = torch.randn(1, 1, requires_grad=True)
print(img.is_leaf)
# True

img = img.to('cuda', dtype=torch.float)
print(img.is_leaf)
# False
```

Move the tensor to the `device` and `dtype` first, before setting the tensor’s `.requires_grad` attribute to `True`:

```python
img = torch.randn(1, 1)
img = img.to('cuda', dtype=torch.float)
img.requires_grad_()
print(img.is_leaf)
# True
```
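Applied to the loop above, that would look something like the following sketch (`model`, `activation`, `layer_names`, and `loader` are the names from the earlier snippet; storing the image gradient in `tcav` is my assumption about what the loop is meant to collect):

```python
tcav = {}
for ind, (img, label) in enumerate(loader):
    # Move to device/dtype first, then mark the moved tensor as requiring grad,
    # so it stays a leaf tensor and backward() populates img.grad
    img = img.to(device, dtype=torch.float)
    img.requires_grad_()

    output = model(img)

    layer_activation = activation[layer_names]

    loss = torch.mean(layer_activation)
    loss.backward(retain_graph=True)

    # img.grad now holds the gradient of the mean activation w.r.t. the image
    tcav[ind] = img.grad.detach().clone()
```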