Add random layer output as part of final loss

Hi
I have a loss function that looks something like this:

loss = logits + lambda * get_value_of_random_fc_neuron(neuron_index)

I use forward hooks to get the random neuron's value in get_value_of_random_fc_neuron().

The issue is that the gradients are the same whether

  1. loss = logits or
  2. loss = logits + lambda * get_value_of_random_fc_neuron(neuron_index)

This is because the neuron value is a tensor without a grad_fn assigned. How do I make sure the gradient is calculated for the full loss?
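
For example (a minimal made-up layer, just to show the effect), adding a term without grad_fn leaves the gradients untouched:

import torch
import torch.nn as nn

lin = nn.Linear(4, 2)
x = torch.randn(1, 4, requires_grad=True)

lin(x).sum().backward()
grad_plain = x.grad.clone()

x.grad = None
extra = lin(x).detach().view(-1)[0]        # a value without grad_fn
(lin(x).sum() + 0.5 * extra).backward()
print(torch.equal(grad_plain, x.grad))     # True -> the extra term changed nothing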

Thanks in advance

Could you post the code you are using to select the neuron?
This dummy example seems to work:

import torch
import torch.nn as nn

lin = nn.Linear(2, 2, bias=False)
x = torch.randn(1, 2)
out = lin(x)
loss = out.mean()
loss.backward()
print(lin.weight.grad)

lin.zero_grad()
out = lin(x)
loss = out.mean() + lin.weight[0, 0]  # extra term adds 1. to the gradient of weight[0, 0]
loss.backward()
print(lin.weight.grad)

I haven’t tested it on more complicated models, so let me know what the difference between your code and this example is.

Hi @ptrblck

Sorry for the confusion. I'm looking at model activations, not the model weights.
Here's a sample of what I'm trying to do.

Case 1

import random
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # init model layers, e.g. self.conv1 and self.fc1
    def forward(self, x):
        x = self.conv1(x)
        x = self.fc1(x)
        return x


def pick_random_neuron(layer_list, index_list):
    # naive approach: layer_list holds layer names, index_list the number of neurons per layer
    i = random.randrange(len(layer_list))
    return layer_list[i], random.randrange(index_list[i])

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook


model = MyModel()
model.eval()
input.requires_grad = True  # I want to modify the input using its gradients

for name, layer in model.named_modules():
    handle = layer.register_forward_hook(get_activation(name))

output = model(input)
layer_name, index = pick_random_neuron(layer_list, index_list)

# get the activation value for the picked neuron (activations flattened to a 1d vector)
value = activation[layer_name].view(-1)[index]
some_loss = criterion(output, target)  # e.g. criterion = nn.CrossEntropyLoss()
loss = some_loss + value
model.zero_grad()
loss.backward()

data_grad = input.grad.data

Since value is taken from the forward hook, it has no grad_fn, so it plays no role in the loss and gradient calculation used to modify the data. It basically just adds a constant scalar to the loss function.
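
Here is a small self-contained check (toy linear layer, just for illustration) showing the effect of the detached activation:

import torch
import torch.nn as nn

toy = nn.Linear(3, 3)
activation = {}

def get_activation(name):
    def hook(module, inp, out):
        activation[name] = out.detach()   # same detach() as above
    return hook

toy.register_forward_hook(get_activation('toy'))

inp = torch.randn(1, 3, requires_grad=True)
out = toy(inp)

value = activation['toy'].view(-1)[0]
print(value.grad_fn)        # None -> not connected to the graph

loss = out.sum() + value
loss.backward()
print(inp.grad)             # same as the gradient of out.sum() alone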

I can solve this problem by changing the model's forward to output all layers.

Case 2


class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # init model layers
    def forward(self, x):
        x1 = self.conv1(x)
        x2 = self.fc1(x1)
        return x1, x2

model = MyModel()
x1, x2 = model(input)
layer_name, index = pick_random_neuron(layer_list, index_list)
if layer_name == 'conv1':
    value = x1.view(-1)[index]
elif layer_name == 'fc1':
    value = x2.view(-1)[index]

Case 2 works, but is there a possibility to solve this using forward hooks, since they are more flexible for larger models?

Thanks for the update!
Your forward hook detaches the output.
Could you remove the detach() call and try it again?
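
Something like this (toy model just for illustration) should keep the activation attached to the graph so it contributes to input.grad:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
activation = {}

def get_activation(name):
    def hook(module, inp, out):
        activation[name] = out   # no detach() -> grad_fn is kept
    return hook

for name, layer in model.named_modules():
    if name:                     # skip the top-level container
        layer.register_forward_hook(get_activation(name))

inp = torch.randn(1, 4, requires_grad=True)
out = model(inp)

value = activation['0'].view(-1)[1]   # some neuron of the first layer
loss = out.sum() + value
loss.backward()
print(inp.grad)   # now includes the contribution of the picked neuron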

Hey @ptrblck. Thanks a lot. That worked.