Add random layer output as part of final loss

blenderender · April 26, 2020, 9:49pm

Hi
I have a loss function that looks something like this:

loss = logits + lambda * get_value_of_random_fc_neuron(neuron_index)

I use forward hooks to get the random neuron value in get_value_of_random_fc_neuron()

The issue is that the gradients are the same in both of these cases:

loss = logits
loss = logits + lambda * get_value_of_random_fc_neuron(neuron_index)

This is because the neuron value is a tensor with no grad_fn assigned. How do I make sure the gradient is calculated for the full loss?

Thanks in advance

ptrblck · April 27, 2020, 2:16am

Could you post the code you are using to select the neuron?
This dummy example seems to work:

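import torch
import torch.nn as nn

lin = nn.Linear(2, 2, bias=False)
x = torch.randn(1, 2)

# baseline: the loss depends on the layer output only
out = lin(x)
loss = out.mean()
loss.backward()
print(lin.weight.grad)

# add one weight entry to the loss: the gradient at [0, 0] grows by 1
lin.zero_grad()
out = lin(x)
loss = out.mean() + lin.weight[0, 0]
loss.backward()
print(lin.weight.grad)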

I haven’t tested it on more complicated models, so let me know what the difference is between your code and the example.

blenderender · April 27, 2020, 9:30am

Hi @ptrblck

Sorry for the confusion. I'm looking at the model activations, not the model weights.
Here's some sample code showing what I'm trying to do.

Case 1

class model(nn.Module):
    def __init__(self):
        super().__init__()
        # init model
    def forward(self, x):
        x = self.conv1(x)
        x = self.fc1(x)
        return x


import random

def pick_random_neuron(layer_list, index_list):
    # naive approach: layer_list holds the layer names and index_list the number of neurons in each layer
    i = random.randrange(len(layer_list))
    return layer_list[i], random.randrange(index_list[i])

activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook


model.eval()
input.requires_grad = True # i want to modify input with gradients

for name, layer in model.named_modules():
    handle = layer.register_forward_hook(get_activation(name))

output = model(input)
layer_name, index= pick_random_neuron(layer_list, index_list)

# get the activation value; assume conv and fc outputs are reshaped to 1D vectors

value = activation[layer_name][index]
some_loss = criterion(output, target)  # criterion: placeholder for any loss module, e.g. nn.CrossEntropyLoss()
loss = some_loss + value
model.zero_grad()
loss.backward()

data_grad = input.grad

Since value is taken from the forward hook, it has no grad_fn, so it plays no role in the gradients used to modify the data. It effectively just adds a constant scalar to the loss.
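The effect described above can be reproduced in isolation. This is only a sketch: the single `nn.Linear` layer and the names `fc`, `x`, `grad_plain`, and `grad_with_value` are hypothetical stand-ins, not the model from the post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real model: a single linear layer.
fc = nn.Linear(4, 3)
x = torch.randn(1, 4, requires_grad=True)

out = fc(x)
# This mimics what the hook stores: a detached copy of the output.
value = out.detach().view(-1)[0]

print(value.grad_fn)        # the detached tensor has no grad_fn
print(value.requires_grad)  # and does not require grad

# Gradient of the plain loss with respect to the input.
(grad_plain,) = torch.autograd.grad(out.sum(), x, retain_graph=True)
# Gradient of the loss plus the detached value.
(grad_with_value,) = torch.autograd.grad(out.sum() + value, x)

# The detached value acts as a constant, so both gradients match.
print(torch.allclose(grad_plain, grad_with_value))
```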

I can solve this problem by changing the model's forward to return the outputs of all layers.

Case 2


class model(nn.Module):
    def __init__(self):
        super().__init__()
        # model init
    def forward(self, x):
        x1 = self.conv1(x)
        x2 = self.fc1(x1)
        return x1, x2

x1, x2 = model(input)
layer_name, index = pick_random_neuron(layer_list, index_list)
# pick the neuron from the matching layer output; the result keeps its grad_fn
if layer_name == 'conv1':
    value = x1.view(-1)[index]
elif layer_name == 'fc1':
    value = x2.view(-1)[index]

Case 2 works. But is it possible to solve this problem with forward hooks, since they are more flexible for larger models?

ptrblck · April 27, 2020, 9:28pm

Thanks for the update!
Your forward hook detaches the output.
Could you remove the detach() call and try it again?
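A minimal sketch of that change, assuming a small hypothetical `nn.Sequential` model in place of the one from the thread (the layer sizes, the layer name `"0"`, and the neuron index 3 are made up for illustration). The hook stores the output as-is, so the selected activation keeps its `grad_fn` and contributes to the input gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical small model standing in for the one in the thread.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

activation = {}

def get_activation(name):
    def hook(module, inputs, output):
        activation[name] = output  # no detach(): the tensor stays in the graph
    return hook

# Register a hook on every submodule; skip the root module, whose name is "".
handles = [
    layer.register_forward_hook(get_activation(name))
    for name, layer in model.named_modules()
    if name
]

x = torch.randn(1, 4, requires_grad=True)
output = model(x)

# Neuron 3 of the first linear layer (arbitrary choice for this sketch).
value = activation["0"].view(-1)[3]
loss = output.sum() + value
model.zero_grad()
loss.backward()
grad_with_value = x.grad.clone()

# Reference: gradient of output.sum() alone.
x.grad = None
model(x).sum().backward()
grad_plain = x.grad.clone()

for h in handles:
    h.remove()

# The two gradients now differ; the difference is the gradient of `value`.
print(grad_with_value - grad_plain)
```

Since `value` here is a pre-activation output of the first linear layer, its gradient with respect to the input is the matching row of that layer's weight matrix. One thing to keep in mind: storing non-detached outputs keeps their computation graph alive for as long as the dictionary holds them, so clearing `activation` between iterations is probably a good idea on larger models.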

blenderender · May 2, 2020, 1:13pm

Hey @ptrblck. Thanks a lot. That worked.