Hi
I have a loss function that looks something like this:
loss = logits + lambda * get_value_of_random_fc_neuron(neuron_index)
I use forward hooks to get the random neuron's value in get_value_of_random_fc_neuron().
The issue is that the gradients are the same whether
- loss = logits or
- loss = logits + lambda * get_value_of_random_fc_neuron(neuron_index)
This is because the neuron value is a tensor with no grad_fn assigned. How do I make sure the gradient is calculated for the full loss?
Thanks in advance
Could you post the code you are using to select the neuron?
This dummy example seems to work:
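import torch
import torch.nn as nn

lin = nn.Linear(2, 2, bias=False)
x = torch.randn(1, 2)

# Baseline: gradient of the plain loss
out = lin(x)
loss = out.mean()
loss.backward()
print(lin.weight.grad)

# Same loss plus one weight entry: the gradient at [0, 0] changes
lin.zero_grad()
out = lin(x)
loss = out.mean() + lin.weight[0, 0]
loss.backward()
print(lin.weight.grad)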
I haven't tested it on more complicated models, so let me know what the difference between your code and the example is.
Hi @ptrblck
Sorry for the confusion. I'm looking at model activations, not the model weights.
Here's some sample code showing what I'm trying to do.
Case 1
class model(nn.Module):
    def __init__(self):
        super().__init__()
        # init model
    def forward(self, x):
        x = self.conv1(x)
        x = self.fc1(x)
        return x
import random
def pick_random_neuron(layer_list, index_list):
    # Naive approach. Assume layer_list holds the layer names and index_list
    # holds the number of neurons in the corresponding layer.
    i = random.randrange(len(layer_list))
    return layer_list[i], random.randrange(index_list[i])
activation = {}
def get_activation(name):
    def hook(model, input, output):
        activation[name] = output.detach()
    return hook
model.eval()
input.requires_grad = True  # I want to modify the input using its gradients
for name, layer in model.named_modules():
    handle = layer.register_forward_hook(get_activation(name))
output = model(input)
layer_name, index = pick_random_neuron(layer_list, index_list)
# Get the activation value; assume conv and fc outputs are reshaped to 1-D vectors
value = activation[layer_name][index]
some_loss = nn.Criterion(output, target)  # nn.Criterion: placeholder for the actual loss
loss = some_loss + value
model.zero_grad()
loss.backward()
data_grad = input.grad.data
Since value is taken from the forward hook, it has no grad_fn and plays no role in the gradients used to change the data. It basically ends up adding a constant scalar to the loss.
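A minimal sketch of this effect, using made-up tensors rather than the real model: a detached value carries the same number but no graph, so adding it to the loss leaves the input gradient unchanged.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# y stands in for a layer activation computed from the input.
y = x * 3.0
# Mimics what the hook stores: same number, cut off from the graph.
value = y.detach()[0]
print(value.grad_fn, value.requires_grad)

# Gradient of the plain loss.
y.sum().backward()
grad_without = x.grad.clone()

# Gradient of the loss with the detached value added.
x.grad = None
y = x * 3.0
(y.sum() + value).backward()
grad_with = x.grad.clone()

print(torch.equal(grad_without, grad_with))
```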
I can solve this problem by changing the model's forward to return the outputs of all layers.
Case 2
class model(nn.Module):
    def __init__(self):
        super().__init__()
        # model init
    def forward(self, x):
        x1 = self.conv1(x)
        x2 = self.fc1(x1)
        return x1, x2
x1, x2 = model(input)
layer_name, index = pick_random_neuron(layer_list, index_list)
if layer_name == 'conv1':
    value = x1.view(-1)[index]
elif layer_name == 'fc1':
    value = x2.view(-1)[index]
Case 2 works. But is there a possibility to solve this problem using forward hooks, as they are more flexible for larger models?
Thanks for the update!
Your forward hook detaches the output.
Could you remove the detach() call and try it again?
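To illustrate the difference, here is a small sketch, assuming a hypothetical two-layer toy model and the made-up layer name "fc1" (neither is from your code). It compares the input gradient when the hook stores the output detached and when it stores the output as-is.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model standing in for the real network.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
model.eval()

activation = {}

def get_activation(name, detach):
    def hook(module, inputs, output):
        # Storing the raw output keeps it attached to the autograd graph;
        # detach() cuts that link.
        activation[name] = output.detach() if detach else output
    return hook

def input_grad(detach, add_neuron):
    """Return the input gradient and the grad_fn of the stored activation."""
    activation.clear()
    x = torch.ones(1, 4, requires_grad=True)
    handle = model[0].register_forward_hook(get_activation("fc1", detach))
    out = model(x)
    handle.remove()
    loss = out.sum()
    if add_neuron:
        # Add neuron 0 of the hooked layer to the loss.
        loss = loss + activation["fc1"].view(-1)[0]
    loss.backward()
    return x.grad.clone(), activation["fc1"].grad_fn

grad_plain, _ = input_grad(detach=True, add_neuron=False)
grad_detached, fn_detached = input_grad(detach=True, add_neuron=True)
grad_attached, fn_attached = input_grad(detach=False, add_neuron=True)

print(fn_detached, fn_attached)
print(torch.equal(grad_plain, grad_detached))   # detached neuron adds nothing
print(torch.equal(grad_plain, grad_attached))   # attached neuron changes the gradient
```

With the detached activation, the extra term acts as a constant. Without detach(), the input gradient additionally picks up the gradient of the selected neuron, which for this toy model is row 0 of the first layer's weight.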
Hey @ptrblck. Thanks a lot. That worked.