Hi,
I would like to change the gradient calculation for one of the layers, depending on where the gradient is coming from. Is there an easy way to do so?
Here is a concrete example of what I want to achieve:
I have a layer (A) connected to two other layers (B, C). The gradient of the initial layer (A) will be equal to the sum of the gradients coming from the two other layers (B, C). However, I would like the gradient coming from one of those layers (B) to have less of an impact on the first layer, so I want to multiply it by 0.9 before the addition.
So the code would look something like this:
import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(1, 2)  # layer A
        self.fc2 = nn.Linear(2, 1)  # layer B
        self.fc3 = nn.Linear(2, 1)  # layer C

    def forward(self, x):
        x = self.fc1(x)
        x = F.relu(x)    # this activation receives gradients from both the B and C branches
        y = self.fc2(x)  # branch through B
        y = F.relu(y)
        z = self.fc3(x)  # branch through C
        z = F.relu(z)
        x = y + z        # combine the outputs of the two branches
        return x
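One idea I had (not sure if this is the intended way) is a small torch.autograd.Function that is the identity in the forward pass and just scales the incoming gradient in the backward pass. The name ScaleGrad and the place where I apply it are only my guess at how this could look:

import torch

class ScaleGrad(torch.autograd.Function):
    # Identity in the forward pass, scales the gradient in the backward pass.
    @staticmethod
    def forward(ctx, input, scale):
        ctx.scale = scale
        return input.view_as(input)

    @staticmethod
    def backward(ctx, grad_output):
        # Scale the gradient flowing back through this node; the second
        # return value is the gradient for the (non-tensor) scale argument.
        return grad_output * ctx.scale, None

# Inside MyModel.forward, the B branch would then use the scaled path:
#     y = self.fc2(ScaleGrad.apply(x, 0.9))
# fc2's own weight gradients should stay untouched, while the gradient it
# passes back to x (and through it to fc1) gets multiplied by 0.9.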
I know that I could use a hook, but that seems to only change the whole gradient after the summation, so I don’t think it would work. I also obviously want the gradients of the other layers to remain unaffected when taking a training step, so completely changing the gradient of one of the other layers is not a good solution either.
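The closest I got with hooks was registering a tensor hook on a separate copy of x that only feeds the B branch, so that the hook fires before the summation. I am not sure whether this is considered a clean way to do it (the x_b name is just mine):

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x_b = x.clone()                       # separate graph node that only feeds B
        x_b.register_hook(lambda g: g * 0.9)  # fires with the gradient coming from B only
        y = F.relu(self.fc2(x_b))
        z = F.relu(self.fc3(x))
        return y + z

Is one of these the right approach, or is there something simpler I am missing?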