# How backward works for two dependent forward passes

I have to implement two dependent forward passes, and I am wondering how the backward pass works in this case.

```
x = F.relu(self.lin1(x))
x = F.relu(self.lin2(x))
out1 = self.prop1(x)
out2 = self.prop2(x)
out1 = F.log_softmax(out1, dim=1)
out2 = F.log_softmax(out2, dim=1)
return out1, out2
```

`out1` and `out2` share the same input `x`. Will `loss.backward()` work in this case? Thanks.

`loss` is not defined in your snippet, but if, for example, you compute a loss from each `outX` and sum them into `loss`, `loss.backward()` will work and will backpropagate through both `outX` tensors.

    loss1 = F.cross_entropy(out1, label)
    loss2 = F.cross_entropy(out2, label)
    (loss1 + loss2).backward()
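
To make the summed-loss case concrete, here is a minimal sketch of a shared trunk with two heads. The `TwoHeadNet` class, the layer sizes, and the random data are assumptions for illustration. The real `prop1`/`prop2` modules are not shown in this thread, so plain `nn.Linear` layers stand in for them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoHeadNet(nn.Module):
    """Hypothetical stand-in for the model in the question."""

    def __init__(self, in_dim=8, hidden=16, n_classes=3):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, hidden)
        # Assumption: prop1/prop2 are replaced by plain linear heads.
        self.prop1 = nn.Linear(hidden, n_classes)
        self.prop2 = nn.Linear(hidden, n_classes)

    def forward(self, x):
        x = F.relu(self.lin1(x))
        x = F.relu(self.lin2(x))
        out1 = F.log_softmax(self.prop1(x), dim=1)
        out2 = F.log_softmax(self.prop2(x), dim=1)
        return out1, out2


torch.manual_seed(0)
model = TwoHeadNet()
inputs = torch.randn(4, 8)            # made-up batch of 4 samples
label = torch.randint(0, 3, (4,))     # made-up class labels

out1, out2 = model(inputs)
loss1 = F.cross_entropy(out1, label)
loss2 = F.cross_entropy(out2, label)
(loss1 + loss2).backward()

# After one backward call, both heads and the shared trunk hold gradients.
heads_have_grad = (model.prop1.weight.grad is not None
                   and model.prop2.weight.grad is not None)
trunk_has_grad = model.lin1.weight.grad is not None
print(heads_have_grad, trunk_has_grad)
```

A single `backward()` call on the summed loss is enough here. Autograd walks both branches and adds their contributions where they meet at the shared layers.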

I still have a question about backpropagation. To get the gradient of `x`, I registered a backward hook:

    x = F.relu(self.lin1(x))
    x = F.relu(self.lin2(x))
    # some operations
    x.register_hook(backward_hook)
    out1 = self.prop1(x)
    out2 = self.prop2(x)
    out1 = F.log_softmax(out1, dim=1)
    out2 = F.log_softmax(out2, dim=1)
    return out1, out2

The loss is computed as:

    loss1 = F.cross_entropy(out1, label)
    loss2 = F.cross_entropy(out2, label)
    (loss1 + loss2).backward()

Will backpropagation follow the two forward paths and compute two separate gradients? If so, is the grad passed to the backward hook in the code above the one for `out1` or the one for `out2`? Thanks for your help.

I think the gradient hook will be called after the gradient accumulation is done, so if `x` is used in both outputs, the `grad` passed to the hook will be the sum of the contributions from both losses.
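
One way to check this is to compare what the hook receives with the per-loss gradients obtained from `torch.autograd.grad`. The sketch below is only an illustration: `trunk`, `head1`, and `head2` are hypothetical stand-ins for `lin1`/`lin2` and `prop1`/`prop2`, and the data is random.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
trunk = nn.Linear(8, 16)   # hypothetical stand-in for lin1/lin2
head1 = nn.Linear(16, 3)   # hypothetical stand-in for prop1
head2 = nn.Linear(16, 3)   # hypothetical stand-in for prop2

inputs = torch.randn(4, 8)
label = torch.randint(0, 3, (4,))

x = F.relu(trunk(inputs))
loss1 = F.cross_entropy(head1(x), label)
loss2 = F.cross_entropy(head2(x), label)

# Per-loss gradients w.r.t. x, computed before the hook is registered.
# retain_graph=True keeps the graph alive for the later backward call.
(g1,) = torch.autograd.grad(loss1, x, retain_graph=True)
(g2,) = torch.autograd.grad(loss2, x, retain_graph=True)

captured = []  # every gradient the hook sees during backward
x.register_hook(lambda grad: captured.append(grad.clone()))

(loss1 + loss2).backward()

# The hook should have fired once, with the sum of both contributions.
hook_calls = len(captured)
matches_sum = torch.allclose(captured[0], g1 + g2)
print(hook_calls, matches_sum)
```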

Thanks for your help. I am wondering whether I can get the gradients of `x` for the two forward passes separately in that backward hook. In other words, with a single forward pass the grad of `x` belongs to that pass alone, but with two forward passes as above, how can I separate the two gradients in the backward hook? Thanks so much.
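
One possible approach, sketched here under assumptions rather than taken from the thread, is to give each branch its own alias of `x` and register a hook on each alias. Every alias is a separate node in the autograd graph, so its hook only sees the gradient flowing back from its own branch. The names `trunk`, `head1`, `head2`, and the `grads` dictionary are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
trunk = nn.Linear(8, 16)   # hypothetical stand-in for lin1/lin2
head1 = nn.Linear(16, 3)   # hypothetical stand-in for prop1
head2 = nn.Linear(16, 3)   # hypothetical stand-in for prop2

inputs = torch.randn(4, 8)
label = torch.randint(0, 3, (4,))

x = F.relu(trunk(inputs))

# view_as(x) leaves the values unchanged but creates a new graph node,
# so each branch gets its own tensor to hang a hook on.
x1 = x.view_as(x)
x2 = x.view_as(x)

grads = {}
# __setitem__ returns None, so the hooks do not modify the gradient.
x1.register_hook(lambda g: grads.__setitem__("branch1", g.clone()))
x2.register_hook(lambda g: grads.__setitem__("branch2", g.clone()))
x.register_hook(lambda g: grads.__setitem__("total", g.clone()))

loss1 = F.cross_entropy(head1(x1), label)
loss2 = F.cross_entropy(head2(x2), label)
(loss1 + loss2).backward()

# The hook on x itself still receives the sum of the two branch gradients.
sums_match = torch.allclose(grads["total"],
                            grads["branch1"] + grads["branch2"])
print(sorted(grads), sums_match)
```

If the gradients are only needed for inspection and not inside a hook, calling `torch.autograd.grad(loss1, x, retain_graph=True)` and `torch.autograd.grad(loss2, x)` separately is another option.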