I am working with a network that has multiple outputs. I have read this topic: How to do backward() for a net with multiple outputs?. But I still don’t understand how to do it. Do you have any clear tutorials/examples on how to do it?

Thanks,


Thanks for your reply, but it is not exactly what I need. My network generates 3 outputs that have different loss functions (MSE, cross entropy, …), or that use the same loss function but on completely different outputs (e.g. classification over list 1, list 2 and list 3).

Like this one (where Dense = Linear)

Doing multiple outputs + losses should be straightforward. In the `forward` function of your model, you have the variables corresponding to each output; send them through three separate loss functions. The outputs of the loss functions can then be backpropagated all together using `torch.autograd.backward`.
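As a concrete illustration, here is a minimal sketch of such a model with three heads (one regression head with MSE, two classification heads with cross entropy). All layer names, sizes, and the random targets are made up for this example; they are not from the thread:

```python
import torch
import torch.nn as nn

# Hypothetical multi-head model: a shared backbone feeding three heads.
class MultiHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(10, 16)
        self.head_reg = nn.Linear(16, 1)    # regression head -> MSE
        self.head_cls1 = nn.Linear(16, 3)   # classification head 1 -> cross entropy
        self.head_cls2 = nn.Linear(16, 5)   # classification head 2 -> cross entropy

    def forward(self, x):
        h = torch.relu(self.backbone(x))
        return self.head_reg(h), self.head_cls1(h), self.head_cls2(h)

model = MultiHead()
x = torch.randn(4, 10)
out_reg, out_c1, out_c2 = model(x)

# One loss per output, with random dummy targets.
mse = nn.MSELoss()(out_reg, torch.randn(4, 1))
ce1 = nn.CrossEntropyLoss()(out_c1, torch.randint(0, 3, (4,)))
ce2 = nn.CrossEntropyLoss()(out_c2, torch.randint(0, 5, (4,)))

# Backpropagate all three losses in one call; the gradients from all
# three heads accumulate in the shared backbone.
torch.autograd.backward([mse, ce1, ce2])
print(model.backbone.weight.grad is not None)  # True
```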


Thanks for your hint. I also found this post. I tried to compute a loss for each output and append them to a list with the following code:

```
loss_seq = []
for o in output:
    cur_loss = criterion(o, target_var)
    loss_seq.append(cur_loss)
```

Then I printed the losses, and they looked correct.

Then I tried to do the backpropagation:

```
torch.autograd.backward(loss_seq)
```

However, an error occurred:

```
File "example/main.py", line 174, in train
torch.autograd.backward(loss_seq)
TypeError: backward() takes at least 2 arguments (1 given)
```

Am I doing something wrong? How can I do backpropagation with multiple losses?


You have to give: `torch.autograd.backward(loss_seq, grad_seq)`, where `grad_seq` holds the gradients for each of the losses.

If you want, you can give gradients that are all ones:

`grad_seq = [loss_seq[0].new(1).fill_(1) for _ in range(len(loss_seq))]`
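A minimal sketch of that call with current PyTorch tensors (the `Variable` wrapper used in the original thread has since been merged into `Tensor`, so `torch.ones_like` stands in for `.new(1).fill_(1)` here); the toy losses are made up for illustration:

```python
import torch

w = torch.randn(3, requires_grad=True)
loss_seq = [(w ** 2).sum(), (w * 2).sum()]          # two scalar losses

# One "seed" gradient of 1.0 per loss, i.e. d(loss)/d(loss) = 1.
grad_seq = [torch.ones_like(l) for l in loss_seq]

torch.autograd.backward(loss_seq, grad_seq)

# Gradients from both losses accumulate in w:
# d/dw [sum(w^2)] = 2w, and d/dw [sum(2w)] = 2.
print(torch.allclose(w.grad, 2 * w.detach() + 2))   # True
```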

Thank you very much. But why can we use all ones as gradients here?

If the loss is a scalar value, then a gradient of all ones is exactly what `loss.backward()` uses.


Thanks: I understand the ones are the gradients dloss/dloss=1 that are put at the very end of the backpropagation graph.

But creating `grad_seq` this way fails with:

```
AttributeError: 'Variable' object has no attribute 'new'
```

EDIT: OK, it looks like it automatically converts tensors to variables, so this seems to work:

```
grad_seq = [loss_seq[0].data.new(1).fill_(1) for _ in range(len(loss_seq))]
```
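For what it's worth, since the gradient seeds are all ones, an equivalent and simpler idiom is to sum the scalar losses and call `.backward()` once. A small sketch with made-up toy losses (not from the thread):

```python
import torch

w = torch.randn(4, requires_grad=True)
loss_seq = [(w ** 2).sum(), (3 * w).sum()]

# Summing scalar losses and backpropagating once accumulates the same
# gradients as backpropagating each loss with a seed gradient of 1.
total = sum(loss_seq)
total.backward()

# d/dw [sum(w^2)] = 2w, and d/dw [sum(3w)] = 3.
print(torch.allclose(w.grad, 2 * w.detach() + 3))  # True
```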

Can a model produce multiple outputs (as in the experiments above) but use only a single loss function?
