I’m fixing the seed with `torch.manual_seed()`, and I was expecting the two code snippets below to lead to the same results. However, calling `backward()` inside the loop seems to lead to better performance. I’m struggling to understand why there is a difference between the two. Could this be a numerical problem (does the sum of the loss values overflow)?

`backward()` inside the loop:

```
optimizer.zero_grad()
for loss in episode_losses:
    weighted_loss = loss * reward
    weighted_loss.backward()
optimizer.step()
```

`backward()` on the sum:

```
optimizer.zero_grad()
total_loss = 0
for loss in episode_losses:
    weighted_loss = loss * reward
    total_loss += weighted_loss
total_loss.backward()
optimizer.step()
```
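For what it’s worth, since autograd accumulates gradients in `.grad` across `backward()` calls, the two patterns should produce (nearly) identical gradients. Here is a minimal toy sketch I would use to check that, with a hypothetical linear model and random inputs standing in for the actual episode losses:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
reward = 2.0
inputs = [torch.randn(4) for _ in range(3)]

# Variant 1: backward() per loss; gradients accumulate in p.grad.
optimizer.zero_grad()
for x in inputs:
    loss = model(x).squeeze()
    (loss * reward).backward()
grads_loop = [p.grad.clone() for p in model.parameters()]

# Variant 2: one backward() on the summed loss.
optimizer.zero_grad()
total_loss = sum(model(x).squeeze() * reward for x in inputs)
total_loss.backward()
grads_sum = [p.grad.clone() for p in model.parameters()]

# Compare the two gradient sets up to floating-point rounding.
same = all(torch.allclose(a, b) for a, b in zip(grads_loop, grads_sum))
print(same)
```

If this prints `True` for the real model too, any difference in training outcomes would have to come from somewhere other than the gradients themselves (e.g. a stray extra `step()` or seeding difference), not from the sum overflowing.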