I have a sequence of operations performed on x to yield y_. A, B, and C are learnable parameters of my system.

```
y_ = A(B(C(x)))
```
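
For concreteness, here is a minimal sketch of the kind of setup I mean; the choice of nn.Linear layers and the shapes are just illustrative assumptions, my real A, B, C are more involved:

```
import torch
import torch.nn as nn

# Illustrative only: A, B, C stand in for my learnable components.
C = nn.Linear(10, 10)
B = nn.Linear(10, 10)
A = nn.Linear(10, 1)

x = torch.randn(4, 10)
y = torch.randn(4, 1)  # target

y_ = A(B(C(x)))        # forward pass
```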

Is it possible to define a loss such as:

```
L = (y - y_) + sum(dy_/dC)
```

If so, how? Currently, the backward call requires a scalar value. Also, how can I compute dy_/dC without overwriting the actual update gradients dL/dweights?
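
My current guess, continuing from the snippet above, is something like the following. I'm assuming torch.autograd.grad with create_graph=True is the right way to get dy_/dC into the loss without touching .grad, and that the data term has to be reduced to a scalar (MSE here is just a placeholder choice). Is this the intended approach?

```
# torch.autograd.grad computes dy_/dC without writing into any .grad
# attribute, and create_graph=True keeps the result differentiable so it
# can be part of L.
grads = torch.autograd.grad(
    outputs=y_,
    inputs=list(C.parameters()),
    grad_outputs=torch.ones_like(y_),  # required because y_ is not a scalar
    create_graph=True,
)
grad_term = sum(g.sum() for g in grads)

# Reduce the data term to a scalar so backward() can be called on L.
L = ((y - y_) ** 2).mean() + grad_term

# This populates .grad = dL/dweights for A, B and C; the dy_/dC computation
# above did not touch .grad, so nothing gets overwritten.
L.backward()
```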

@albanD any ideas?