Is it possible to include computed gradient values in the loss?

I have a sequence of operations performed on x to yield y_. A, B, and C are learnable parameters of my system.

y_ = A(B(C(x)))
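
For concreteness, here is a minimal sketch of the setup; the nn.Linear modules and the dimensions are placeholders I made up for this post, not my actual system:

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the learnable parts A, B, C.
A = nn.Linear(16, 1)
B = nn.Linear(16, 16)
C = nn.Linear(8, 16)

x = torch.randn(4, 8)   # batch of inputs
y = torch.randn(4, 1)   # targets
y_ = A(B(C(x)))         # forward pass through the chain
```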

Is it possible to define a loss such as:

L = (y - y_) + sum(dy/dC)

If so, how? Currently, the backward call requires a scalar value. Also, how can I make sure I don't overwrite the actual update gradients dL/dweights while computing dy/dC?
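
Here is what I imagine it could look like, using torch.autograd.grad; the reductions (the .sum() calls) are my guesses for satisfying the scalar requirement, so please correct me if this is the wrong approach:

```python
import torch
import torch.nn as nn

A = nn.Linear(16, 1)
B = nn.Linear(16, 16)
C = nn.Linear(8, 16)

x = torch.randn(4, 8)
y = torch.randn(4, 1)
y_ = A(B(C(x)))

# Differentiation needs a scalar, so reduce y_ first.
# torch.autograd.grad (unlike .backward()) does not accumulate into the
# .grad fields, and create_graph=True keeps the result differentiable so
# the gradient term can itself appear in the loss.
dy_dC = torch.autograd.grad(
    y_.sum(),
    tuple(C.parameters()),
    create_graph=True,
)

L = (y - y_).sum() + sum(g.sum() for g in dy_dC)
L.backward()  # only this call should write dL/dweights into .grad
```

Is create_graph=True the right way to keep dy/dC in the graph here, or is there a cleaner approach?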
@albanD any ideas?