What is proxy loss

What is the meaning of the ‘proxy loss’?

Where did you encounter this term and in which context?
In the context of ML/DL, I would say a proxy loss is a loss that is used in place of the “real” loss because it is either simpler to optimize or easier to implement.
E.g. instead of using an absolute loss function (nn.L1Loss), you could use nn.MSELoss as a proxy, as it might be easier to train with: the squared error is smooth around a zero residual, while the absolute error is not differentiable there.
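
Here is a minimal sketch of that idea in plain Python, so it runs without any dependencies. The toy data, the single weight `w`, and the helper names (`mse`, `mae`, `mse_grad`, `train_on_proxy`) are all made up for illustration. The model is trained on the squared error (the proxy), while the absolute error (the loss we actually care about in this example) is only evaluated.

```python
def mse(w, xs, ys):
    """Proxy loss: mean squared error of the model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mae(w, xs, ys):
    """'Real' loss in this example: mean absolute error."""
    return sum(abs(w * x - y) for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Gradient of the proxy loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train_on_proxy(xs, ys, w=0.0, lr=0.05, steps=200):
    """Plain gradient descent on the proxy (MSE), never on the MAE."""
    for _ in range(steps):
        w -= lr * mse_grad(w, xs, ys)
    return w


# Hypothetical toy data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w_init = 0.0
w_fit = train_on_proxy(xs, ys, w=w_init)

print("MAE before training:", mae(w_init, xs, ys))
print("MAE after training on the MSE proxy:", mae(w_fit, xs, ys))
```

Minimizing the proxy also drives the absolute error down here, which is the whole point of using one. In PyTorch you would do the same thing by training with nn.MSELoss and reporting nn.L1Loss as a metric. Note that on noisy data the two losses do not share the same minimizer in general, so the proxy is only an approximation of the target.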

Let me know, if you’ve seen the term in another context.

This is the explanation I was looking for. I was somewhat confused when I first encountered the term in several papers. Now it makes sense, thanks a lot :)