Hi, I have a normal network, nothing special, and I want to do something that depends on the loss.
So I set a threshold, and the main idea is (pseudocode):
if loss < threshold:
Now I thought: why not try to learn this threshold from some loss, maybe the same one as the network's?
What do you think? It’s not connected to the network graph, so I can’t backpropagate through it with the rest of the network. Should I declare this parameter as an entire model? Just declaring a tensor is not enough… (“AttributeError: ‘Tensor’ object has no attribute ‘train’”)
I’ll be happy to hear your thoughts,
You cannot learn a threshold this way, as it’s not backpropagable. You can declare it as a buffer, but a plain Python float/integer should work just as well.
There is no advantage in including it inside model parameters if you are not gonna actively modify it.
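To make the buffer suggestion concrete, here is a minimal sketch. The `Net` class, its layer sizes, and the default value 0.5 are hypothetical, chosen only for illustration. A buffer is saved in the `state_dict` and moves with the module across devices, but it is not returned by `parameters()`, so an optimizer never updates it.

```python
import torch
import torch.nn as nn


class Net(nn.Module):
    """Hypothetical small network that carries a fixed threshold."""

    def __init__(self, threshold=0.5):
        super().__init__()
        self.fc = nn.Linear(4, 1)
        # Buffer: part of the module state, but not a trainable parameter.
        self.register_buffer("threshold", torch.tensor(threshold))

    def forward(self, x):
        return self.fc(x)


net = Net()
param_names = [name for name, _ in net.named_parameters()]
buffer_names = [name for name, _ in net.named_buffers()]
# The threshold is listed among the buffers, not among the parameters.
```

If the value never needs to be saved or moved with the model, a plain attribute such as `self.threshold = 0.5` does the same job.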
But I do want to modify it. I don’t want to tune the threshold myself; I want it to be learnt.
Why is it not backpropagable? It’s a learnable parameter.
Because backpropagation must flow through data. From your loss, the error propagates back through all the operations you do. It goes through the comparison with the ground truth, then, let’s say, an activation layer, and so on. The threshold is connected to nothing. You are using one loss or another depending on that parameter, but the network does not know that, since the threshold does not numerically modify the output/loss in any way.
It’s as if you see a bifurcation in a path: you can choose one branch or the other following an indicator, but you never walk over the indicator itself.
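A small check of this point, as a sketch with made-up tensors (`w`, `x`, and the two scaling factors are placeholders, not anything from the original post): even if the threshold is created with `requires_grad=True`, using it only inside an `if` leaves it outside the autograd graph, so it receives no gradient.

```python
import torch

torch.manual_seed(0)

w = torch.randn(3, requires_grad=True)             # stands in for network weights
threshold = torch.tensor(0.5, requires_grad=True)  # the would-be learnable threshold
x = torch.randn(3)

loss = (w * x).sum() ** 2

# The comparison yields a boolean; it only selects which branch runs.
if loss < threshold:
    total = loss * 2.0
else:
    total = loss * 0.5

total.backward()
# w.grad is populated, while threshold.grad stays None:
# the threshold never entered the computation of `total`.
```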
The closest thing you can do is add a layer multiplied by the learnable parameter, so that the parameter enables or selects a branch by numerically activating or modifying the data.
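One way to read that suggestion is to replace the hard `if` with a soft gate. This is only a sketch under assumptions: the `SoftGate` module, the sigmoid relaxation, the `sharpness` value, and the example numbers are all illustrative choices, not something the thread prescribes. Because the threshold now enters the result numerically, it does receive a gradient.

```python
import torch
import torch.nn as nn


class SoftGate(nn.Module):
    """Hypothetical soft replacement for `if signal < threshold`."""

    def __init__(self, init_threshold=0.5, sharpness=10.0):
        super().__init__()
        self.threshold = nn.Parameter(torch.tensor(init_threshold))
        self.sharpness = sharpness  # larger values approach a hard switch

    def forward(self, loss_a, loss_b, signal):
        # gate is near 1 when signal is well below the threshold, near 0 when well above
        gate = torch.sigmoid(self.sharpness * (self.threshold - signal))
        # Blend the two branches instead of picking exactly one.
        return gate * loss_a + (1.0 - gate) * loss_b


soft_gate = SoftGate()
loss_a = torch.tensor(1.0)  # loss used "below the threshold"
loss_b = torch.tensor(3.0)  # loss used "above the threshold"
signal = torch.tensor(0.4)  # e.g. a detached copy of the current loss

mixed = soft_gate(loss_a, loss_b, signal)
mixed.backward()
# soft_gate.threshold.grad is now a real, non-zero number.
```

Keep in mind that minimizing the blended loss with respect to the threshold alone will simply push the gate toward whichever branch is smaller, so in practice the threshold probably needs its own objective or a regularizer to learn anything meaningful.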