No, this won’t work. The problem is that this version of myloss() isn’t usefully differentiable. It is constant almost everywhere, so the gradient
will always be zero.
Mathematically, myloss() is differentiable (with zero gradient) except
where any data[i] = 0.5, at which points myloss() jumps
discontinuously and the derivative is not defined.
Numerically with pytorch you will always get zero gradient, even when
some data[i] = 0.5, because whichever branch of the conditional
you go through, a constant function (constant for that branch) is being
computed by myloss(). Backpropagation will “work” in the sense that
calling loss.backward() will give you a well-defined gradient, but it
doesn’t actually do you any good because the gradient is always zero.
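To make this concrete, here is a sketch of a piecewise-constant loss of the kind described above (your actual myloss() isn’t shown, so this stand-in is an assumption). Note the 0.0 * data trick in each branch: it keeps data in the autograd graph so backward() runs without error, while each branch remains constant in value:

```python
import torch

def myloss(data):
    # each branch is a constant; 0.0 * data keeps data in the graph
    return torch.where(data < 0.5, 1.0 + 0.0 * data, 0.0 * data).sum()

data = torch.tensor([0.3, 0.7, 0.5001], requires_grad=True)
loss = myloss(data)
loss.backward()    # runs fine -- the gradient is well defined ...
print(data.grad)   # ... but it is zero everywhere
```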
In practical terms, let’s say that data = 0.5001, and you get some
value of the loss function. Let’s also say that at data = 0.5 the
loss function jumps to a lower, more favorable value so that you would
like your optimizer step to update data to 0.4999. The problem is
that the optimizer only knows about the gradient, which is zero, and
doesn’t know that very nearby, at 0.4999, you get a lower loss. With zero
gradient the optimizer doesn’t (and can’t) know in which direction to
vary data, that is, whether to increase it, decrease it, or leave it
unchanged, to get to a lower loss.
This is how gradient-descent optimization methods (which are the core
of pytorch’s backpropagation) work, and it’s an inherent limitation they
have.