Sure, but in that case (applying backpropagation in the ordinary way), the loss won't decrease, because the derivative of torch.sign() is zero everywhere it is defined, so the gradient that flows back is always 0.
So I get this output:
========train========
loss: 2.0
loss: 2.0
loss: 2.0
loss: 2.0
... (loss: 2.0 is printed unchanged for every remaining step)
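
For reference, here is a minimal sketch (toy tensors as placeholders, not my actual model) that shows the zero gradient directly:

```python
import torch

# Toy example only: demonstrates that autograd's gradient for torch.sign()
# is zero, so the parameters never receive a useful update.
w = torch.randn(3, requires_grad=True)
x = torch.randn(3)
target = torch.ones(3)

out = torch.sign(w * x)            # step function: derivative is 0 wherever it is defined
loss = (out - target).abs().sum()
loss.backward()

print(w.grad)                      # tensor([0., 0., 0.])
```

Because w.grad is all zeros, an optimizer step leaves the weights untouched, which is why the same loss is printed every iteration.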