Good catch! Indeed, replacing the tensor with constants won’t work. The threshold is not usefully differentiable: its gradient is zero everywhere except at the rounding points, where it is undefined or Inf. This post might be useful.
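To make the gradient argument concrete, here is a minimal sketch in plain Python that estimates the slope of a hard threshold with finite differences. The function names and the cut-off of 0.5 are assumptions for illustration only, not anything from your model:

```python
def threshold(x, t=0.5):
    """Hard threshold: 1.0 if x >= t, else 0.0 (t is a hypothetical cut-off)."""
    return 1.0 if x >= t else 0.0


def central_diff(f, x, h):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Away from the cut-off the output is locally constant, so the slope is 0.
away = [central_diff(threshold, x, 1e-3) for x in (0.1, 0.3, 0.7, 0.9)]

# At the cut-off the estimate scales like 1 / (2h), so it grows without
# bound as the step h shrinks.
at_jump = [central_diff(threshold, 0.5, h) for h in (1e-1, 1e-2, 1e-3)]

print(away)
print(at_jump)
```

An optimizer following this gradient gets no signal anywhere except exactly at the jump, where the value is not usable, which is why the thresholded output cannot be trained through directly.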