Hi Fernando!
What did you actually try? Did it fail with error messages? No gradients?
Zero gradients?
The core problem is that if you want backpropagation, discrete values are
not usefully differentiable, in that their gradients are zero (almost) everywhere.
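To make that concrete, here is a tiny PyTorch illustration (using `torch.round()` as an assumed, stand-in discretization; your actual discretization may differ):

```python
import torch

x = torch.tensor([0.3, 1.7], requires_grad=True)
y = torch.round(x)      # hard, discrete values: tensor([0., 2.])
y.sum().backward()
print(x.grad)           # tensor([0., 0.]) -- the gradient is zero (almost) everywhere
```

The forward pass is fine, but nothing useful flows back to `x`, so training stalls.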
You have two choices:
1. You can produce continuous values that tend to be clustered around your
desired discrete values, and then figure out how to compute a usefully
differentiable loss function from these “approximately-discrete” values
(see the first sketch below).
2. You can produce genuinely discrete values, but approximate their zero
gradients with non-zero gradients that capture how your loss function depends
on the parameters that produce those discrete values, in a way that leads to
useful backpropagation and training (see the second sketch below).
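As a sketch of the first approach (assuming a binary, zero-one discretization, which may or may not match your use case), you could pass logits through a steep sigmoid so the outputs cluster near 0 and 1, and compute the loss directly on those soft values:

```python
import torch

torch.manual_seed(0)

logits = torch.randn(5, requires_grad=True)
temperature = 0.1       # smaller -> outputs cluster more tightly around 0 and 1

# continuous, "approximately-discrete" values
soft_bits = torch.sigmoid(logits / temperature)

target = torch.tensor([1., 0., 1., 1., 0.])
loss = torch.nn.functional.mse_loss(soft_bits, target)   # loss on the soft values
loss.backward()
print(soft_bits)        # values close (but not equal) to 0 or 1
print(logits.grad)      # non-zero gradients flow back to logits
```

You can anneal the temperature toward zero as training progresses so the values become progressively more discrete.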
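And here is a sketch of the second approach, using the common “straight-through” trick (again assuming binary values just for illustration): the forward pass uses genuinely discrete values, while the backward pass pretends the discretization was the identity, so non-zero gradients reach the parameters:

```python
import torch

torch.manual_seed(0)

logits = torch.randn(5, requires_grad=True)
probs = torch.sigmoid(logits)

hard_bits = (probs > 0.5).float()            # genuinely discrete 0 / 1 values (no gradient)
bits = hard_bits + probs - probs.detach()    # forward value == hard_bits, backward acts like probs

target = torch.tensor([1., 0., 1., 1., 0.])
loss = torch.nn.functional.mse_loss(bits, target)
loss.backward()
print(bits)             # exactly 0.0 or 1.0
print(logits.grad)      # non-zero "straight-through" gradients
```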
Best.
K. Frank