Hi,

This is regarding the behavior of the `torch.maximum` and `torch.minimum` functions.

Here is an example:

Let `a` be a scalar.

Currently, when computing `torch.maximum(x, a)`: if x > a the gradient with respect to x is 1, and if x < a the gradient is 0. But if x = a, the gradient is 0.5.

The same is true for `torch.minimum`.
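The behavior above can be reproduced with a small autograd check; the values of `x` and `a` here are illustrative:

```python
import torch

# x holds one element below a, one equal to a, and one above a.
x = torch.tensor([0.5, 1.0, 2.0], requires_grad=True)
a = torch.tensor(1.0)

y = torch.maximum(x, a)
y.sum().backward()

# Gradient w.r.t. x: 0 where x < a, 1 where x > a, 0.5 where x == a.
print(x.grad.tolist())  # [0.0, 0.5, 1.0]
```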

Is there a mathematical reason for the 0.5 gradient when x = a, or is it for numerical stability?

Thanks,