I tried using torch.clamp, but it also seems to be non-differentiable with respect to the threshold:
import torch
from torch import autograd
threshold = autograd.Variable(torch.rand(1), requires_grad=True)
print('threshold', threshold)
# m = torch.nn.Threshold(threshold, threshold)
input = autograd.Variable(torch.rand(1, 5), requires_grad=True) - 0.5
print('input', input)
# out = m(input)
out = torch.clamp(input, min=threshold)
print('out', out)
out.backward(torch.ones(1, 5))
print('threshold.grad.data', threshold.grad.data)
> Traceback (most recent call last):
> File "4729.py", line 11, in <module>
> out = torch.clamp(input, min=threshold)
> File "/Users/hugh2/conda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/variable.py", line 396, in clamp
> return CmaxConstant(min)(self)
> File "/Users/hugh2/conda3/envs/pytorch/lib/python3.6/site-packages/torch/autograd/_functions/pointwise.py", line 232, in forward
> self._max_buffer = i.gt(self.constant).type_as(i)
> TypeError: gt received an invalid combination of arguments - got (Variable), but expected one of:
> * (float value)
> didn't match because some of the arguments have invalid types: (Variable)
> * (torch.FloatTensor other)
> didn't match because some of the arguments have invalid types: (Variable)
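As a sketch of a possible workaround (written against a recent PyTorch API, where `requires_grad` lives on tensors rather than `Variable`): element-wise `torch.max` broadcasts the threshold and differentiates through it, which behaves like `clamp(input, min=threshold)`:

```python
import torch

threshold = torch.tensor([0.1], requires_grad=True)
input = torch.tensor([[-0.3, 0.0, 0.7, 0.2, -0.1]], requires_grad=True)

# element-wise max broadcasts threshold over input; gradient w.r.t.
# threshold accumulates 1 for each element that gets clamped up to it
out = torch.max(input, threshold)
out.backward(torch.ones(1, 5))
print(threshold.grad)  # three inputs are below 0.1, so grad is tensor([3.])
```

The gradient w.r.t. the threshold is just the count of elements where the threshold "wins" the max, analogous to what the TensorFlow version below reports.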
I tried the same thing in TensorFlow, and it seemed to work OK:
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    input_t = tf.placeholder(tf.float32, [None], 'input')
    threshold_t = tf.Variable(0.05)
    out_t = tf.minimum(input_t, threshold_t)
    sess = tf.Session()
    with sess.as_default():
        sess.run(tf.global_variables_initializer())
        print('out', sess.run(out_t, feed_dict={input_t: [-0.3, 0.0, 0.7]}))
        # get grad of out_t wrt threshold_t
        grad_out_t = tf.gradients(out_t, [threshold_t])[0]
        print('d(out)/d(threshold)', sess.run(grad_out_t, feed_dict={input_t: [-0.3, 0.0, 0.7]}))
        print('d(out)/d(threshold)', sess.run(grad_out_t, feed_dict={input_t: [-0.3, 0.0, -0.7]}))
        print('d(out)/d(threshold)', sess.run(grad_out_t, feed_dict={input_t: [-0.3, 0.5, 0.7]}))
out [-0.30000001  0.          0.05      ]
d(out)/d(threshold) 1.0
d(out)/d(threshold) 0.0
d(out)/d(threshold) 2.0
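For comparison, here's a sketch of the same check in recent PyTorch (assuming a version that has `torch.minimum`, which broadcasts and differentiates through the element-wise minimum, like `tf.minimum` above):

```python
import torch

threshold = torch.tensor(0.05, requires_grad=True)
grads = []
for data in ([-0.3, 0.0, 0.7], [-0.3, 0.0, -0.7], [-0.3, 0.5, 0.7]):
    input = torch.tensor(data)
    # element-wise minimum picks the threshold wherever input exceeds it,
    # so d(out.sum())/d(threshold) counts those positions
    out = torch.minimum(input, threshold)
    grad, = torch.autograd.grad(out.sum(), threshold)
    grads.append(grad.item())
    print('d(out)/d(threshold)', grad.item())
```

This prints 1.0, 0.0, and 2.0, matching the TensorFlow gradients: each input that exceeds the threshold contributes 1 to the gradient.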
Edit: I guess maybe this needs the Scalar support that's on the way in order to be solved?