I have a neural network that produces a single value when excited with an input. I need to use this value to threshold another array. The result of the thresholding operation is used to compute a loss function (the threshold value is not known beforehand and must be arrived at by training).
Following is an MWE:
import torch

x = torch.randn(10, 1, requires_grad=True)  # Say this is the output of the network (10 is my batch size)
data_array = torch.randn(10, 2)  # This is the data I need to threshold
ground_truth = torch.randn(10, 2)  # This is the ground truth
mse_loss = torch.nn.MSELoss()  # Loss function

# Returns zero wherever the value is below the threshold, the value itself otherwise
thresholded_vals = data_array * (data_array >= x)

# Compute loss and gradients
loss = mse_loss(thresholded_vals, ground_truth)
loss.backward()  # Throws error here
Since the thresholding operation returns a tensor that is detached from the computation graph (the `>=` comparison is not differentiable, so the result carries no grad_fn), the backward() call throws an error.
The thresholding operation is not (usefully) differentiable with respect
to x. To train x you should use a "soft," differentiable thresholding
operation. You may use torch.sigmoid() as a "soft," differentiable step function.
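As a minimal sketch of this idea applied to the MWE above: replace the hard mask `(data_array >= x)` with `torch.sigmoid(k * (data_array - x))`, which approaches a hard step as the sharpness `k` grows but keeps gradients flowing back to x. The value of `k` here is a hypothetical choice; you would tune it (or anneal it upward) for your problem.

```python
import torch

torch.manual_seed(0)

# Same setup as the MWE; x stands in for the network output.
x = torch.randn(10, 1, requires_grad=True)
data_array = torch.randn(10, 2)
ground_truth = torch.randn(10, 2)
mse_loss = torch.nn.MSELoss()

# Soft threshold: sigmoid(k * (data - x)) is ~0 below the threshold
# and ~1 above it, but remains differentiable with respect to x.
k = 10.0  # sharpness (assumed value; larger -> closer to a hard step)
soft_mask = torch.sigmoid(k * (data_array - x))
thresholded_vals = data_array * soft_mask

loss = mse_loss(thresholded_vals, ground_truth)
loss.backward()  # No error; x.grad is now populated
```

With the soft mask in place, gradients reach x through the sigmoid, so the threshold can be trained end to end alongside the rest of the network.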