What is the gradient of torch.unique

Hi,

I’m wondering what the gradient of torch.unique is. Can I do this, or is there an alternative way to implement something like:
y = model(X)
loss = len(torch.unique(y))
loss.backward()   # backprop

Many thanks indeed!

Hi Jason!

torch.unique() doesn’t really have a sensible gradient*. What would
you want it to be?

Oddly, unique() does have a companion backward function, even
though backpropagation through it isn’t implemented:

>>> import torch
>>> torch.__version__
'1.11.0'
>>> t = torch.tensor ([1.0, 2.0, 1.0, 3.0, 1.0, 4.0, 2.0], requires_grad = True)
>>> u = torch.unique (t)
>>> u
tensor([1., 2., 3., 4.], grad_fn=<Unique2Backward0>)
>>> u.sum().backward()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<path_to_pytorch_install>\torch\_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "<path_to_pytorch_install>\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
NotImplementedError: the derivative for '_unique2' is not implemented.

Regardless of whether you can make sense of backpropagating through
unique(), len() returns an integer, so it is inherently not differentiable
(and it doesn’t return a pytorch tensor as would be required for autograd
to operate on it).
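
For example, continuing the session above, len() hands back a plain
python int, so there is nothing for autograd to hook into:

>>> len (u)
4
>>> type (len (u))
<class 'int'>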

*) You could arguably define unique() to return a view into the original
tensor, in which case you could backpropagate through it. But this seems
a bit contrived, and in any event, would be ambiguous: when a value appears
more than once, there is no principled way to choose which of the duplicate
input elements the gradient should flow back to.
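
If you do need a differentiable stand-in for the number of distinct
values, one possibility (just a sketch of one approach, not anything
built into pytorch; soft_unique_count and sigma are names I’m making up
here) is a smooth count built from a gaussian similarity kernel:

import torch

def soft_unique_count (y, sigma = 0.1):
    # pairwise squared differences between all elements of y
    d2 = (y.unsqueeze (0) - y.unsqueeze (1))**2
    # smooth "same value" indicator: ~1 for (near-)duplicates, ~0 otherwise
    k = torch.exp (-d2 / (2 * sigma**2))
    # each element contributes 1 / (size of its soft duplicate group),
    # so each group of equal values adds about one to the total
    return (1.0 / k.sum (dim = 1)).sum()

t = torch.tensor ([1.0, 2.0, 1.0, 3.0, 1.0, 4.0, 2.0], requires_grad = True)
loss = soft_unique_count (t)   # approximately 4.0 for this t
loss.backward()                # gradients flow, unlike with len (torch.unique (t))

How informative the gradient is depends on sigma: too small and the
kernel saturates, so the gradient between nearby groups vanishes; too
large and genuinely distinct values get blurred together.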

Best.

K. Frank

Hi Frank,

Thank you very much for your explanation. I’m trying to use the entropy of the model’s output (which is a sequence) as a loss, but it doesn’t seem very practical this way.
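
In case it’s useful to anyone finding this thread later: if the model can
emit per-step logits, the usual Shannon entropy of the softmax distribution
is itself differentiable and could serve as the loss. A minimal sketch,
assuming a hypothetical (seq_len, num_classes) tensor of logits in place
of model(X):

import torch
import torch.nn.functional as F

logits = torch.randn(10, 5, requires_grad=True)  # hypothetical stand-in for model(X)
log_p = F.log_softmax(logits, dim=-1)
# average per-step Shannon entropy; every operation here is differentiable
entropy = -(log_p.exp() * log_p).sum(dim=-1).mean()
entropy.backward()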

Best regards,
Jason