What is the gradient of torch.unique


I’m wondering what the gradient of torch.unique is. Can I do this, or is there an alternative way to implement something like:
y = model(X)
loss = len(torch.unique(y))

Many thanks indeed!

Hi Jason!

torch.unique() doesn’t really have a sensible gradient*. What would
you want it to be?

Oddly, unique() does have a companion backward function, even
though backpropagation through it isn’t implemented:

>>> import torch
>>> torch.__version__
>>> t = torch.tensor ([1.0, 2.0, 1.0, 3.0, 1.0, 4.0, 2.0], requires_grad = True)
>>> u = torch.unique (t)
>>> u
tensor([1., 2., 3., 4.], grad_fn=<Unique2Backward0>)
>>> u.sum().backward()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<path_to_pytorch_install>\torch\_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "<path_to_pytorch_install>\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
NotImplementedError: the derivative for '_unique2' is not implemented.

Regardless of whether you can make sense of backpropagating through
unique(), len() returns an integer, so it is inherently not differentiable
(and it doesn’t return a PyTorch tensor, as would be required for autograd
to operate on it).
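To illustrate the len() point concretely, here is a quick check (a sketch, assuming a recent PyTorch version): the unique tensor itself carries autograd metadata, but len() of it is a plain Python int that autograd cannot track.

```python
import torch

t = torch.tensor([1.0, 2.0, 1.0, 3.0], requires_grad=True)
u = torch.unique(t)   # a tensor, still attached to the autograd graph
n = len(u)            # a plain Python int -- the graph ends here
print(type(n), n)     # <class 'int'> 3
```

So even if _unique2 had a backward, the len() call would still sever the graph before a loss could be formed.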

*) You could arguably define unique() to return a view into the original
tensor, in which case you could backpropagate through it. But this seems
a bit contrived, and, in any event, it would be ambiguous (which of several
duplicate elements should the view refer to?).
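If what you actually want is a differentiable stand-in for len(torch.unique(t)), one option (my sketch, not a PyTorch builtin) is a smooth surrogate: softly count, for each element, how many elements share its value, and let each element contribute the reciprocal of that count. For well-separated values with exact duplicates, this sums to the number of distinct values, and it backpropagates.

```python
import torch

def soft_unique_count(t, sigma=0.1):
    # Smooth surrogate for len(torch.unique(t)).
    # k[i, j] is close to 1 when t[i] is approximately t[j], close to 0 otherwise.
    k = torch.exp(-((t.unsqueeze(0) - t.unsqueeze(1)) ** 2) / sigma**2)
    # Each element contributes 1/m, where m softly counts its duplicates,
    # so a value appearing m times contributes about 1 in total.
    return (1.0 / k.sum(dim=1)).sum()

t = torch.tensor([1.0, 2.0, 1.0, 3.0, 1.0, 4.0, 2.0], requires_grad=True)
c = soft_unique_count(t)  # close to 4.0 here: distinct values are {1, 2, 3, 4}
c.backward()              # gradient flows, unlike len(torch.unique(t))
```

The choice of sigma controls how close two values must be to count as "the same", so this is only a reasonable proxy when the scale of meaningful differences is known.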


K. Frank

Hi Frank,

Thank you very much for your explanation. I’m trying to calculate the entropy of the model’s output (which is a sequence) as a loss, but it seems this is not very practical.

Best regards,
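For readers landing here with the same goal: an entropy-style diversity loss over a sequence of predictions can be made differentiable by working with the predicted distributions rather than with discrete values. The sketch below assumes (hypothetically, since the thread doesn't show the model) that the model outputs logits of shape (sequence_length, num_classes); the Shannon entropy of the softmax distributions backpropagates cleanly.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits standing in for model(X): (sequence_length, num_classes).
logits = torch.randn(5, 10, requires_grad=True)

# Differentiable Shannon entropy of the per-step class distributions,
# averaged over the sequence.
log_p = F.log_softmax(logits, dim=-1)
entropy = -(log_p.exp() * log_p).sum(dim=-1).mean()

entropy.backward()          # works: the gradient flows back to the logits
print(logits.grad.shape)    # torch.Size([5, 10])
```

Minimizing this entropy pushes each step toward a confident prediction; maximizing it encourages diverse outputs, which is the differentiable analogue of wanting many distinct values.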