Hi,

I’m wondering what the gradient of torch.unique is. Can I backpropagate through it, or is there an alternative way to implement something like:

y = model(X)

loss = len(torch.unique(y))

…backprop…

Many thanks indeed!

Hi Jason!

`torch.unique()` doesn’t really have a sensible gradient*. What would you want it to be?

Oddly, `unique()` does have a companion backward function, even though backpropagation through it isn’t implemented:

```
>>> import torch
>>> torch.__version__
'1.11.0'
>>> t = torch.tensor([1.0, 2.0, 1.0, 3.0, 1.0, 4.0, 2.0], requires_grad=True)
>>> u = torch.unique(t)
>>> u
tensor([1., 2., 3., 4.], grad_fn=<Unique2Backward0>)
>>> u.sum().backward()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<path_to_pytorch_install>\torch\_tensor.py", line 363, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "<path_to_pytorch_install>\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
NotImplementedError: the derivative for '_unique2' is not implemented.
```

Regardless of whether you can make sense of backpropagating through `unique()`, `len()` returns an integer, so it is inherently not differentiable (and it doesn’t return a pytorch tensor, as would be required for autograd to operate on it).

*) You could arguably define `unique()` to return a view into the original tensor, in which case you could backpropagate through it. But this seems a bit contrived and, in any event, would be ambiguous.

Best.

K. Frank

Hi Frank,

Thank you very much for your explanation. I’m trying to use the entropy of the model’s output (which is a sequence) as the loss, but it seems this isn’t very practical.

Best regards,

Jason
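For an entropy-style loss, one differentiable route is to compute entropy from the model’s output probability distribution rather than from unique counts. A minimal sketch, assuming the model produces per-step logits of shape `(seq_len, num_classes)` — the names and shapes here are illustrative, not taken from the original model:

```python
import torch
import torch.nn.functional as F

# stand-in for a model's per-step output logits over a sequence
logits = torch.randn(5, 10, requires_grad=True)

# per-step probability distribution and its entropy
probs = F.softmax(logits, dim=-1)
log_probs = F.log_softmax(logits, dim=-1)
entropy = -(probs * log_probs).sum(dim=-1).mean()  # mean entropy per step

entropy.backward()  # differentiable, unlike len(torch.unique(...))
```

Using `log_softmax` rather than `probs.log()` keeps the computation numerically stable when some probabilities are near zero.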