Is it a differentiable operation?

Hi guys,
I have a question: can I do this kind of operation without losing the gradient?

w = torch.zeros(1, M)
w[0][idx] = 1   # dangerous operation?
result = torch.mm(w, t2)

I want to backprop through tensor t2, not through w.
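
To make it concrete, here is a runnable version of what I mean (M, N, and idx are just dummy values I picked for the example):

import torch

M, N = 4, 3                  # dummy sizes
idx = 2                      # dummy index

t2 = torch.randn(M, N, requires_grad=True)

w = torch.zeros(1, M)        # w does not require grad
w[0][idx] = 1
result = torch.mm(w, t2)     # shape (1, N)

result.sum().backward()
print(t2.grad)               # nonzero only in row idx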


While the assignment (I’d write w[0, idx], personally) will generally be bad for backpropagation through w, there are two things to note here:

  • For backpropagation through mm to t2, you only need the value of w; w itself does not need to be differentiable.
  • In general, the pattern w = torch.zeros(...) followed by w[...] = x lets autograd record the assignment, so you get a gradient in x for the dependency of w on x (see the sketch after this list).
    However, if you assign to the same part of w twice, or otherwise overwrite something that already required gradients, you will run into trouble with in-place operations.
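
For instance, a minimal sketch of the second point (x, w, and the sizes here are made-up names, just for illustration):

import torch

x = torch.randn(3, requires_grad=True)

w = torch.zeros(5)
w[:3] = x                            # autograd records this copy, so w depends on x
loss = (w * torch.arange(5.0)).sum()
loss.backward()
print(x.grad)                        # tensor([0., 1., 2.]) -- the gradient reached x

# Trouble would start if w were modified in place again after being used by an
# operation that saved it for backward, or if the assignment overwrote entries
# of w that already required gradients.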

Best regards

Thomas


Thanks Tom, now it’s clear to me