Does torch.cat() method have gradient?

Greetings!
I have three weight maps that are the output of a CONV layer. They all have the same size [B, C, H, W].


My goal is to apply a sparsemax (a variant of softmax) normalization across the three weight maps at every pixel, so that after normalization their values sum to one and follow a sparse distribution. But my implementation above doesn't seem to work. So I wonder: does torch.cat() have a gradient? Or is it something else that causes the mistake?

Hi – torch.cat() does pass gradients, as you can verify with the following:

import torch

a = torch.ones(2, requires_grad=True)
b = torch.ones(3, requires_grad=True)
c = torch.cat((a, b))
output = c.sum()
output.backward()
print(a.grad) #tensor([1., 1.])
print(b.grad) #tensor([1., 1., 1.])

Could you be more specific about your problem?
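In the meantime, here is a minimal sketch of the per-pixel normalization you describe. It stacks the three maps along a new leading dimension and normalizes over that dimension, so the three weights at each pixel sum to one; gradients flow back to all three maps. Note the assumptions: sparsemax is not in core PyTorch, so `F.softmax` stands in for it here, and the tensor names and sizes are made up for illustration.

```python
import torch
import torch.nn.functional as F

B, C, H, W = 2, 4, 5, 5

# Three hypothetical weight maps, e.g. outputs of a conv layer
m1 = torch.randn(B, C, H, W, requires_grad=True)
m2 = torch.randn(B, C, H, W, requires_grad=True)
m3 = torch.randn(B, C, H, W, requires_grad=True)

# Stack along a new dim 0 so each pixel has 3 competing weights.
# (Equivalently: torch.cat after unsqueeze(0) on each map.)
stacked = torch.stack((m1, m2, m3), dim=0)  # shape [3, B, C, H, W]

# softmax stands in for sparsemax (not in core PyTorch);
# normalizing over dim 0 makes the 3 weights at each pixel sum to one
weights = F.softmax(stacked, dim=0)

# per-pixel sums are all 1
print(weights.sum(dim=0).allclose(torch.ones(B, C, H, W)))

# gradients flow back through stack/softmax to all three maps
weights.sum().backward()
print(m1.grad is not None, m2.grad is not None, m3.grad is not None)
```

If stack/softmax works but your real code doesn't, the issue is likely elsewhere in the graph (e.g. an in-place op or a detached tensor), not torch.cat itself.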