Parameters do not update if I apply torch.cat in forward

In my network, I'd like to append an additional feature to the input of a fully connected layer. The previous layer outputs 128 features, and I want the model to also take another feature (the number of rows of the input) into account, so in the network's forward method I apply torch.cat((x, n_tensor), 1), where x is the output of the previous layer (an [N, 128] tensor, N being the batch size) and n_tensor holds the row count of each sample in the batch (an [N, 1] tensor). After doing this, the loss on both the training set and the validation set no longer changes, and when I check the layer weights, they do not update. I suspect torch.cat() breaks the gradient, and I'm not sure how I should append this n_tensor to the network.
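Here is a minimal sketch of what I mean (layer sizes and names are just placeholders, not my actual model):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 128)
        # 128 features from fc1 plus the extra "number of rows" feature
        self.fc2 = nn.Linear(128 + 1, 10)

    def forward(self, x, n_tensor):
        x = torch.relu(self.fc1(x))          # [N, 128]
        x = torch.cat((x, n_tensor), dim=1)  # [N, 129]
        return self.fc2(x)
```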
Thanks a lot!

That’s not the case as seen in this minimal code snippet:

import torch

x1 = torch.randn(2, 3, requires_grad=True)
x2 = torch.randn(2, 1, requires_grad=True)

# Concatenate both tensors and backpropagate through the result
y = torch.cat((x1, x2), dim=1)
y.mean().backward()

# Both inputs receive gradients, so torch.cat does not break the graph
print(x1.grad)
print(x2.grad)

To verify it, you could also check the .grad attributes of your model's parameters and see if another operation breaks the computation graph.
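For example, after calling loss.backward() you could run something like this (assuming model is your nn.Module):

```python
# Parameters with grad == None are detached from the computation graph
for name, param in model.named_parameters():
    print(name, param.grad.abs().sum() if param.grad is not None else None)
```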

Thanks a lot! I’ll check my network carefully then. :grinning: