Hello !
I’m currently developing a custom layer and having trouble getting it to work, specially when I’m using the results of a previous layer in my network as input of my custom layer.
Here’s a code snippet isolating the problem I’m currently having :
def forward(self, y):
a = torch.unsqueeze(y, 0)
b = a.expand(self.output_resolution[0] * self.output_resolution[1], -1, -1)
c = torch.permute(b, (1, 0, 2))
d = torch.sum(c, dim=-1)
end = torch.reshape(d, (-1, self.output_resolution[0], self.output_resolution[1]))
output = end[:, None, :, :]
out = output.expand(-1, self.in_c, -1, -1).float()
return out
It’s not what my custom layer will do but it isolates the last error that I can’t resolve.
y is the output of a previous Linear Layer (with shape [batch_size, num_features])
output_resolution and in_c are just some dimensions I want as output to my custom layer
The problem is, from the moment I use y, I get the RunTime error : one of the variables needed for gradient computation has been modified by an inplace operation
I try everything, copy pasting every value of the y tensor manually, use y.clone(), torch.clone(y), even torch.clone(y.clone()), nothing works, I get the RunTime error even though I only do unsqueeze and expand operation on y.
On my full Custom Layer, the only way I found to make it run, is by using y.detach(), but I’m afraid that, a part of my custom layer that is directly using value of y will not be considered in the backward and gradient computation operation.
I have two questions :
- Is there a problem using y.detach() in my final Custom Layer, specially in the gradient computation ?
- Is there a solution to my problem, is this a problem with some upper code (I don’t know, maybe by using DistributedDataParallel or other) ?
Thank you in advance and have a nice day !
TV