Hi PyTorch community,
I am designing a network with a custom layer, in which I am using a third-party framework that requires numpy arrays. Since my layer was not behaving as expected, I tried to find the problem by drastically simplifying it.
At this point I am simply echoing the input to my layer back into the network: all parameters' requires_grad are set to False, and all custom gradients return None. So I would assume that my custom layer should have zero impact on training, compared to the performance of the same network without this layer at all. All I do is convert a tensor to numpy and back, and yet there is a palpable negative impact on training convergence.
A toy model of what I am attempting looks something like this, following the official tutorial:
def forward(self, x):
    x = torch.tanh(self.fc_1(x))
    ...
    # Disassemble the batch into individual data points
    x_out = []
    for x_in in x:
        # --- This stands in for the output of the custom layer ---
        x_numpy = x_in.detach().numpy()
        # ---------------------------------------------------------
        x_out.append(x_in.new(x_numpy).reshape(1, -1))
    # Assemble them back into a batch
    x = torch.cat(x_out, dim=0)
    ...
    return torch.tanh(self.fc_n(x))
Trying the same construction without the conversion to numpy has zero influence on training. Is there a way to convert a numpy.array to a torch.tensor in the middle of your network without a training penalty?
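For completeness, the custom layer I mean follows the extending-autograd pattern from the tutorial. A minimal sketch (the class name NumpyEcho is a placeholder for my real layer; the forward just echoes the input through numpy, and the backward returns None, as described above):

```python
import torch

class NumpyEcho(torch.autograd.Function):
    """Echo the input through numpy; the custom gradient returns None."""

    @staticmethod
    def forward(ctx, x):
        x_numpy = x.detach().numpy()          # step outside the autograd graph
        return torch.from_numpy(x_numpy.copy())

    @staticmethod
    def backward(ctx, grad_output):
        return None                           # no gradient for the input
```

Used as `y = NumpyEcho.apply(x)` in place of the detach/numpy/new block in the toy model.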