I’m at a bit of a loss as to why I get an error when I run the following model:
import torch

class TestModel(torch.nn.Module):
    def __init__(self):
        super(TestModel, self).__init__()
        self.layer1 = torch.nn.Linear(1000, 100)
        self.relu = torch.nn.ReLU()
        self.layer2 = torch.nn.Linear(100, 1)

    def forward(self, x):
        x - self.layer1(x)
        x = self.relu(x)
        x = self.layer2(x)
        return x
I get the error when I run the following lines:
model = TestModel()
x = torch.randn(5, 1000)
model(x)
And yet, when I pass the same input through a single nn.Linear layer on its own, I get no error:
linear = torch.nn.Linear(1000, 100)
linear(x)
I have been staring at the monitor and scratching my head over this. The similar errors I found online came from people passing mismatched input and target shapes to the MSE loss, which is not the case here.
The error also goes away when I change layer1 to:
self.layer1 = torch.nn.Linear(1000, 1000)
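One detail in the model above may explain why this change hides the error: the first statement in forward is written as x - self.layer1(x), a subtraction, rather than an assignment with =. An elementwise subtraction requires the two shapes to be broadcast-compatible, which holds for (5, 1000) against (5, 1000) but not against (5, 100). The sketch below models only that shape rule in plain Python; `broadcast_shape` is a hypothetical helper written for illustration, not a PyTorch function, and whether this is the actual cause is an assumption, since the error text itself is not shown.

```python
def broadcast_shape(a, b):
    """Return the shape two operands broadcast to, or raise ValueError.

    Hypothetical helper: it mirrors the usual trailing-dimension
    broadcasting rule, but it is not part of PyTorch.
    """
    result = []
    # Compare dimensions from the last one backwards, padding with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} do not broadcast: {da} vs {db}")
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


x_shape = (5, 1000)

# Linear(1000, 1000): the output shape matches x, so x - layer1(x) is legal.
print(broadcast_shape(x_shape, (5, 1000)))

# Linear(1000, 100): the last dimensions are 1000 and 100, so it fails.
try:
    broadcast_shape(x_shape, (5, 100))
except ValueError as err:
    print(err)
```

If that is what is happening, the result of the subtraction is also discarded, so layer1 never feeds the rest of the network even when the shapes happen to match.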
But I don’t want the output of the linear layer to have the same size as its input. In fact, the number of output units should not depend on the size of the input data: the layer simply multiplies a 5x1000 matrix by a 1000x100 one, which gives a 5x100 result.
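The shape arithmetic in that last sentence can be checked without PyTorch. This is a minimal sketch that tracks shapes only; `linear_output_shape` is a hypothetical helper, not a library call, and it assumes a linear layer changes only the last dimension of its input.

```python
def linear_output_shape(input_shape, in_features, out_features):
    """Shape a linear layer would produce for an input of `input_shape`.

    Hypothetical helper: only the last dimension changes, and it must
    equal `in_features` for the matrix product to be defined.
    """
    if input_shape[-1] != in_features:
        raise ValueError(
            f"last dimension {input_shape[-1]} does not match in_features {in_features}"
        )
    return input_shape[:-1] + (out_features,)


# (5, 1000) times (1000, 100) gives (5, 100), then (100, 1) gives (5, 1).
hidden = linear_output_shape((5, 1000), 1000, 100)
output = linear_output_shape(hidden, 100, 1)
print(hidden, output)
```

Under this rule the intended model is shape-consistent, which suggests the error comes from how the layers are wired in forward rather than from the layer sizes.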