Possible bug? Linear layer does not throw an error on incorrect input size when on GPU

My network threw an error during backprop because the layer shapes did not match, and I wondered how the forward pass had been able to run without raising an error. After investigating, I've discovered some weird behavior.

If the model is on the CPU, all is well:

import torch
import torch.nn as nn

model = nn.Linear(100, 1)
x = torch.randn(1, 200)  # last dim is 200, but the layer expects 100
model(x)

throws RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x200 and 100x1), as expected.

However, if I put it on the GPU:

device = 'cuda:0'
model.to(device)
model(x.to(device))  # same mismatched input, but no error is raised

it happily computes without throwing an error.
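In case it's useful to anyone else hitting this, a manual shape check before the forward pass does catch the mismatch on both devices. Below is a minimal sketch (checked_forward is just an illustrative helper I wrote, not a PyTorch API; in_features is the attribute nn.Linear already exposes):

import torch
import torch.nn as nn

def checked_forward(layer: nn.Linear, x: torch.Tensor) -> torch.Tensor:
    # Raise explicitly if the input's last dimension does not match the
    # layer, regardless of whether the backend itself would complain.
    if x.shape[-1] != layer.in_features:
        raise RuntimeError(
            f"expected last dim {layer.in_features}, got {x.shape[-1]}"
        )
    return layer(x)

device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
model = nn.Linear(100, 1).to(device)
x = torch.randn(1, 200, device=device)
checked_forward(model, x)  # raises RuntimeError on both CPU and GPU

But that shouldn't really be necessary, so: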

Is this expected behavior?

Hi Ada!

This appears to be a known bug that has recently been fixed. See this thread:

Best.

K. Frank

Thanks for linking that post! I hadn't managed to hit on the right keywords to find it myself!