This probably generalizes to other loss functions, e.g. L2Loss…
I mentioned cuDNN because I’m getting a CUDA error (below) when calling loss.backward(), and I thought it might be related to the mismatched shapes passed to L1Loss.
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /tmp/pip-hnxetqp4-build/aten/src/THC/generic/THCTensorMath.cu:26
That’s possible. If a CUDA kernel uses those two tensors and assumes they’re the same size, it may index out of bounds and access illegal memory.
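A minimal sketch of the failure mode being discussed: `nn.L1Loss` will broadcast mismatched input/target shapes rather than error out, which can mask the real bug (and on older builds was implicated in crashes like the one above). The shapes here are illustrative, not taken from the original poster's code; the simplest defense is to check shapes explicitly before computing the loss:

```python
import torch
import torch.nn as nn

loss_fn = nn.L1Loss()

pred = torch.randn(4, 1)
target = torch.randn(4)  # shape (4,) vs pred's (4, 1): would broadcast to (4, 4)

# Guard against silent broadcasting before calling the loss.
# Here the element counts match, so reshaping is safe; in real code
# you'd usually fix the shape upstream instead.
if pred.shape != target.shape:
    target = target.view_as(pred)

loss = loss_fn(pred, target)
```

Running this on CPU first is also a cheap way to turn an opaque "illegal memory access" into a readable Python-level error or warning.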