This probably generalizes to other loss functions, e.g. L2Loss…
I mentioned cuDNN because I’m getting a CUDA error (below) when calling loss.backward(), and I thought it might be related to the mismatched shapes passed to L1Loss.
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /tmp/pip-hnxetqp4-build/aten/src/THC/generic/THCTensorMath.cu:26
That’s possible. If a CUDA kernel uses those two tensors and assumes they’re the same size, it may index out of bounds and access illegal memory.
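A minimal sketch of the failure mode being discussed: `nn.L1Loss` will broadcast mismatched input/target shapes rather than error out, which can mask the real bug (and on older builds was implicated in crashes like the one above). The shapes here are illustrative, not taken from the original poster's code; the simplest defense is to check shapes explicitly before computing the loss:

```python
import torch
import torch.nn as nn

loss_fn = nn.L1Loss()

pred = torch.randn(4, 1)
target = torch.randn(4)  # shape (4,) vs pred's (4, 1): would broadcast to (4, 4)

# Guard against silent broadcasting before calling the loss.
# Here the element counts match, so reshaping is safe; in real code
# you'd usually fix the shape upstream instead.
if pred.shape != target.shape:
    target = target.view_as(pred)

loss = loss_fn(pred, target)
```

Running this on CPU first is also a cheap way to turn an opaque "illegal memory access" into a readable Python-level error or warning.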