"Expected all tensors to be on the same device " in torch.nn.quantizable.MultiheadAttention

I’m using torch.nn.quantizable.MultiheadAttention in a model that trains on the GPU. This layer contains the following code:

k_zeros = torch.zeros((k.size(0), 1) + k.size()[2:])
...
k = torch.cat([k, k_zeros], dim=1)

which throws “RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!”, because the tensor k comes from the forward input and is on the GPU, while k_zeros is created on the CPU by default (torch.zeros with no device argument allocates on the CPU).
Is this a bug? Should device=k.device be added to the definition of k_zeros?
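
For context, a minimal sketch along these lines should reproduce the error (assuming a CUDA build of PyTorch; add_zero_attn=True takes the branch that allocates k_zeros):

import torch

# Minimal reproduction sketch (assumes a CUDA device is available).
# add_zero_attn=True exercises the code path that creates k_zeros.
mha = torch.nn.quantizable.MultiheadAttention(
    embed_dim=8, num_heads=2, add_zero_attn=True
).to("cuda")

x = torch.randn(4, 1, 8, device="cuda")  # (seq_len, batch, embed_dim)

# RuntimeError: Expected all tensors to be on the same device, ...
out, weights = mha(x, x, x)

The change I have in mind would be something like:

# Proposed fix: allocate k_zeros on the same device as k
k_zeros = torch.zeros((k.size(0), 1) + k.size()[2:], device=k.device)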