Why doesn't model.to(device) move the tensors of a custom layer to the same device?

Ok, this is not a good example, because I could just move it to the same device as x.

But the problem I face is that I have to define the weights myself:

self.w = Parameter(torch.zeros(out_features, in_features))

in the __init__ method.

At that point, I still do not know the input device, so my self.w ends up on the CPU while my input is on the GPU.
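A minimal sketch of the situation (the layer name MyLinear and the scale attribute are made up for illustration). Note the distinction: an attribute assigned as nn.Parameter is registered with the module, so model.to(...) moves/casts it, while a plain tensor attribute is invisible to .to(...). The example uses a dtype cast instead of a GPU so it runs anywhere:

```python
import torch
from torch import nn


class MyLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # Registered: assigning an nn.Parameter makes it part of the
        # module's state, so model.to(...) will move/cast it.
        self.w = nn.Parameter(torch.zeros(out_features, in_features))
        # NOT registered: a plain tensor attribute is not tracked by
        # the module, so model.to(...) leaves it untouched.
        self.scale = torch.ones(out_features)

    def forward(self, x):
        return torch.nn.functional.linear(x, self.w) * self.scale


m = MyLinear(3, 2)
# .to() also casts floating-point dtypes; using that here to show the
# difference without needing a GPU.
m.to(torch.float64)
print(m.w.dtype)      # torch.float64 -- the Parameter followed the module
print(m.scale.dtype)  # torch.float32 -- the plain tensor did not
```

If a tensor should follow the module but not be trained, registering it with self.register_buffer('scale', torch.ones(out_features)) makes .to(...) handle it as well.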