Which torchvision version are you using? mean and std are created as tensors on the same device as the input, as seen in these lines of code, and the following code works fine for me:
import torch
from torchvision import transforms

x = torch.randn(3, 224, 224).cuda()
norm = transforms.Normalize((0.5,0.5,0.5), (0.5,0.5,0.5))
out = norm(x)
print(out.device)
> cuda:0
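To illustrate why the device matches, here is a minimal sketch of the idea, not torchvision's actual code. The helper name normalize_sketch is hypothetical, and the CPU fallback is only there so the snippet also runs on a machine without a GPU:

```python
import torch


def normalize_sketch(tensor, mean, std):
    # Hypothetical helper for illustration only, not torchvision's implementation.
    # mean and std are built with the input's dtype and device, so the
    # arithmetic below never mixes CPU and GPU tensors.
    mean = torch.as_tensor(mean, dtype=tensor.dtype, device=tensor.device)
    std = torch.as_tensor(std, dtype=tensor.dtype, device=tensor.device)
    # Reshape (C,) -> (C, 1, 1) so the statistics broadcast over height and width.
    return (tensor - mean[:, None, None]) / std[:, None, None]


# Assumption: fall back to the CPU when CUDA is not available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.randn(3, 224, 224, device=device)
out = normalize_sketch(x, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
print(out.device)
```

If your installed version behaves differently, comparing out.device with x.device like this should show quickly whether the statistics end up on a different device than the input.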