HELP! Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

I use nn.Parameter() to create the relative position bias table, but the bias embedding built from it can't be moved to the GPU. The code is attached below, and I can't figure out why.


def __init__(self, ...):
    # Learnable table of relative position biases: one row per relative
    # offset, one column per attention head.
    self.relative_bias_position_table = nn.Parameter(
        torch.zeros((2 * window_size[0] - 1) * (2 * window_size[1] - 1), self.num_heads))
    nn.init.trunc_normal_(self.relative_bias_position_table, std=0.02)

    # Pairwise relative coordinates inside the window, shifted to be
    # non-negative and collapsed into a single index per position pair.
    coords = torch.stack(torch.meshgrid(torch.arange(window_size[0]), torch.arange(window_size[1])))
    coords_flatten = torch.flatten(coords, 1)  # [2, h*w]
    relative_coords_bias = coords_flatten[:, :, None] - coords_flatten[:, None, :]
    relative_coords_bias[0, :, :] += window_size[0] - 1
    relative_coords_bias[1, :, :] += window_size[1] - 1
    relative_coords_bias[0, :, :] *= relative_coords_bias[1, :, :].max() + 1
    bias_index = relative_coords_bias.sum(0)
    self.register_buffer("bias_index", bias_index)

    # Precompute the bias embedding by indexing the parameter and cache it
    # as a plain tensor attribute.
    self.bias_embedding = self.relative_bias_position_table[self.bias_index.view(-1)].view(
        self.window_size[0] * self.window_size[1], self.window_size[0] * self.window_size[1], self.num_heads)
    self.bias_embedding = self.bias_embedding.permute(2, 0, 1).contiguous().unsqueeze(0)

def forward(self, x):
    ...
    attn = attn + self.bias_embedding  # this line raises the device error

File "D:\ProgramFile\anaconda\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\yzw\TEST\model.py", line 130, in forward
    attn = attn + self.bias_embedding
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Based on your code snippet, it seems you are creating non-leaf tensors by applying differentiable operations to the parameter. Create the tensor first and wrap it in an nn.Parameter afterwards, without calling any further operations on it. Once this is fixed, model.to() should move the parameter to the device.
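
A minimal sketch of one way to follow this advice, assuming the module looks roughly like the posted snippet (the class name WindowAttentionBias and the attn argument are placeholders I made up): keep only leaf state on the module, i.e. the parameter and the registered buffer, and index the table inside forward() instead of caching the non-leaf result in __init__, so the lookup runs on whatever device model.to() has moved the module to.

import torch
import torch.nn as nn

class WindowAttentionBias(nn.Module):
    def __init__(self, window_size, num_heads):
        super().__init__()
        self.window_size = window_size
        self.num_heads = num_heads

        # Leaf parameter: registered on the module, so model.to() moves it.
        self.relative_bias_position_table = nn.Parameter(
            torch.zeros((2 * window_size[0] - 1) * (2 * window_size[1] - 1), num_heads))
        nn.init.trunc_normal_(self.relative_bias_position_table, std=0.02)

        # Same index construction as in the question; indexing="ij" matches
        # the old meshgrid default and silences the deprecation warning.
        coords = torch.stack(torch.meshgrid(
            torch.arange(window_size[0]), torch.arange(window_size[1]), indexing="ij"))
        coords_flatten = torch.flatten(coords, 1)  # [2, h*w]
        relative_coords_bias = coords_flatten[:, :, None] - coords_flatten[:, None, :]
        relative_coords_bias[0, :, :] += window_size[0] - 1
        relative_coords_bias[1, :, :] += window_size[1] - 1
        relative_coords_bias[0, :, :] *= relative_coords_bias[1, :, :].max() + 1
        # Buffer: also registered state, so model.to() moves it as well.
        self.register_buffer("bias_index", relative_coords_bias.sum(0))

    def forward(self, attn):
        # Index the table here instead of in __init__: the result is created
        # on the same device as the parameter, hence on the same device as attn.
        n = self.window_size[0] * self.window_size[1]
        bias_embedding = self.relative_bias_position_table[
            self.bias_index.view(-1)].view(n, n, self.num_heads)
        bias_embedding = bias_embedding.permute(2, 0, 1).contiguous().unsqueeze(0)
        return attn + bias_embedding

With this layout, state_dict() still contains both relative_bias_position_table and bias_index, and a plain model.to("cuda") is enough for the addition in forward() to run entirely on the GPU.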