Using grid_sample on multiple GPUs

Hi, I have a model with a grid_sample layer. I tried to train the model on multiple GPUs, but got the following error:

RuntimeError: grid_sampler(): expected input and grid to be on same device, but input is on cuda:1 and grid is on cuda:0

Is there any way to use this layer on multiple GPUs? Thanks!

The input and grid should be on the same device.
If you are creating one of these tensors manually inside the forward method or passing it to forward, make sure to transfer it to the same device, e.g. by using:

grid = grid.to(x.device)
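For example, a minimal sketch (the Sampler module, the shapes, and the affine_grid call are just made up for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Sampler(nn.Module):
    def __init__(self, grid):
        super().__init__()
        self.grid = grid  # plain attribute, will NOT be moved by model.to()

    def forward(self, x):
        # push the grid to the device of the incoming tensor before sampling
        grid = self.grid.to(x.device)
        return F.grid_sample(x, grid, align_corners=False)

x = torch.randn(2, 3, 8, 8)
theta = torch.eye(2, 3).unsqueeze(0).repeat(2, 1, 1)   # identity transform
grid = F.affine_grid(theta, x.size(), align_corners=False)
out = Sampler(grid)(x)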

Thanks for the reply. I got a segmentation fault when moving either the grid or the input to the same device via input = input.to(grid.get_device()).

My grid is actually the same for all inputs, so I stored it via self.grid = grid and call grid_sample(input, self.grid). Do you think this causes the problem? Passing the grid to forward every time seems inefficient to me.

Try to use grid.device instead.

It might be, but you should stick to your workflow, as I was just using it as an example. :wink:

You could also try to register self.grid as a buffer via self.register_buffer, which would move the tensor automatically when calling model.to().
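Something like this should work (a rough sketch; the Sampler module and the grid creation are assumptions for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

class Sampler(nn.Module):
    def __init__(self, grid):
        super().__init__()
        # buffers are part of the module's state and follow model.to()/.cuda()
        self.register_buffer('grid', grid)

    def forward(self, x):
        return F.grid_sample(x, self.grid, align_corners=False)

x = torch.randn(2, 3, 8, 8)
theta = torch.eye(2, 3).unsqueeze(0).repeat(2, 1, 1)
grid = F.affine_grid(theta, x.size(), align_corners=False)

model = Sampler(grid)
if torch.cuda.is_available():
    model.cuda()                 # the grid buffer is moved along with the model
    print(model.grid.device)     # e.g. cuda:0

Since nn.DataParallel replicates the module (parameters and buffers) onto each device, the registered grid should then also land on the right GPU in each replica.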

register_buffer solves my problem. The segmentation fault actually came from another part of the code. It seems that when training on multiple GPUs we cannot call .cuda() during the forward pass, so everything should be registered as a buffer.

Thanks so much for your help!

You could call .cuda() or .to(), but you should specify the right device to push the tensor to.
E.g. if you would like to create some tensors inside the forward method, you could use the device of a buffer/parameter or of the incoming tensor to create the new one.
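A small sketch of what I mean (the module and tensor names are just for illustration):

import torch
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer('grid', torch.zeros(1))   # placeholder buffer

    def forward(self, x):
        # create new tensors on the right device instead of calling .cuda()
        noise = torch.randn_like(x)                      # same device/dtype as the input
        offset = torch.ones(1, device=self.grid.device)  # device taken from a buffer
        return x + noise * offset

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = MyModule().to(device)
out = model(torch.randn(2, 3, device=device))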

However, if self.grid is treated as an attribute of the model, registering it as a buffer is the cleaner and better approach. :wink:

Right, that makes so much sense now. Thanks!