Automatically choose GPU

Is there any way for PyTorch to pick the GPU automatically, without moving every created tensor to it with .cuda()/.to(device)?
TensorFlow places tensors on the GPU automatically; is there a reason why PyTorch doesn't do this?

Something like this is discussed in Issue 27878.
The thing is that it introduces a global context, which isn't regarded that favourably. There are also concerns about backward compatibility.

If you want to show off your non-conforming hacker nature, you could do

import functools
import inspect

import torch

def force_device(device):
    # Monkey-patch every torch factory function whose docstring
    # advertises a device= keyword so that it defaults to `device`.
    for name in [n for n, f in inspect.getmembers(torch)
                 if inspect.isbuiltin(f) and f.__doc__ is not None
                 and 'device=' in f.__doc__]:
        oldfn = getattr(torch, name)
        def factory(*args, oldfn=oldfn, **kwargs):  # bind oldfn per iteration
            if kwargs.get('device') is None:
                kwargs['device'] = device
            return oldfn(*args, **kwargs)
        functools.update_wrapper(factory, oldfn)
        setattr(torch, name, factory)
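As a sanity check, the patch can be exercised end to end. This sketch repeats the wrapper so it is self-contained, and uses the 'meta' device (which allocates no real memory) so it runs without a GPU:

```python
import functools
import inspect

import torch

def force_device(device):
    # wrap every torch factory whose docstring advertises a device= kwarg
    for name in [n for n, f in inspect.getmembers(torch)
                 if inspect.isbuiltin(f) and f.__doc__ is not None
                 and 'device=' in f.__doc__]:
        oldfn = getattr(torch, name)
        def factory(*args, oldfn=oldfn, **kwargs):  # bind oldfn per iteration
            if kwargs.get('device') is None:
                kwargs['device'] = device
            return oldfn(*args, **kwargs)
        functools.update_wrapper(factory, oldfn)
        setattr(torch, name, factory)

force_device(torch.device('meta'))
t = torch.zeros(2, 3)
assert t.device.type == 'meta'   # the default device was injected
u = torch.ones(2, device='cpu')
assert u.device.type == 'cpu'    # an explicit device= still wins
```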

Best regards

Thomas


Thank you for your answer @tom

Basically, I moved my model, the images, and the labels to a device that is passed as an argument to my main script. Now I have methods in my model that create intermediate tensors (torch.zeros or torch.ones, for example), and I want to allocate them on the model's device. Unfortunately, these tensors are not parameters, so moving the model to the device doesn't move them along with it. So far I pass the device as an argument to the model so that its methods can use it. Is there a better way than passing the device object into every method that creates tensors?

So the best practice depends on the situation:

  • if they’re created once and then kept around and possibly updated, use buffers (m.register_buffer(…)). The optimizers won’t care about them, but they’ll be part of the state dict and be saved and loaded etc. They will also be moved/cast along with the model in m.to(…). This is, for example, how batch norm tracks its running statistics.
  • if you create tensors anew while processing input, it is customary to take the device (and dtype) from either the input or “any parameter” (next(m.parameters())).
  • There actually is another way: tensors have .new_...(...) methods that some people like to use to get the device automatically. I must admit that to my mind this isn’t worth it, and it overemphasizes the connection between the existing and the new tensor, so I would recommend using the device= (and dtype=) keyword arguments instead. That also seems more in the spirit of “there should be preferably only one way to do it” from the Zen of Python. Note that .new itself is something different and officially deprecated, but .new_... is considered OK by everyone but me.
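The first two options can be sketched together in a minimal module; the class, its names, and the scale buffer are made up for illustration:

```python
import torch
from torch import nn

class Scaled(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(dim))
        # a persistent tensor: in the state dict, moved/cast by m.to(...),
        # but ignored by optimizers
        self.register_buffer('scale', torch.ones(dim))

    def forward(self, x):
        # an intermediate tensor created on the fly: take device/dtype
        # from the input (or from next(self.parameters()))
        offset = torch.zeros(x.shape[-1], device=x.device, dtype=x.dtype)
        return x * self.weight * self.scale + offset

m = Scaled(4)
out = m(torch.randn(2, 4))
```

Calling m.to(device) would move weight and scale together, while offset always follows whatever device the input lives on.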

Best regards

Thomas
