Questions about the to() method for objects and tensors, and how to make code transparent between CPU and GPU


I have some questions about the to() method in PyTorch when passing a device as the argument.

I have a class abc

import torch
import torch.nn as nn

class abc(nn.Module):
    def __init__(self):
        super(abc, self).__init__()
        self.linear2 = nn.Linear(10, 20)
        self.linear1 = nn.Linear(20, 30)
        self.a = nn.Parameter(torch.randn(5))  # extra parameter kept in a plain list
        self.a_parameters = [self.a]

    def forward(self, inp):
        h = self.linear1(inp)
        return h

Then an object

net = abc()

two devices are available

gdevice = torch.device('cuda')
cdevice = torch.device('cpu')

And I also have two tensors

x = torch.randn(3,4,device=cdevice)
gx = torch.randn(3,4,device=gdevice)

Now, x is on the CPU device and gx is on the GPU device.

###Q1. If I assign x to the CPU device (note that x is already on the CPU), e.g. y = x.to(cdevice), are x and y the same tensor? I mean, are x and y the same tensor with the same memory, and only the name is different (like a reference)? If so, does it mean we only add another name for x without allocating extra memory?

###Q2. Similar to Q1, if gy = gx.to(gdevice), does it mean we only add another name for gx without allocating extra memory for gy?

###Q3. One strange thing is, if I send net to the GPU by gnet = net.to(gdevice), net will also be on the GPU device:

In [98]: net = abc()

In [99]: net.linear1.weight.device
Out[99]: device(type='cpu')

In [100]: gnet = net.to(gdevice)

In [101]: gnet.linear1.weight.device
Out[101]: device(type='cuda', index=1)

In [102]: net.linear1.weight.device
Out[102]: device(type='cuda', index=1)

However, if I use to() on a tensor, the original tensor is not moved to the new device:

In [107]: x = torch.randn(3,4)

In [108]: x.device
Out[108]: device(type='cpu')

In [109]: gx = x.to(gdevice)

In [110]: gx.device
Out[110]: device(type='cuda', index=1)

In [111]: x.device
Out[111]: device(type='cpu')

Can anyone explain this difference between tensors and model objects?

###Q4. How can I make PyTorch code transparent between a CPU-only computer and a GPU-equipped machine? My idea is to use a device variable:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

And use

net = Model().to(device)

where Model() is defined on the CPU by default.
What I am not sure about is whether this will allocate another copy of memory for net if the model is created on the CPU and device is also the CPU, or incur some extra computation when the device is the same.
Is there a better way? I could not find a solution on the forum or the internet.

Thanks very much!


Q1 + Q2: if you don’t call .clone() or manually make a deepcopy, PyTorch tries to reuse the same storage whenever possible. In fact, if the tensor already has the requested device and dtype, .to() simply returns the tensor itself. So the answer to both questions: usually the same storage will be used.
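You can check this yourself on a CPU-only machine (a minimal sketch; the GPU-to-same-GPU case in Q2 behaves the same way):

```python
import torch

x = torch.randn(3, 4)          # already on the CPU
y = x.to(torch.device('cpu'))  # requested device (and dtype) already match

# .to() returned the very same tensor object, not a copy:
print(y is x)                        # True
print(y.data_ptr() == x.data_ptr())  # True: same underlying storage
```

So no extra memory is allocated; y is literally another name for x.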

Q3: for instances of torch.nn.Module, the changes are made on the self variable (Python passes object references), so you wouldn’t have to reassign it at all. After this operation the self variable is returned so the call can be chained. Since net and gnet are references to the same object, changing one of them also changes the other.
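A small sketch of that difference (using the CPU here so it runs anywhere):

```python
import torch
import torch.nn as nn

net = nn.Linear(4, 2)

# nn.Module.to() moves the parameters in place and returns self,
# so the "new" module is the same object as the old one:
gnet = net.to(torch.device('cpu'))
print(gnet is net)  # True

# Tensor.to(), in contrast, returns a *new* tensor whenever the
# device (or dtype) actually has to change.
```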

If the model is already on the device you want to push it to, the .to() operation becomes a no-op (the same holds for "changing" to the dtype it already has, or calling .cuda() or .cpu() directly).
So yes: the method you suggested is the usual way to go, and CPU and GPU cover nearly the same set of operations (as long as you avoid very experimental ones) if you only use torch functions.
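Putting it together, a minimal device-agnostic sketch (the model and the sizes are just placeholders):

```python
import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 20).to(device)     # no-op on a CPU-only machine
inp = torch.randn(3, 10, device=device)  # create data directly on the device

out = model(inp)
print(out.shape)   # torch.Size([3, 20])
print(out.device)  # matches `device` on both kinds of machines
```

The same script runs unchanged on a CPU-only computer and on a GPU machine; only the value of device differs.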


Got it! Thanks very much!