tensor.clone().detach() vs tensor.detach()?

Ge0rges · April 27, 2020, 5:47pm

What’s the difference between tensor.clone().detach() and tensor.detach()? Since detach() already returns a detached version of the tensor, what is the point of cloning?

russellizadi · April 27, 2020, 8:05pm

clone() allocates new memory and copies the data into it, whereas detach() returns a tensor that shares the same underlying memory as the original.

Compare the following code:

import torch
device = torch.device("cuda")
# 10000 x 10000 float32 values, roughly 0.4 GB
a = torch.randn([10000, 10000])
a = a.to(device)
print(round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
# detach() alone: no new allocation
b = a.detach()
print(round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
# clone() copies the data, so allocated memory grows
c = a.clone().detach()
print(round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')

0.4 GB
0.4 GB
0.7 GB

to this code:

import torch
device = torch.device("cuda")
a = torch.randn([10000, 10000])
a = a.to(device)
print(round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
# clone() first: the copy is allocated here
b = a.clone().detach()
print(round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
# detach() afterwards: still no new allocation
c = a.detach()
print(round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')

0.4 GB
0.7 GB
0.7 GB
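If you don't have a GPU at hand, a rough way to see the same thing is to compare data pointers on CPU tensors. This is only a minimal sketch: the names shares_b and shares_c are made up for illustration, and it assumes that comparing data_ptr() values is a good enough check for "same memory" in this simple, non-view case.

```python
import torch

a = torch.randn(4, 4)        # CPU tensor, no GPU required
b = a.detach()               # expected to reuse a's memory
c = a.clone().detach()       # expected to own a fresh copy

# data_ptr() returns the address of the first element of the tensor
shares_b = a.data_ptr() == b.data_ptr()
shares_c = a.data_ptr() == c.data_ptr()

print("detach() shares memory with a:", shares_b)
print("clone().detach() shares memory with a:", shares_c)
```

This should agree with the memory numbers above: only the clone() call costs an extra allocation.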

Ge0rges · April 28, 2020, 7:36pm

Does that mean that any modification made to the detached tensor also shows up in the original (attached) tensor?

russellizadi · April 29, 2020, 8:28pm

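That’s right: b shares memory with a, so the in-place write to b also changes a, while the cloned c is unaffected. Following the same example: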

import torch
device = torch.device("cuda")
a = torch.randn([2])
a = a.to(device)
print(a)
# b shares memory with a
b = a.detach()
print(b)
# c is an independent copy
c = a.clone().detach()
print(c)
# in-place write through b
b[0] = 1.
print(a)
print(c)

tensor([ 0.2042, -1.8436], device='cuda:0')
tensor([ 0.2042, -1.8436], device='cuda:0')
tensor([ 0.2042, -1.8436], device='cuda:0')
tensor([ 1.0000, -1.8436], device='cuda:0')
tensor([ 0.2042, -1.8436], device='cuda:0')
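One place where this sharing can bite is autograd. The sketch below is an illustration under my own assumptions, not something from the posts above: it uses sigmoid because, as far as I know, its backward pass reuses the forward output, so an in-place write through a detached alias should make backward() raise a RuntimeError, while a cloned copy can be modified freely. The variable names (alias, copy, raised) are just for this example.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Case 1: detach() only. The alias shares memory with `out`.
out = a.sigmoid()
alias = out.detach()
alias.zero_()                 # in-place write also changes `out`
try:
    out.sum().backward()      # autograd notices the modified tensor
    raised = False
except RuntimeError:
    raised = True
print("backward failed after editing the detached alias:", raised)

# Case 2: clone().detach(). The copy has its own memory.
out2 = a.sigmoid()
copy = out2.clone().detach()
copy.zero_()                  # does not touch `out2`
out2.sum().backward()         # runs normally
print("grad after editing the cloned copy:", a.grad)
```

So if you plan to modify the detached tensor in place, cloning first looks like the safer choice.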

kwea123 · February 2, 2021, 3:26pm

Apart from the difference in memory, do they share the same properties such as the value, requires_grad=False, and all other properties?
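A quick way to check this yourself, as a minimal sketch: compare the two results on a small tensor. The flag names below (same_values, same_flags and so on) are made up for the example, and the check only covers the handful of properties listed, not every attribute a tensor has.

```python
import torch

a = torch.randn(3, requires_grad=True)
b = a.detach()
c = a.clone().detach()

# Values and basic metadata
same_values = torch.equal(b, c)
same_meta = (b.dtype, b.device, b.shape) == (c.dtype, c.device, c.shape)

# Autograd-related properties
same_flags = (b.requires_grad, c.requires_grad) == (False, False)
no_history = b.grad_fn is None and c.grad_fn is None

# The one difference discussed in this thread
same_memory = b.data_ptr() == c.data_ptr()

print(same_values, same_meta, same_flags, no_history, same_memory)
```

On these checks the two come out the same except for the memory they point to.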