# What is `in-place operation`?

I came across the term in-place operation in the
documentation (http://pytorch.org/docs/master/notes/autograd.html). What does it mean?

Hi,

An in-place operation is an operation that directly changes the content of a given Tensor without making a copy. In-place operations in PyTorch are always postfixed with a _, like .add_() or .scatter_(). Python operations like += or *= are also in-place operations.
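A minimal sketch of the naming convention described above, assuming a recent PyTorch release. It compares `add` with `add_` and uses `data_ptr()` (the address of the tensor's underlying storage) to check whether the result lives in the same memory. The variable names are only illustrative.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
y = torch.tensor([10.0, 20.0, 30.0])

# Out-of-place: add() returns a new tensor and leaves x untouched.
z = x.add(y)
out_of_place_shares_memory = z.data_ptr() == x.data_ptr()  # False

# In-place: add_() writes the result into x's own storage and returns x.
ptr_before = x.data_ptr()
returned = x.add_(y)
in_place_keeps_memory = x.data_ptr() == ptr_before  # True
returns_same_object = returned is x                 # True

print(x)  # tensor([11., 22., 33.])
```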

I initially found in-place operations in the following PyTorch tutorial:

>>> import torch

>>> x = torch.rand(1)
>>> x

0.2362
[torch.FloatTensor of size 1]

>>> y = torch.rand(1)
>>> y

0.7030
[torch.FloatTensor of size 1]

# Addition of two tensors creates a new tensor.
>>> x + y

0.9392
[torch.FloatTensor of size 1]

# The value of x is unchanged.
>>> x

0.2362
[torch.FloatTensor of size 1]

# An in-place addition modifies one of the tensors itself, here the value of x.
>>> x.add_(y)

0.9392
[torch.FloatTensor of size 1]

>>> x

0.9392
[torch.FloatTensor of size 1]

So in this tutorial, the way the network is constructed in the forward method is an in-place operation, right?
http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py

I understand that x.add_(y) is an in-place operation.
Is x = x + y in-place, and will it cause any problem for autograd?

Hi,

No, x = x + y is not in-place. x += y is in-place.
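On the autograd part of the question, here is a hedged sketch of one case where the difference matters, assuming a recent PyTorch release. `exp()` is used only as an example of an operation that keeps its output for the backward pass; the exact wording of the error message may vary between versions.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Out-of-place: b = b + 1 builds a new tensor, so the output that exp()
# saved for the backward pass is left untouched.
b = a.exp()
b = b + 1
b.sum().backward()
grad_ok = a.grad.clone()  # equals exp(a)

# In-place: b += 1 overwrites the output that exp() saved for backward.
a.grad = None
b = a.exp()
b += 1
try:
    b.sum().backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)
```

Autograd detects that a tensor it needs was modified and raises an error instead of silently computing a wrong gradient.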

What is the difference? Don't they both modify x?

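Yes, true, they both change x. But the in-place operation does not allocate new memory for x: x = x + y creates a new tensor and rebinds the name x to it, whereas x += y writes the result into the existing tensor.

E.g., normal operation vs. in-place operation: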

>>> x = torch.rand(1)
>>> y = torch.rand(1)
>>> x
tensor([0.2738])
>>> id(x)
140736259305336
>>> x = x + y   # Normal operation
>>> id(x)
140726604827672 # New location
>>> x += y
>>> id(x)
140726604827672 # Existing location used (in-place)
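Since `id()` values can be hard to read, the same point can be sketched with a second name for the tensor. This is an illustrative example, and `alias` and `alias2` are names introduced here, not part of the original post.

```python
import torch

x = torch.tensor([1.0])
y = torch.tensor([2.0])
alias = x  # second name for the same tensor object

x = x + y  # builds a new tensor and rebinds the name x to it
print(alias)  # tensor([1.])  the original tensor is untouched
rebinding_made_new_tensor = x is not alias

alias2 = x
x += y     # writes into the tensor that x already refers to
print(alias2)  # tensor([5.])
still_same_tensor = x is alias2
```

Any other reference to the tensor sees the in-place update but not the rebinding.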

Thanks, that makes sense.

Thanks, this is a good explanation.

Which one is faster, the in-place or the normal operation?

That depends on what you mean by faster.
Both do the same amount of computation, so that part does not change.
But since an in-place operation makes fewer memory accesses, it can lead to a speedup if your task is bound by memory bandwidth (which is quite often the case on GPU).
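A rough way to check this on your own machine is sketched below. The helper `time_it`, the tensor size, and the repeat count are all arbitrary choices made for illustration, and this runs on CPU; the result depends heavily on hardware, so no particular outcome is implied.

```python
import time
import torch

n = 1_000_000
x = torch.ones(n)
y = torch.ones(n)

def time_it(fn, repeats=20):
    """Average wall-clock time of fn() over a number of calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

# Out-of-place: every call allocates a fresh result tensor.
out_of_place = time_it(lambda: x + y)

# In-place: every call writes into x's existing storage.
in_place = time_it(lambda: x.add_(y))

print(f"out-of-place: {out_of_place:.6f} s, in-place: {in_place:.6f} s")
```

On GPU the calls are asynchronous, so a fair measurement would also need `torch.cuda.synchronize()` before reading the clock.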

This experiment makes me think yes since a_clone has <MulBackward0> as meta-data.

Let me know if this conclusion is correct:

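def clone_playground():
    import torch

    # a is not defined in the original post; this value is inferred from the output below.
    a = torch.tensor([1., 2., 3.], requires_grad=True)
    a_clone = a.clone()
    print(f'a is a_clone = {a is a_clone}')
    print(f'a == a_clone = {a == a_clone}')
    print(f'a = {a}')
    print(f'a_clone = {a_clone}')
    #a_clone.fill_(2)
    a_clone.mul_(2)  # in-place multiplication on the clone
    print(f'a = {a}')
    print(f'a_clone = {a_clone}')
    a_clone.sum().backward()

clone_playground()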

output:

a is a_clone = False
a == a_clone = tensor([True, True, True])
a = tensor([1., 2., 3.], requires_grad=True)
a_clone = tensor([1., 2., 3.], grad_fn=<CloneBackward>)
a = tensor([1., 2., 3.], requires_grad=True)
a_clone = tensor([2., 4., 6.], grad_fn=<MulBackward0>)
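One way to probe the experiment a little further is to look at the gradient that reaches `a`. The sketch below assumes the same values as the output above and a recent PyTorch release; it also tries the same in-place call directly on `a`, which is a leaf tensor that requires grad.

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
a_clone = a.clone()
a_clone.mul_(2)  # in-place on the non-leaf clone, recorded by autograd
a_clone.sum().backward()

print(a.grad)  # tensor([2., 2., 2.])

# The same in-place call on the leaf tensor itself is rejected by autograd.
try:
    a.mul_(2)
    leaf_error = None
except RuntimeError as err:
    leaf_error = str(err)

print(leaf_error)
```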

from here: Clone and detach in v0.4.0

I’ll answer there to keep all the discussion in a single place.

Best practice: avoid in-place operations unless you need them, because they change the state of a tensor silently. An out-of-place operation writes its result to a new tensor and leaves its input alone. So an in-place operation inside a function changes the caller's tensor, while an out-of-place operation does not change it unless you assign the returned value back to it outside the function.

e.g.

import torch

def inplace_op(X):
    X += 1  # in-place: modifies the tensor that was passed in
    return X

X = torch.rand(4, 2)
inplace_op(X)  # X is changed without being reassigned
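To make the contrast concrete, here is a small sketch that puts the two variants side by side. `out_of_place_op` is a hypothetical counterpart added for illustration, and zeros are used instead of random values so the effect is easy to see.

```python
import torch

def inplace_op(t):
    t += 1      # mutates the caller's tensor
    return t

def out_of_place_op(t):
    t = t + 1   # rebinds the local name only; the caller's tensor is untouched
    return t

X = torch.zeros(4, 2)

result = out_of_place_op(X)
after_out_of_place = X.clone()  # snapshot: still all zeros

inplace_op(X)
print(X)  # now all ones, although X was never reassigned
```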

Summary of inplace operations:

• x *= 3
• X[…] = …

Common examples of inplace operations:

• x += 1, x *= 3, …
• x[2] = 2, X[0, 0] = 2
• x[:, 3] = 3
• X[2] /= 5
• X[:, 3] /= 4
• X[:, 3] = X[:, 3] / 4

These are not inplace operations:

• x = x + 1
• y = x.clone(); y[0] += 100 (in-place on the copy y, so x is unchanged)
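The lists above can be checked with a short sketch like the following, which assumes a recent PyTorch release and uses `data_ptr()` to confirm that indexed assignments keep writing into the same storage.

```python
import torch

X = torch.ones(3, 4)
ptr = X.data_ptr()

X[:, 3] = 3            # indexed assignment writes into X
X[2] /= 5              # in-place division on one row
X[:, 3] = X[:, 3] / 4  # the right-hand side is a temporary, but the assignment still writes into X
same_storage = X.data_ptr() == ptr  # True

y = X.clone()          # independent copy with its own storage
y[0] += 100            # in-place on y only
print(X[0])  # tensor([1.0000, 1.0000, 1.0000, 0.7500])
print(y[0, 0])  # tensor(101.)
```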

Your examples saved me days of debugging. Thanks!