I came across the term `in-place operation` in the documentation: http://pytorch.org/docs/master/notes/autograd.html. What does it mean?

Hi,

An in-place operation is an operation that directly changes the content of a given Tensor without making a copy. In-place operations in PyTorch are always postfixed with a `_`, like `.add_()` or `.scatter_()`. Python augmented-assignment operators like `+=` or `*=` are also in-place operations.

I initially found in-place operations in the following PyTorch tutorial:

### Adding two tensors

```
>>> import torch
>>> x = torch.rand(1)
>>> x
0.2362
[torch.FloatTensor of size 1]
>>> y = torch.rand(1)
>>> y
0.7030
[torch.FloatTensor of size 1]
```

### Normal addition

```
# Addition of two tensors creates a new tensor.
>>> x + y
0.9392
[torch.FloatTensor of size 1]
# The value of x is unchanged.
>>> x
0.2362
[torch.FloatTensor of size 1]
```

### In-place addition

```
# An in-place addition modifies one of the tensors itself, here the value of x.
>>> x.add_(y)
0.9392
[torch.FloatTensor of size 1]
>>> x
0.9392
[torch.FloatTensor of size 1]
```

So in this tutorial, the way the network is constructed in the `forward` method is an in-place operation, right?

http://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py

I understand that `x.add_(y)` is an in-place operation. Is `x = x + y` in-place, and will it cause any problem for autograd?

Hi,

No, `x = x + y` is not in-place. `x += y` is in-place.
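On the autograd part of the question: `x = x + y` is always safe, but an in-place op can fail at backward time if it overwrites a value that some `grad_fn` has saved. A minimal sketch (hypothetical variable names, not from the docs):

```python
import torch

a = torch.ones(3, requires_grad=True)
b = a * 2          # non-leaf intermediate
c = b.sin()        # sin() saves b, since its backward needs cos(b)
b.add_(1)          # in-place write bumps b's version counter

try:
    c.sum().backward()
except RuntimeError as e:
    # autograd detects that a tensor saved for backward was modified in place
    print(type(e).__name__)  # RuntimeError
```

Whether a particular in-place op errors out depends on whether the overwritten tensor was saved by some backward function, so autograd only complains when correctness is actually at stake.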

What is the difference? They both modify `x`, don't they?

Yes, they both modify `x`. But an in-place operation does not allocate new memory for `x`.

E.g., normal operation vs. in-place operation:

```
>>> x = torch.rand(1)
>>> y = torch.rand(1)
>>> x
tensor([0.2738])
>>> id(x)
140736259305336
>>> x = x + y # Normal operation
>>> id(x)
140726604827672 # New location
>>> x += y
>>> id(x)
140726604827672 # Existing location used (in-place)
```
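A small caveat (my addition, not from the thread above): `id()` reports the address of the Python wrapper object, which the interpreter can occasionally reuse. `Tensor.data_ptr()` looks at the underlying storage directly and shows the same effect:

```python
import torch

x = torch.rand(1)
y = torch.rand(1)

before = x.data_ptr()
x = x + y                      # out-of-place: a fresh buffer is allocated
assert x.data_ptr() != before  # new storage

before = x.data_ptr()
x += y                         # in-place: dispatches to x.add_(y)
assert x.data_ptr() == before  # same storage reused
```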

Thanks, that makes sense.

Thanks, this is a good explanation.

Which one is faster, in-place or normal operation?

That depends on what you mean by faster.

It does the same amount of computation, so that part does not change. But since there are fewer memory accesses, it can lead to a speedup if your task is bound by memory bandwidth (which is quite often the case on GPU).
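A rough micro-benchmark sketch of that claim (the tensor size is arbitrary and the numbers depend entirely on hardware, so no expected output is shown):

```python
import timeit

import torch

x = torch.rand(10_000_000)
y = torch.rand(10_000_000)

# Out-of-place: each call allocates and fills a brand-new tensor.
t_out = timeit.timeit(lambda: x + y, number=20)

# In-place: each call overwrites x's existing buffer.
t_in = timeit.timeit(lambda: x.add_(y), number=20)

print(f"out-of-place: {t_out:.4f}s  in-place: {t_in:.4f}s")
```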

Are in-place operations added to the computation graph and tracked by autograd?

This experiment makes me think yes, since `a_clone` has `<MulBackward0>` as its `grad_fn`. Let me know if this conclusion is correct:

```
import torch

def clone_playground():
    a = torch.tensor([1, 2, 3.], requires_grad=True)
    a_clone = a.clone()
    print(f'a is a_clone = {a is a_clone}')
    print(f'a == a_clone = {a == a_clone}')
    print(f'a = {a}')
    print(f'a_clone = {a_clone}')
    # a_clone.fill_(2)
    a_clone.mul_(2)
    print(f'a = {a}')
    print(f'a_clone = {a_clone}')
    a_clone.sum().backward()

clone_playground()
```

output:

```
a is a_clone = False
a == a_clone = tensor([True, True, True])
a = tensor([1., 2., 3.], requires_grad=True)
a_clone = tensor([1., 2., 3.], grad_fn=<CloneBackward>)
a = tensor([1., 2., 3.], requires_grad=True)
a_clone = tensor([2., 4., 6.], grad_fn=<MulBackward0>)
```

from here: Clone and detach in v0.4.0
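As a quick sanity check of that conclusion (a sketch with assumed values, not from the linked thread): backpropagating through the in-place `mul_` produces exactly the gradient you would expect from `sum(2 * a)`:

```python
import torch

a = torch.tensor([1., 2., 3.], requires_grad=True)
a_clone = a.clone()   # CloneBackward links a_clone back to a
a_clone.mul_(2)       # recorded as MulBackward0 on the clone
a_clone.sum().backward()
print(a.grad)         # tensor([2., 2., 2.]) -- d sum(2*a) / da = 2
```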

I’ll answer there to keep all the discussion in a single place.