Clamping leads to CUDA out of memory, but clamping .data works. Why?

I have a standard dataloader which loads images.
On top of every image I want to add a static tensor,
and then clamp the result to (0,1).
This new image is used to train a model.

The following code roughly shows the important steps.
(Everything is on the GPU.)

import torch

static_tensor = torch.load(path)
for img, label in dataloader:
  img, label = img.cuda(), label.cuda()
  addition_tensor = img + static_tensor
  clamped_tensor = addition_tensor.clamp(0, 1)
  output = model(clamped_tensor)
  loss = criterion(output, label)

This causes out-of-memory errors.

But if I change the clamping into

clamped_tensor = addition_tensor.data.clamp(0,1)

it no longer creates this error.

What is the reason behind this?

Edit: I noticed that my static tensor has requires_grad = True.

Am I correct to assume that this tensor keeps the autograd graph attached to everything computed from it, and that this graph gets bigger with every iteration of the for loop?


Your static tensor should not have requires_grad=True, I guess.
When it does, every op that uses it is tracked by autograd, and that increases memory usage because we need to save some values for the backward pass.
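
To make this concrete, here is a minimal sketch of how the flag changes what autograd records. The tensors are hypothetical stand-ins for the ones in the question (small random tensors on the CPU, which is enough to show the effect), not the actual data:

```python
import torch

# Hypothetical stand-ins for the question's tensors; CPU is enough to show the effect.
img = torch.rand(3, 8, 8)
static_tensor = torch.rand(3, 8, 8, requires_grad=True)

# With requires_grad=True, the add and the clamp are both recorded by autograd.
tracked = (img + static_tensor).clamp(0, 1)
print(tracked.requires_grad, tracked.grad_fn is not None)

# Turning the flag off on the (leaf) static tensor stops the recording.
static_tensor.requires_grad_(False)
untracked = (img + static_tensor).clamp(0, 1)
print(untracked.requires_grad, untracked.grad_fn)
```

In the second case the result carries no grad_fn, so nothing extra has to be kept around for a backward pass through the addition and the clamp.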

Note that you should never use .data in general!
Here you can use .detach(), which behaves similarly (it gives you a tensor that is not tracked by autograd) but avoids weird side effects!
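
As an illustration of the kind of side effect meant here, the sketch below modifies a tensor in place through .data and through .detach(). The helper name grad_after_zeroing is made up for this example. It relies on sigmoid saving its output for the backward pass:

```python
import torch

def grad_after_zeroing(use_data):
    # Hypothetical helper: zero a sigmoid output through an alias, then backprop.
    a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
    out = a.sigmoid()  # sigmoid saves its output for the backward pass
    alias = out.data if use_data else out.detach()
    alias.zero_()  # in-place change to memory shared with `out`
    out.sum().backward()
    return a.grad

# Through .data the change is invisible to autograd: the gradient is silently wrong.
print(grad_after_zeroing(use_data=True))

# Through .detach() autograd notices the in-place change and raises instead.
try:
    grad_after_zeroing(use_data=False)
except RuntimeError as err:
    print("autograd caught the in-place change:", type(err).__name__)
```

With .data the backward pass runs on the zeroed values and returns an all-zero gradient without any warning, while .detach() fails loudly, which is the safer behavior.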