Clamping a leaf tensor without using .data

Hello, I want to create a noise tensor which, when added to normal images, creates adversarial examples.

I am getting a lot of images from my dataloader and I want to add the noise to each image.
The model is pretrained so it correctly classifies the original images without the noise.
I want to train this noise so that the prediction of image + noise is wrong.

So I create a new Tensor (adv_noise) and an Adam optimizer which optimizes the noise.

But after every optimizer step I need to make sure that the values of
adv_noise stay within a given epsilon, i.e. inside [-eps, eps].

import torch
from torch.optim import Adam

class AE():
  def __init__(self, lr):
    # create random noise with the same size as one image;
    # I want to backprop into self.adv_noise so that it increases the loss
    init_noise = torch.nn.init.xavier_uniform_(torch.empty(3, 100, 100), gain=2)
    self.adv_noise = init_noise.cuda().requires_grad_(True)
    self.optimizer = Adam([self.adv_noise], lr=lr)
    self.eps = 10 / 255  # the values of adv_noise aren't allowed to leave [-eps, eps]
    """ I am also creating the model, criterion, and dataloader here """

  def train(self):
    self.model.requires_grad_(False)
    for data in self.dataloader:
      self.optimizer.zero_grad()
      image, label = data
      model_input = image.cuda() + self.adv_noise
      pred = self.model(model_input)
      loss = self.criterion(pred, label.cuda())
      loss.backward()
      self.optimizer.step()
      # up to this point this should be pretty standard,
      # but now I need to make sure self.adv_noise doesn't leave the range [-eps, eps]
      self.adv_noise.data = self.adv_noise.data.clamp(-self.eps, self.eps)

The last line here works, but I don't really know why. I've read that one shouldn't use .data anymore,
so I'd like to change this last line so it conforms to modern PyTorch standards.

How I tried to change the last line:

self.adv_noise.clamp_(-self.eps, self.eps)

Error: a leaf Variable that requires grad is being used in an in-place operation.
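
(This error isn't specific to my class; a minimal sketch with a dummy tensor reproduces it:)

import torch

x = torch.zeros(3, requires_grad=True)  # leaf tensor tracked by autograd
x.clamp_(-0.1, 0.1)                     # raises the same in-place error on a leaf that requires grad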

2nd try:

self.adv_noise = self.adv_noise.clamp(-self.eps, self.eps)

RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed. Specify retain_graph=True when calling backward the first time.

3rd try:

self.adv_noise = self.adv_noise.detach().clamp(-self.eps, self.eps)

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Like I said, the implementation above works, but I'd like to avoid using .data.

(as a side note)

self.adv_noise.data = self.adv_noise.data.clamp(-eps, eps) # this works

But

self.adv_noise = self.adv_noise.data.clamp(-eps, eps) # this doesn't work

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

t.data is essentially the same as t.detach(), so you may use the latter.

I thought the same, but as you can see I tried this in my 3rd try and it throws a runtime error. I think this happens because if I create a tensor with .detach(), gradients are no longer tracked for it, although I need them in the next loop iteration.
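
A minimal sketch of what I mean (dummy tensor, not my actual model):

import torch

x = torch.zeros(3, requires_grad=True)
y = x.detach().clamp(-0.1, 0.1)
print(y.requires_grad)    # False -- the clamped tensor is cut off from autograd
print(y is x)             # False -- it is also a new object, so the optimizer still holds the old tensor
(y * 2).sum().backward()  # RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn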

I would cite @albanD here, since this is still relevant for PyTorch 0.4 and later.

Tried it out:

self.adv_noise = self.adv_noise.detach().clone().clamp(-eps, eps)

But I still get the same error as before:

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Also the same error if I change the order:

self.adv_noise = self.adv_noise.clamp(-eps,eps).detach().clone()

Hi,

You should wrap the code that should not track gradients in with torch.no_grad():. That will solve the issue for clamp_.

Yup, this did the trick, thanks:

with torch.no_grad():
  self.adv_noise.clamp_(-self.eps, self.eps)
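
For completeness, the end of the train loop from my first post now looks roughly like this (same self.model, self.criterion, self.dataloader, and self.adv_noise as above):

def train(self):
  self.model.requires_grad_(False)
  for data in self.dataloader:
    self.optimizer.zero_grad()
    image, label = data
    model_input = image.cuda() + self.adv_noise
    pred = self.model(model_input)
    loss = self.criterion(pred, label.cuda())
    loss.backward()
    self.optimizer.step()
    # project the noise back into [-eps, eps]; no_grad keeps the clamp out of the autograd graph
    with torch.no_grad():
      self.adv_noise.clamp_(-self.eps, self.eps)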