Torchvision model for adversarial attacks

Hi,
I would like to use torchvision pretrained models in adversarial attacks.
These models require some preprocessing (e.g. switching to BGR, subtracting the mean, dividing by the std, etc.).
The essence of adversarial attacks is that we add the gradient to the input. The modified input can then either be fed back into the model, causing it to fail, OR shown to a human user, who won't notice the difference.
The problem is that the preprocessing applied to the input requires matching inverse postprocessing, which would change the added values (i.e. the input fed to the model and the visualized image won't be identical).
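For example, a single gradient step currently looks roughly like this (a sketch; model, loss_fn, target and eps are just placeholders), and the perturbed tensor lives in the normalized space rather than in displayable RGB:

# Rough sketch (model, loss_fn, target, eps are placeholders).
# x_norm is the *preprocessed* input, so x_adv is in normalized space
# and must be post-processed again before it can be visualized.
x_norm.requires_grad_(True)
loss = loss_fn(model(x_norm), target)
loss.backward()
x_adv = x_norm + eps * x_norm.grad.sign()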

Is there any way to feed the model the original input instead of the transformed one (so the grad would be in the original RGB format)?
I want to be able to compute and add the grad in the right format on the GPU.

Please advise
Thanks

You can use any of the existing torchvision models to generate adversarial attacks using gradient-step or optimizer-based techniques.

I put up an example a little while back: https://github.com/rwightman/pytorch-nips2017-attack-example


The data is normalized there (in the Dataset).
Also, you are using the Inception model there, not VGG/ResNet, which require a different transformation.

How can I avoid it/modify the model so that I can feed it the RGB image?

All performant pretrained models that I'm aware of require data to be normalized in some way; Inception uses a different normalization than ResNet/VGG, but they both require it.

If you need the normalization to be part of the model, so you can feed it 0..1 float RGB images and backprop through it, etc., you can implement the normalization as a Module and wrap your model with it:

import torch
import torch.nn as nn


class NormalizeLe(nn.Module):
    """Normalize to -1..1 in Google Inception style."""

    def forward(self, x):
        return (x - 0.5) * 2.0


class Normalize(nn.Module):
    """Standardize in VGG/ResNet/DenseNet style."""

    def __init__(self):
        super(Normalize, self).__init__()
        # Register as buffers so they move with the model (.cuda(), .to(), etc.)
        self.register_buffer('mean', torch.tensor([0.485, 0.456, 0.406]).view(-1, 1, 1))
        self.register_buffer('std', torch.tensor([0.229, 0.224, 0.225]).view(-1, 1, 1))

    def forward(self, x):
        return (x - self.mean) / self.std


class NormalizedModel(nn.Module):
    """Wrap a model so it accepts raw 0..1 float RGB input directly."""

    def __init__(self, model, normalizer=None):
        super(NormalizedModel, self).__init__()
        self.model = model
        # Build the default inside __init__ so each wrapper gets its own instance
        self.normalizer = normalizer if normalizer is not None else Normalize()

    def forward(self, x):
        x_norm = self.normalizer(x)
        return self.model(x_norm)
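For example, wrapping a torchvision ResNet this way lets you take a gradient step directly on the 0..1 RGB image on the GPU (a sketch; resnet18, the random input, and eps=0.01 are arbitrary choices of mine):

import torch
import torch.nn.functional as F
from torchvision import models

model = NormalizedModel(models.resnet18(pretrained=True)).cuda().eval()

# x is a 0..1 float RGB batch on the GPU; the gradient is taken w.r.t. the
# un-normalized image, so the perturbed result can be visualized directly.
x = torch.rand(1, 3, 224, 224, device='cuda', requires_grad=True)
target = torch.tensor([0], device='cuda')

loss = F.cross_entropy(model(x), target)
loss.backward()

x_adv = (x + 0.01 * x.grad.sign()).clamp(0, 1)  # FGSM step, still valid RGB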