Hello, I’m new to PyTorch and I’ve been trying to implement the Fast Gradient Sign Method (FGSM) to test a model’s robustness to adversarial perturbations. However, when I take the gradient of the loss function with respect to the input, I get a zero tensor, so applying FGSM has no effect no matter how large an epsilon I use.
The model I’m using is this one:
import torch
import torch.nn as nn
import torch.nn.functional as F

class L2Net(nn.Module):
    def __init__(self, hidden, beta, insize=784, outsize=10):
        super().__init__()
        self.fc1 = nn.Linear(insize, hidden, bias=False)
        self.fc2 = nn.Linear(hidden, outsize, bias=False)
        self.beta = beta  # scales the pre-activations fed to tanh

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = torch.tanh(self.beta * self.fc2(x))
        x = F.log_softmax(x, dim=-1)
        return x
I’ve tried MSE, NLL, and cross-entropy as losses.
I am working with the MNIST dataset and using 2,000 units in the hidden layer.
Any help would be appreciated.
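For context, the FGSM step described in the question (gradient of the loss with respect to the input, followed by a signed step of size epsilon) can be sketched roughly as below. This is a minimal sketch, not the poster's actual code: the helper name `fgsm`, the small stand-in model, the choice of NLL loss, and the clamp to [0, 1] are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm(model, x, y, epsilon):
    """One FGSM step: x_adv = x + epsilon * sign(d loss / d x)."""
    x = x.clone().detach().requires_grad_(True)
    # Assumes the model returns log-probabilities (e.g. ends in log_softmax).
    loss = F.nll_loss(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    if grad.abs().max().item() == 0:
        # sign(0) is 0, so an all-zero gradient leaves x unchanged for any epsilon.
        print("warning: input gradient is all zeros")
    # Assumes inputs are scaled to [0, 1].
    return (x + epsilon * grad.sign()).clamp(0, 1).detach()

torch.manual_seed(0)
# Hypothetical stand-in classifier, used only to exercise the helper.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10), nn.LogSoftmax(dim=-1))
x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(model, x, y, epsilon=0.1)
max_change = (x_adv - x).abs().max().item()
print(max_change)
```

Because the perturbation is `epsilon * grad.sign()`, a gradient that is exactly zero everywhere produces no perturbation at all, which matches the symptom described above.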
I’m unsure how to reproduce the issue, as I’m getting valid gradients for the input with a freshly initialized model:
model = L2Net(10, 1)
x = torch.randn(1, 784, requires_grad=True)
out = model(x)
out.mean().backward()  # assumed reduction; any scalar loss would do here
print(x.grad)
> tensor([[-9.4862e-04, 3.0799e-04, 6.3704e-04, -1.2979e-05, -5.3003e-04,
9.0936e-06, -7.5076e-04, -7.7078e-04, -5.5273e-04, 8.0073e-04,
4.7446e-04, 9.0180e-04, -4.5098e-04, 1.5162e-04, 7.3597e-04,
Hello. I only get the problem after training the network. I was able to get valid gradients by removing the tanh. It seems that after applying tanh the outputs were exactly 1 or -1 because of floating-point rounding, so small perturbations of the input had no effect on the numeric value of the loss.
Could that be an explanation for the gradient appearing to be zero?
Yes, that could explain the small (or zero) gradients, as the derivative of tanh, 1 - tanh(x)^2, approaches zero in the “saturation” range where |x| is large.
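The saturation effect can be checked in isolation. The snippet below is a small sketch with made-up pre-activation values (0.5, 5.0, 20.0) standing in for `beta * fc2(x)` after training; it is not taken from the trained model in this thread.

```python
import torch

# Hypothetical pre-activations: one in the linear range, two increasingly saturated.
z = torch.tensor([0.5, 5.0, 20.0], requires_grad=True)
y = torch.tanh(z)
y.sum().backward()

print(y)       # outputs approach 1 as z grows
print(z.grad)  # 1 - tanh(z)^2 shrinks toward 0 as z grows

linear_grad = z.grad[0].item()
saturated_grad = z.grad[2].item()
```

Once the tanh output rounds to 1 (or -1) in floating point, the factor 1 - tanh(z)^2 vanishes numerically, and by the chain rule everything upstream of it, including the input gradient, is multiplied by that factor. Computing the attack gradient on the pre-tanh outputs, or using a smaller `beta`, are possible ways around this.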