To generate adversarial examples quickly, I would like to produce them in batches rather than one at a time, so the work can be parallelized. This means the loss needs one entry per example, i.e. its first dimension must match the batch size. However, the code
```python
import torch
import torch.nn.functional as F
from torch.autograd import Variable

logits = net(x)
net.zero_grad()
# max(1) returns (values, indices); keep the indices, shaped (batch, 1) for scatter_
prediction = logits.data.max(1, keepdim=True)[1]
one_hot = Variable(
    torch.FloatTensor(x.size(0), 1000).zero_().scatter_(1, prediction, 1))
# per-example negative log-likelihood of the predicted class, shape (batch,)
loss = -torch.sum(F.log_softmax(logits, dim=1) * one_hot, 1)
loss.backward()
```
gives the error
```
backward should be called only on a scalar (i.e. 1-element tensor) or with gradient w.r.t. the variable
```
How can I differentiate a loss expressed as a tensor?
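For what it's worth, here is a minimal sketch of the two standard workarounds on a modern PyTorch (no `Variable` needed): either reduce the per-example loss to a scalar with `sum()` before calling `backward()`, or pass an explicit `gradient` argument of ones. Since the examples in a batch are independent, both give the same per-example gradients. The tiny `net` (a single `Linear` layer) and the shapes are placeholders, not the real model:

```python
import torch
import torch.nn.functional as F

net = torch.nn.Linear(4, 10)   # stand-in for the real network
x = torch.randn(3, 4)          # batch of 3 hypothetical inputs

logits = net(x)
# Per-example negative log-likelihood of the predicted class, shape (3,)
loss = -F.log_softmax(logits, dim=1).gather(
    1, logits.argmax(dim=1, keepdim=True)).squeeze(1)

# Option 1: reduce to a scalar first; gradients for each example
# flow back independently, so nothing is mixed across the batch.
loss.sum().backward()

# Option 2 (equivalent): keep the vector loss and supply the gradient
# of ones explicitly.
#   loss.backward(torch.ones_like(loss))
```

Both options backpropagate the whole batch in one pass, which is what batched adversarial-example generation needs.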