Hi,
A few things:
- `Variable` is not needed anymore, you can simply have `images = data.to('cuda:0')`.
- You are missing the `optimizer.zero_grad()` before the backward! You need to manually reset the gradients to 0 when you use PyTorch (see the discussion about this here: Why do we need to set the gradients manually to zero in pytorch?).
- You should not use `.data`. If you want to compute things without tracking history, you can either use `detach()`, as in `_, predicted = torch.max(outputs.detach(), 1)`, or wrap the computations in `with torch.no_grad():` to compute `predicted` and `correct`.
- You’re doing the right thing with `.item()` to accumulate the loss.
- For the evaluation, same thing about `.data` and `Variable`.
- You might be missing a `model.eval()` before the evaluation loop.
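
To put the points above together, here is a rough sketch of what the training and evaluation loops could look like. I don't have your model or data, so the tiny linear `model`, the fake `loader`, the loss, the optimizer settings and the function names `train_one_epoch` / `evaluate` are all placeholders I made up for illustration; swap in your own. It also falls back to the CPU when no GPU is available so that it runs anywhere.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the original model, loss, optimizer and data loader.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10)).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Two fake batches of 8 single-channel 28x28 "images" with integer class labels.
loader = [(torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))) for _ in range(2)]


def train_one_epoch():
    model.train()
    running_loss, correct, total = 0.0, 0, 0
    for data, target in loader:
        # No Variable wrapper: tensors are moved to the device directly.
        images, labels = data.to(device), target.to(device)

        optimizer.zero_grad()  # clear gradients left over from the previous step
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()  # plain Python float, keeps no graph alive
        with torch.no_grad():  # bookkeeping only, no history is tracked
            _, predicted = torch.max(outputs, 1)
            correct += (predicted == labels).sum().item()
        total += labels.size(0)
    return running_loss / len(loader), correct / total


def evaluate():
    model.eval()  # switch layers such as dropout / batch norm to inference mode
    correct, total = 0, 0
    with torch.no_grad():
        for data, target in loader:
            images, labels = data.to(device), target.to(device)
            _, predicted = torch.max(model(images), 1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return correct / total


train_loss, train_acc = train_one_epoch()
eval_acc = evaluate()
print(f"loss={train_loss:.4f} train_acc={train_acc:.2f} eval_acc={eval_acc:.2f}")
```

If you train for several epochs, remember to call `model.train()` again after each evaluation (the sketch does this at the top of `train_one_epoch`), otherwise the model stays in eval mode.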