I'm training torchvision's resnet18 on a GPU on the Omniglot dataset. After training, I save the model with the following:
torch.save(model.state_dict(), 'models/%s/model.pth' % model_name)
Then I try to load the model on the CPU using:
model.load_state_dict(torch.load('model.pth', map_location=config.device))
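For reference, the save/load round trip above can be sketched in isolation as follows. This is only a minimal sketch, not the actual training code: `TinyNet` and the temporary file path are hypothetical stand-ins for the ResNet-18 and `model.pth`, and it assumes that only the `state_dict` is saved. If the checkpoint and the architecture match, the reloaded model should reproduce the original outputs in `eval()` mode.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real network (has BatchNorm, like ResNet)."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.bn = nn.BatchNorm1d(16)
        self.fc2 = nn.Linear(16, 4)

    def forward(self, x):
        return self.fc2(torch.relu(self.bn(self.fc1(x))))


torch.manual_seed(0)
src_device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = TinyNet().to(src_device)
model.train()
model(torch.randn(32, 8, device=src_device))  # one pass to update BatchNorm running stats
model.eval()

x = torch.randn(5, 8)
with torch.no_grad():
    expected = model(x.to(src_device)).cpu()

# Save only the parameters and buffers, as in the post.
path = os.path.join(tempfile.mkdtemp(), "model.pth")
torch.save(model.state_dict(), path)

# Rebuild the same architecture on the CPU and remap any CUDA tensors while loading.
restored = TinyNet()
state = torch.load(path, map_location=torch.device("cpu"))
restored.load_state_dict(state)  # strict=True by default: raises on any key mismatch
restored.eval()

with torch.no_grad():
    actual = restored(x)

outputs_match = torch.allclose(expected, actual, atol=1e-4)
print("outputs match after reload:", outputs_match)
```

If a round trip like this works but the real pipeline does not, the difference is probably in something that is not part of the saved `state_dict`.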
When I validate the loaded model, I get an accuracy of 0.0 on the test dataset, even though the test accuracy during training was around 90%. The same happens with a different net trained on the MNIST dataset: during training the test accuracy was around 98%, but once the net is saved and loaded again the accuracy collapses. There are 10 classes, so even a model outputting random numbers would be correct in roughly 10% of cases, and I am not even getting that.
This is the code for the validation:
from collections import OrderedDict

import torch
from torch.nn import CrossEntropyLoss
from tqdm import tqdm

# AverageMeter, accuracy, CosFace, MNISTNet and test_loader are defined elsewhere in my project.

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


def validate(val_loader, model, metric_fc):
    losses = AverageMeter()
    acc1s = AverageMeter()
    criterion = CrossEntropyLoss()

    # switch to evaluate mode
    model.eval()
    metric_fc.eval()

    with torch.no_grad():
        for i, (_input, target) in tqdm(enumerate(val_loader), total=len(val_loader)):
            _input = _input.to(device)
            target = target.long().to(device)

            feature = model(_input)
            output = metric_fc(feature, target)
            loss = criterion(output, target)

            acc1, = accuracy(output, target, topk=(1,))

            losses.update(loss.item(), _input.size(0))
            acc1s.update(acc1.item(), _input.size(0))

    val_log = OrderedDict([
        ('loss', losses.avg),
        ('acc1', acc1s.avg),
    ])

    # In the training loop these values also go into a pd.Series together with
    # epoch, scheduler.get_lr()[0], train_log['loss'] and train_log['acc1'];
    # those names do not exist in this standalone script, so only val_log is returned.
    return val_log


metric_fc = CosFace(num_features=512, num_classes=10).to(device)
model = MNISTNet(num_features=512).to(device)
model.load_state_dict(torch.load('model.pth', map_location=device), strict=False)

log = validate(test_loader, model, metric_fc)
print(log)
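A check that might help narrow this down: `load_state_dict(..., strict=False)` does not raise when keys are missing or unexpected, but it returns them, so the result can be printed. The sketch below also saves the backbone and the metric head in a single checkpoint, on the assumption (not confirmed by the code above) that the head has trainable weights of its own. `Backbone` and `Head` are hypothetical stand-ins for `MNISTNet` and `CosFace`, and the checkpoint keys `"model"` and `"metric_fc"` are made up for illustration.

```python
import io

import torch
import torch.nn as nn


class Backbone(nn.Module):
    """Hypothetical stand-in for MNISTNet: maps inputs to feature vectors."""

    def __init__(self, num_features=16):
        super().__init__()
        self.fc = nn.Linear(8, num_features)

    def forward(self, x):
        return self.fc(x)


class Head(nn.Module):
    """Hypothetical stand-in for a metric head with its own trainable weight."""

    def __init__(self, num_features=16, num_classes=10):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, num_features))

    def forward(self, features):
        return features @ self.weight.t()


torch.manual_seed(0)
model, metric_fc = Backbone(), Head()

# Save both modules together so the head's weights are not lost.
buffer = io.BytesIO()  # in-memory file; a path such as 'model.pth' works the same way
torch.save({"model": model.state_dict(), "metric_fc": metric_fc.state_dict()}, buffer)
buffer.seek(0)
checkpoint = torch.load(buffer, map_location="cpu")

# 1) Loading a mismatched state dict with strict=False succeeds silently...
wrong = Backbone().load_state_dict(checkpoint["metric_fc"], strict=False)
print("missing keys:", wrong.missing_keys)        # parameters left at their random init
print("unexpected keys:", wrong.unexpected_keys)  # checkpoint entries that were ignored

# 2) ...whereas matching state dicts restore both modules exactly.
new_model, new_head = Backbone(), Head()
new_model.load_state_dict(checkpoint["model"])
new_head.load_state_dict(checkpoint["metric_fc"])

x = torch.randn(4, 8)
with torch.no_grad():
    same = torch.equal(metric_fc(model(x)), new_head(new_model(x)))
print("restored pipeline reproduces the original outputs:", same)
```

Non-empty `missing_keys` would mean that part of the network is still randomly initialised after loading. Likewise, a head that is constructed fresh at validation time would not carry over anything it learned during training.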