Does the performance of loading a model here seem correct?
The initial call to load a ~26 MB model on line 86 is roughly 100 times slower than loading the 26 MB checkpoint on line 87, and 5.45 seconds is much longer than I would expect:
Total time: 5.4545 s
Function: load_model at line 81
Line #      Hits         Time    Per Hit   % Time  Line Contents
==============================================================
    81                                             @profile
    82                                             def load_model(dirname, device):
    82                                                 """ load model from disk """
    83         1         90.0       90.0      0.0      device = torch.device(device)
    84         1         26.0       26.0      0.0      modelfile = os.path.join(dirname, 'model.py')
    85         1          7.0        7.0      0.0      weights = os.path.join(dirname, 'weights_%s.tar' % weights)
    86         1    5379756.0  5379756.0     98.6      model = torch.load(modelfile, map_location=device)
    87         1      72966.0    72966.0      1.3      model.load_state_dict(torch.load(weights, map_location=device))
    88         1       1648.0     1648.0      0.0      model.eval()
    89         1          1.0        1.0      0.0      return model
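For comparison, here is a minimal CPU-only reproduction of the two code paths being timed. It uses a small stand-in model (I can't share the real 26 MB one) and hypothetical file names mirroring the layout in load_model above, so the absolute numbers won't match mine:

```python
import os
import tempfile
import time

import torch
import torch.nn as nn

# Small stand-in for the real ~26 MB model.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

with tempfile.TemporaryDirectory() as dirname:
    modelfile = os.path.join(dirname, "model.py")
    weights = os.path.join(dirname, "weights.tar")
    torch.save(model, modelfile)             # full pickled nn.Module
    torch.save(model.state_dict(), weights)  # tensors only

    t0 = time.perf_counter()
    # weights_only=False is required to unpickle a full nn.Module on newer
    # PyTorch (1.13+/2.6+); the argument does not exist in 1.2/1.3, so drop
    # it there.
    loaded = torch.load(modelfile, map_location="cpu", weights_only=False)
    t1 = time.perf_counter()
    loaded.load_state_dict(torch.load(weights, map_location="cpu"))
    t2 = time.perf_counter()

print(f"torch.load(model):      {t1 - t0:.4f} s")
print(f"torch.load(state_dict): {t2 - t1:.4f} s")
```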
I’m using PyTorch 1.2 with device="cuda", and I have confirmed the same performance with 1.3.1. Adding in calls to torch.cuda.synchronize shows that all of the time is actually spent in torch.load:
    86         1    5395545.0  5395545.0     98.6      model = torch.load(modelfile, map_location=device)
    87         1         76.0       76.0      0.0      torch.cuda.synchronize(device=device)
    88         1      72403.0    72403.0      1.3      model.load_state_dict(torch.load(weights, map_location=device))
    89         1         52.0       52.0      0.0      torch.cuda.synchronize(device=device)
    90         1       1640.0     1640.0      0.0      model.eval()
    91         1         21.0       21.0      0.0      torch.cuda.synchronize(device=device)
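One way I tried to narrow this down further: torch.load also accepts a file-like object, so the disk read and the deserialization can be timed separately. A sketch on CPU with a stand-in checkpoint (not my real file):

```python
import io
import os
import tempfile
import time

import torch
import torch.nn as nn

with tempfile.TemporaryDirectory() as dirname:
    modelfile = os.path.join(dirname, "model.pt")  # stand-in checkpoint
    torch.save(nn.Linear(256, 256).state_dict(), modelfile)

    t0 = time.perf_counter()
    with open(modelfile, "rb") as f:
        buf = io.BytesIO(f.read())               # disk read only
    t_read = time.perf_counter() - t0

    t0 = time.perf_counter()
    state = torch.load(buf, map_location="cpu")  # deserialization only
    t_load = time.perf_counter() - t0

print(f"read: {t_read:.4f} s, unpickle: {t_load:.4f} s")
```

On my real model the split looks the same either way, so the cost doesn't appear to be disk I/O.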
Am I doing something wrong here or is this typical?