How to properly save/load a mixed-precision trained model for CPU-based inference?

Hey there,

I would like to take advantage of mixed precision to efficiently train a model, and then use my CPU for inference.

I’ve trained a model using apex (O2), and followed the instructions to save the checkpoint:

# Save the model, optimizer, and amp (loss scaler) states, per the apex docs
checkpoint = {
    'model': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    'amp': amp.state_dict()
}

torch.save(checkpoint, 'checkpoint.pt')

Now I would like to load the trained model for CPU-based inference…

I’ve defined a function that sums all the tensors in the state_dict; presumably, this sum should be the same for the saved and the loaded model:

def calculate_checksum(model):
    return sum(param.sum().item() for param in model.state_dict().values())

device = torch.device('cpu')
checkpoint = torch.load("checkpoint.pt", map_location=device)
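Concretely, the comparison looks something like this (MyModel is a stand-in for my actual model class):

model = MyModel()
model.load_state_dict(checkpoint['model'])
print(calculate_checksum(model))  # does not match the value computed before saving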

Now, it turns out that the checksum values don’t match unless I initialize amp, which requires loading the optimizer and moving the model back to the GPU:

model.cuda()  # amp.initialize expects the model on the GPU
model, optimizer = apex.amp.initialize(model, optimizer, opt_level='O2')
apex.amp.load_state_dict(checkpoint['amp'])

How do I save the trained model so that I can properly load it onto the CPU for inference? I shouldn’t need to initialize amp or load the optimizer. Thanks!

Summing the model parameters and the parameters stored in the state_dict might yield a different result, since opt_level='O2' uses FP16 parameters inside the model, while the state_dict will contain upcasted FP32 parameters.
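In other words, the saved state_dict can be loaded into a fresh FP32 model on the CPU directly, without initializing amp or restoring the optimizer. A minimal sketch, assuming the checkpoint saved above (MyModel is a placeholder for the trained architecture):

import torch

device = torch.device('cpu')
checkpoint = torch.load('checkpoint.pt', map_location=device)

# The O2 checkpoint's model state_dict already holds FP32 master weights,
# so it loads into a plain FP32 model; no amp or optimizer state is needed.
model = MyModel()
model.load_state_dict(checkpoint['model'])
model.eval()

Comparing checksums of the state_dicts on both sides (rather than of the FP16 model parameters) should then agree.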

That being said, we recommend using the native mixed-precision implementation by installing the nightly binaries or building from master. The documentation can be found here.
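With native amp the parameters stay FP32 throughout training, so saving and CPU loading need no special handling. A rough sketch (the toy model and random data are placeholders; a GPU is required):

import torch
import torch.nn as nn

# Toy model and random data, for illustration only
model = nn.Sequential(nn.Linear(10, 10), nn.ReLU(), nn.Linear(10, 1)).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    data = torch.randn(8, 10, device='cuda')
    target = torch.randn(8, 1, device='cuda')

    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        # the forward pass runs in mixed precision
        loss = criterion(model(data), target)
    scaler.scale(loss).backward()  # scale the loss to avoid FP16 gradient underflow
    scaler.step(optimizer)         # unscales the gradients, then calls optimizer.step()
    scaler.update()

# The parameters never leave FP32, so the state_dict can be saved and
# later loaded with map_location='cpu' without any extra steps.
torch.save(model.state_dict(), 'checkpoint.pt')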