Hello all,
This is a follow-up question to this one.
I tried to quantize one of my models using eager-mode post-training quantization. The quantization itself seemed to complete just fine, as the model stats changed significantly (the model size shrank from 22 MB to 5 MB, and inference became about 3x faster).
However, when trying to save the model with
torch.jit.save(model, save_path)
I get the following error:
/root/anaconda3/envs/shishosama/lib/python3.7/site-packages/torch/quantization/observer.py:121: UserWarning: Please use quant_min and quant_max to specify the range for observers. reduce_range will be deprecated in a future release of PyTorch.
reduce_range will be deprecated in a future release of PyTorch."
Quantized Model: took 310.891 ms [min/max: 310.9/310.9] ms
Size (MB): 5.798671
Traceback (most recent call last):
  File "/mnt/internet/hasanpour/embeder_moder_training/simpnet_quantizer.py", line 242, in <module>
    torch.jit.save(model, save_path)
  File "/root/anaconda3/envs/seyyedhossein/lib/python3.7/site-packages/torch/jit/_serialization.py", line 81, in save
    m.save(f, _extra_files=_extra_files)
  File "/root/anaconda3/envs/seyyedhossein/lib/python3.7/site-packages/torch/nn/modules/module.py", line 779, in __getattr__
    type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'simpnet_imgnet_drpall_q' object has no attribute 'save'
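From the traceback, torch.jit.save simply calls m.save(...) on whatever it is given, and a plain nn.Module has no save method. Below is a minimal sketch of what I understand the call to expect, assuming the model first has to be converted with torch.jit.script. TinyNet is a hypothetical stand-in, not my network, and I have not confirmed that the same approach works for the quantized simpnet:

```python
import io

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy model used only for illustration."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))


eager = TinyNet().eval()

# A plain nn.Module has no `save` attribute, which is what torch.jit.save calls.
print("eager has save:", hasattr(eager, "save"))

# Scripting produces a ScriptModule, which does provide `save`.
scripted = torch.jit.script(eager)
print("scripted has save:", hasattr(scripted, "save"))

# Save to an in-memory buffer here; a file path works the same way.
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
restored = torch.jit.load(buffer)

# Check that the restored module reproduces the eager model's output.
x = torch.randn(1, 3, 8, 8)
same = torch.allclose(eager(x), restored(x))
print("outputs match:", same)
```

torch.jit.trace(model, example_input) would be the alternative to torch.jit.script if the model's forward cannot be scripted.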
Saving only the state dict with torch.save, on the other hand, works:
torch.save(model.state_dict(), save_path)
This is how I'm doing the quantization:
import os

import torch

# ArcDataset, simpnet, Benchmark_Block and lfw_test are defined in my own project code (not shown)


def print_size_of_model(model):
    # Serialize the state dict to a temporary file and report its size on disk
    torch.save(model.state_dict(), "temp.p")
    print('Size (MB):', os.path.getsize("temp.p")/1e6)
    os.remove('temp.p')


def calibrate(model, data_loader):
    # Run calibration batches through the prepared model so the observers see real activations
    model.eval()
    with torch.no_grad():
        for image, target in data_loader:
            model(image)


train_dataset = ArcDataset(1000)
dtloader = torch.utils.data.DataLoader(train_dataset, batch_size=256, shuffle=True, num_workers=4)
dummy_input = torch.randn(size=(1, 3, 112, 112))

checkpoint_path = 'test_checkpoint.tar'
save_path = 'test_checkpoint_q.tar'

model = simpnet(512, scale=1.0, network_idx=0, mode=1, simpnet_name="simpnet5mq")
checkpoint = torch.load(checkpoint_path, map_location=torch.device('cpu'))
model.load_state_dict(checkpoint['state_dict'], strict=True)
print('\n \n', model)
model.eval()

with Benchmark_Block("Default Model: ") as blk:
    for i in range(100):
        _ = model(dummy_input)
print_size_of_model(model)
print('\n\n-------------------------------\n\n')

model.fuse_model()
print(f'quantized model: {model}')
model.eval()

# Specify quantization configuration
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
# print(model.qconfig)
torch.quantization.prepare(model, inplace=True)
calibrate(model, dtloader)

# Convert to quantized model
torch.quantization.convert(model, inplace=True)

with Benchmark_Block("Quantized Model: ") as blk:
    for i in range(100):
        _ = model(dummy_input)
print_size_of_model(model)

model = model.cpu()
# a forward pass inside lfw_test also fails (traceback further down)
lfw_acc, threshold = lfw_test(model)

torch.save(model.state_dict(), save_path)  # saves successfully
torch.jit.save(model, save_path)  # fails with the error message posted above
The torch.save call above works just fine and the model gets saved!
As noted in a code comment, a forward pass through the quantized model (the lfw_test call) also fails, with the same error I get when doing this in graph mode:
Evaluating data/angles.txt...
0%| | 0/6000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/internet/hasanpour/embeder_moder_training/simpnet_quantizer.py", line 241, in <module>
    lfw_acc, threshold = lfw_test(model)
  File "/mnt/internet/hasanpour/embeder_moder_training/lfw_eval.py", line 350, in lfw_test
    evaluate(model)
  File "/mnt/internet/hasanpour/embeder_moder_training/lfw_eval.py", line 111, in evaluate
    output = model(imgs)
  File "/root/anaconda3/envs/seyyedhossein/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/internet/hasanpour/embeder_moder_training/models_new.py", line 769, in forward
    out = self.features(x)
  File "/root/anaconda3/envs/seyyedhossein/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/seyyedhossein/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/root/anaconda3/envs/seyyedhossein/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/seyyedhossein/lib/python3.7/site-packages/torch/nn/intrinsic/quantized/modules/conv_relu.py", line 70, in forward
    input, self._packed_params, self.scale, self.zero_point)
RuntimeError: Could not run 'quantized::conv2d_relu.new' with arguments from the 'QuantizedCUDA' backend. 'quantized::conv2d_relu.new' is only available for these backends: [QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, Tracer, Autocast, Batched, VmapMode].
QuantizedCPU: registered at /pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:858 [kernel]
BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
AutogradOther: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:35 [backend fallback]
AutogradCPU: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:39 [backend fallback]
AutogradCUDA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:43 [backend fallback]
AutogradXLA: fallthrough registered at /pytorch/aten/src/ATen/core/VariableFallbackKernel.cpp:47 [backend fallback]
Tracer: fallthrough registered at /pytorch/torch/csrc/jit/frontend/tracer.cpp:967 [backend fallback]
Autocast: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:254 [backend fallback]
Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:511 [backend fallback]
VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
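The 'QuantizedCUDA' in the message suggests that the input tensor reaching the quantized conv lives on the GPU, while the kernel is registered only for QuantizedCPU. Since the script already calls model.cpu(), one guess is that evaluate in lfw_eval.py moves imgs to CUDA before calling the model; that code is not shown here, so this is an assumption. A rough sketch of a wrapper that forces the inputs onto the CPU (run_on_cpu is a hypothetical helper, demonstrated on a toy float model rather than the quantized network):

```python
import torch
import torch.nn as nn


def run_on_cpu(model: nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """Run `model` on a CPU copy of `batch` without tracking gradients."""
    model = model.cpu().eval()
    with torch.no_grad():
        return model(batch.to("cpu"))


# Toy stand-in for the real network (hypothetical, for illustration only).
toy = nn.Sequential(nn.Conv2d(3, 4, kernel_size=3), nn.ReLU())

# Simulate a batch that may have been placed on the GPU by the evaluation code.
device = "cuda" if torch.cuda.is_available() else "cpu"
imgs = torch.randn(2, 3, 8, 8, device=device)

out = run_on_cpu(toy, imgs)
print(out.device, tuple(out.shape))
```

If this reading is right, the equivalent change in the evaluation code would be to drop or replace whatever moves imgs to the GPU before output = model(imgs).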
What am I missing here?
Any help is greatly appreciated.