I have a pretrained quantized model that I trained on Colab. I moved the files to my local system to run ONNX Runtime inference. However, when I load the model with
quantized_model = torch.load('quantizedmodel.pt')
the kernel dies. Non-quantized models seem to load just fine.
My torch version is 1.11.0. One thing I've done differently: on Colab I was mapping the model to CUDA, whereas locally I was mapping it to the CPU with map_location = "cpu:0". I tried switching back to the CUDA device with cuda:0 instead, to no avail.
Running nvidia-smi gives me:
Which seems to check out. Looking for some help with debugging this.
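For anyone hitting the same kind of silent kernel death, one way to get more information without gdb is to run the load in a separate Python process with faulthandler enabled, so the notebook kernel survives and the exit code and any traceback are visible. This is only a rough sketch: run_isolated is a hypothetical helper name, not part of torch, and the file name is the one from the question.

```python
import subprocess
import sys


def run_isolated(code, timeout=300.0):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a negative return code means
    the child was killed by that signal number (e.g. -11 for SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    # Assumption: the checkpoint sits in the current working directory.
    load_snippet = (
        "import torch\n"
        "torch.load('quantizedmodel.pt', map_location='cpu')\n"
        "print('loaded OK')\n"
    )
    returncode, stderr = run_isolated(load_snippet)
    print("exit code:", returncode)
    print(stderr)
```

If the child exits with a negative code, or faulthandler prints a "Fatal Python error" dump, the crash is happening in native code, which points more toward the installed binaries than toward the saved model file.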
@ptrblck I'm not really familiar with debugging using gdb, so I pretty much got stuck there. But the solution turned out to be pretty simple. I'll tag @Zafar here as well so you both know: the problem seemed to be with my conda environment itself rather than anything in torch. There seemed to be issues with metadata generation due to inconsistent numpy packages (this is just a guess, but after hours of debugging my packages that was the only outlier I found). After starting with a clean environment, the quantized model loaded just fine. Thanks!
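In case it helps someone check for the same thing before rebuilding their environment, here is a rough sketch of how one might look for a package that is installed under more than one version (for example, one numpy from conda and another from pip). It uses only the standard library; find_conflicting_distributions is a hypothetical helper name, and this is not necessarily the exact inconsistency I ran into.

```python
from collections import defaultdict
from importlib import metadata


def find_conflicting_distributions():
    """Return {package name: sorted versions} for packages whose
    installed metadata lists more than one distinct version."""
    seen = defaultdict(set)
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            seen[name.lower().replace("_", "-")].add(dist.version)
    return {
        name: sorted(versions)
        for name, versions in seen.items()
        if len(versions) > 1
    }


if __name__ == "__main__":
    conflicts = find_conflicting_distributions()
    for name, versions in sorted(conflicts.items()):
        print(name, "is installed as versions:", ", ".join(versions))
    if not conflicts:
        print("no package is listed under more than one version")
    try:
        import numpy
        # Shows which numpy actually gets imported, and from where.
        print("numpy", numpy.__version__, "from", numpy.__file__)
    except ImportError:
        print("numpy is not importable in this environment")
```

If numpy shows up with two versions, or the imported numpy lives somewhere unexpected, a fresh environment is probably the quickest fix.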