Can someone help me understand how to use a .pt file?

I am really new to this field and have a few questions. I am trying to quantize a model using the MIT HAN Lab AWQ quantization method. I have successfully pulled the model from Hugging Face and applied quantization inside a Docker container. I now have a file named “Llama-2-7b-hf-w4-g128-awq-v2.pt”. I desperately need help with how to use this model from here, as I need to benchmark this model's performance. Any help would be greatly appreciated. I do not know where to start, so if anyone can help guide me (provide links or resources) I would be eternally grateful.

Thank you in advance!!

I assume your script has created this file? If so, check which object was passed to the torch.save method. I would guess it’s either the model.state_dict(), which will contain all trained parameters and buffers, or a custom dict containing the state_dict as well as other objects and data, e.g. the optimizer.state_dict() etc.
The file itself is just an archive and can be loaded via torch.load. Load it in a new script and check its content.
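
As a rough sketch of that inspection step, something like the following could be used. The helper name `summarize_checkpoint` and the tiny stand-in checkpoint are made up for illustration so the snippet runs on its own; in practice you would point it at your own .pt file. It assumes the file holds a dict (a state_dict or a custom dict), which is only a guess until you look.

```python
import os
import tempfile

import torch


def summarize_checkpoint(path):
    """Load a checkpoint on the CPU and describe its top-level entries.

    Hypothetical helper for illustration, not part of PyTorch.
    """
    obj = torch.load(path, map_location="cpu")
    if not isinstance(obj, dict):
        # Something other than a dict was saved (e.g. a whole module).
        return {"<root>": type(obj).__name__}
    summary = {}
    for key, value in obj.items():
        if torch.is_tensor(value):
            # Tensors: report shape and dtype.
            summary[key] = (tuple(value.shape), str(value.dtype))
        else:
            # Nested dicts, lists, etc.: report only the type.
            summary[key] = type(value).__name__
    return summary


# Stand-in checkpoint so the sketch is self-contained; replace demo_path
# with the path to your own file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.pt")
torch.save(torch.nn.Linear(4, 2).state_dict(), demo_path)

summary = summarize_checkpoint(demo_path)
for name, info in summary.items():
    print(name, info)
```

If the keys look like layer names mapped to tensors, the file is most likely a plain state_dict; if you see entries such as a nested dict of weights next to other objects, it is a custom dict.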

Just to add to @ptrblck’s comment: if the .pt file holds a state_dict, it doesn’t contain the model itself, only the weights (as @ptrblck said). So, if you don’t have the original source code for the model, you’ll need to re-code the model and then load the weights into it.
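
A minimal sketch of the "re-create the model, then load the weights" pattern, assuming the file is a plain state_dict. `TinyNet` and the temporary checkpoint are hypothetical stand-ins so the snippet runs on its own. For an AWQ-quantized Llama checkpoint, the model definition (including any quantized layers) would presumably have to come from the quantization code you used, which is not shown here.

```python
import os
import tempfile

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the original model definition."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.fc2 = nn.Linear(16, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


# Stand-in for the file your script wrote with torch.save(model.state_dict(), ...).
ckpt_path = os.path.join(tempfile.mkdtemp(), "tiny.pt")
torch.save(TinyNet().state_dict(), ckpt_path)

# In a fresh script: rebuild the architecture first, then load the weights.
model = TinyNet()
state_dict = torch.load(ckpt_path, map_location="cpu")
# strict=True is the default, so mismatched keys raise an error.
result = model.load_state_dict(state_dict)
model.eval()  # switch to inference mode before benchmarking

with torch.no_grad():
    out = model(torch.randn(1, 8))
print(out.shape)
```

If `load_state_dict` raises an error about missing or unexpected keys, the architecture you rebuilt does not match the one the weights were saved from.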