Loading PyTorch model from TF checkpoint

I am trying to load a pretrained model from the HuggingFace repository (this model), but when I attempt to instantiate the model I get an error referring to loading a PyTorch model from a TensorFlow checkpoint.

There are similar posts (e.g. [1], [2]) from others who have hit this problem in the past, but I can't for the life of me see how they were resolved. Some attempt to match the sentence-transformers and torch package versions to those used when the models were trained, but I have no idea how to find which versions to down/upgrade to. Others simply upgrade their torch package to 1.8.0, which I'm using anyway.


Please note that I am explicitly trying to load the model from its repository files rather than use the recommended way of loading these models, which is to simply pass the model name, e.g. model = CrossEncoder('ms-marco-TinyBERT-L-6') — this triggers an automatic HTTP download of the model, I believe.

Here is my workflow:

Clone the model repository and get the required packages:

$ git clone https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-6
$ conda install -c conda-forge sentence-transformers
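One thing that may be worth checking after the clone: huggingface.co stores large files such as pytorch_model.bin via Git LFS, so a clone made without git-lfs installed leaves a small text pointer file in place of the real weights. A minimal sketch to inspect the file header (the helper names here are mine, purely illustrative):

```python
# Sketch: detect whether a cloned weight file is actually a Git LFS
# pointer file rather than the binary weights. Helper names are
# illustrative, not part of any library.

def looks_like_lfs_pointer(header: bytes) -> bool:
    """Git LFS pointer files are small text files whose first line is
    'version https://git-lfs.github.com/spec/v1'."""
    return header.startswith(b"version https://git-lfs")

def weights_are_lfs_pointer(path: str) -> bool:
    """Read only the first few bytes of the file and check the header."""
    with open(path, "rb") as f:
        return looks_like_lfs_pointer(f.read(64))

# Usage (on the cloned repo):
#   weights_are_lfs_pointer("ms-marco-TinyBERT-L-6/pytorch_model.bin")
# If this returns True, install git-lfs (`git lfs install`) and re-clone.
```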

Load the model in a python script:

### Import packages
from sentence_transformers.cross_encoder import CrossEncoder

### Setup paths
model_path = 'ms-marco-TinyBERT-L-6'

### Instantiate model
model = CrossEncoder(model_path)

The resulting error:

Traceback (most recent call last):
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1205, in from_pretrained
    state_dict = torch.load(resolved_archive_file, map_location="cpu")
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/torch/serialization.py", line 762, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 8, in <module>
    model = CrossEncoder(model_path)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py", line 47, in __init__
    self.model = AutoModelForSequenceClassification.from_pretrained(model_name, config=self.config)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 381, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1208, in from_pretrained
    f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "
OSError: Unable to load weights from pytorch checkpoint file for 'ms-marco-TinyBERT-L-6' at 'ms-marco-TinyBERT-L-6/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
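For what it's worth, the inner exception hints at what is in the file: torch.load falls back to Python's pickle module, and a stream whose first byte is the ASCII character 'v' produces exactly this "invalid load key" error. Notably, a Git LFS pointer file begins with "version https://git-lfs...", which would explain the 'v'. A small stand-alone reproduction using only the pickle module (assuming that is indeed what the cloned pytorch_model.bin contains):

```python
import pickle

# Typical first line of a Git LFS pointer file left behind when
# a Hugging Face repo is cloned without git-lfs installed.
pointer_text = b"version https://git-lfs.github.com/spec/v1\n"

try:
    # pickle treats the first byte as an opcode; 'v' is not a valid one.
    pickle.loads(pointer_text)
except pickle.UnpicklingError as exc:
    print(exc)  # invalid load key, 'v'.
```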

Any help on this would be greatly appreciated, thank you!

Extra info that may be useful


This sounds as if you are digging into the internal loading machinery used by HuggingFace, so I think you would get a better answer on their discussion board.
