Loading PyTorch model from TF checkpoint

I am trying to load a pretrained model from the HuggingFace repository (this model), but when I attempt to instantiate the model I get an error referring to loading a PyTorch model from a TensorFlow checkpoint.

There are similar posts (e.g. [1], [2]) from others who have run into this problem before, but I can't for the life of me see how they were resolved. Some try to match their sentence-transformers and torch versions to the ones the model was trained with, but I have no idea how to find out which versions those are so that I can down/upgrade my packages (the closest I've got is poking at the model's own config, sketched below). Others simply upgrade torch to 1.8.0, which I'm on anyway.
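
The only place I've thought to look for the training-time versions is the config that ships with the model itself. Once the repository is cloned (step 1 of the workflow below), something like this would print whatever version information the config records, if it records any at all (the 'transformers_version' key is a guess on my part):

import json

### Check whether the model's config records the library version it was saved with
### (the 'transformers_version' key is an assumption; not every config includes it)
with open('ms-marco-TinyBERT-L-6/config.json') as f:
    config = json.load(f)
print(config.get('transformers_version', 'not recorded'))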

Edit:

Please note that I am explicitly trying to load the model from its repository files, rather than the recommended way of loading these models, which is to simply pass the model name, e.g. model = CrossEncoder('ms-marco-TinyBERT-L-6'); I believe that triggers an automatic HTTP download of the model.
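
For context, here is roughly how the two approaches differ (a sketch; I'm assuming the hosted name needs the cross-encoder/ prefix for the automatic download, but I haven't verified that):

from sentence_transformers.cross_encoder import CrossEncoder

### Recommended way (avoided here): download by name from the model hub
model = CrossEncoder('cross-encoder/ms-marco-TinyBERT-L-6')

### What I want instead: load from the locally cloned repository directory
model = CrossEncoder('ms-marco-TinyBERT-L-6')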


Here is my workflow:

Clone the model repository and get the required packages:

$git clone https://huggingface.co/cross-encoder/ms-marco-TinyBERT-L-6
$conda install -c conda-forge sentence-transformers
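
I'm not sure whether it matters, but since the failure below points at ms-marco-TinyBERT-L-6/pytorch_model.bin, this is the sanity check I would run to see what the clone actually pulled down (a minimal sketch that just lists the cloned files and their sizes):

import os

model_path = 'ms-marco-TinyBERT-L-6'

### List the cloned files and their sizes to confirm the weights downloaded in full
for name in sorted(os.listdir(model_path)):
    size = os.path.getsize(os.path.join(model_path, name))
    print(f'{name}: {size} bytes')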

Load the model in a python script:

### Import packages
from sentence_transformers.cross_encoder import CrossEncoder

### Setup paths
model_path = 'ms-marco-TinyBERT-L-6'

### Instantiate model
model = CrossEncoder(model_path)
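
For completeness, once the model does instantiate, scoring a query-passage pair with it would look roughly like this (the pair is made up, and the script never gets past the constructor because of the error below):

### Score a (query, passage) pair with the loaded cross-encoder
scores = model.predict([('how many people live in berlin?',
                         'Berlin has a population of roughly 3.6 million.')])
print(scores)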

The resulting error:

Traceback (most recent call last):
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1205, in from_pretrained
    state_dict = torch.load(resolved_archive_file, map_location="cpu")
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/torch/serialization.py", line 762, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 8, in <module>
    model = CrossEncoder(model_path)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py", line 47, in __init__
    self.model = AutoModelForSequenceClassification.from_pretrained(model_name, config=self.config)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 381, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/home/userx/anaconda3/envs/NLP/lib/python3.6/site-packages/transformers/modeling_utils.py", line 1208, in from_pretrained
    f"Unable to load weights from pytorch checkpoint file for '{pretrained_model_name_or_path}' "
OSError: Unable to load weights from pytorch checkpoint file for 'ms-marco-TinyBERT-L-6' at 'ms-marco-TinyBERT-L-6/pytorch_model.bin' If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
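
For what it's worth, the from_tf=True flag mentioned at the end of the error belongs to transformers' from_pretrained, and from the traceback CrossEncoder does not pass extra keyword arguments through in this version, so trying it would mean going through transformers directly, roughly like this (an untested sketch; I don't know whether it even applies, since the repository ships a pytorch_model.bin rather than a TF checkpoint):

from transformers import AutoModelForSequenceClassification

### What the error message suggests, attempted via transformers directly
### (untested; the cloned repo appears to contain a PyTorch .bin, not a TF checkpoint)
model = AutoModelForSequenceClassification.from_pretrained('ms-marco-TinyBERT-L-6', from_tf=True)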

Any help on this would be greatly appreciated, thank you!

Extra info that may be useful

sentence-transformers=0.4.1
pytorch=1.8.0

This sounds as if you are digging into the internal loading used by HuggingFace, so I think you would get a better answer on their discussion board.
