Fairseq RoBERTa Pre-Trained Model Loading Error: "TypeError: expected str, bytes or os.PathLike object, not NoneType"

Issue:

A model trained with Fairseq's RoBERTa pre-training example (using the dataset from that tutorial) saves checkpoints in a way that cannot be loaded programmatically, either with the model-loading snippet at the bottom of that page or with the following script:


import sys

import torch  # needed for the isinstance check below
from fairseq.models.roberta import RobertaModel

ckpt_path = sys.argv[1]  # path supplied on the command line

roberta = RobertaModel.from_pretrained("checkpoints", "checkpoint_best.pt", ckpt_path)

assert isinstance(roberta.model, torch.nn.Module)

Here the path to the saved checkpoint (written during training with the commands from the example; judging by the layout shown further below, the file is in PyTorch's zip-based serialization format rather than protobuf) is passed as a command-line argument.
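For reference, in fairseq 0.10.2 the hub API signature is from_pretrained(model_name_or_path, checkpoint_file="model.pt", data_name_or_path=".", **kwargs), so the third positional argument in the script above is interpreted as the binarized data directory, not a checkpoint path. A call matching that signature might look like the following sketch (the data-bin path is a placeholder for the preprocessed dataset from the example):

import torch
from fairseq.models.roberta import RobertaModel

roberta = RobertaModel.from_pretrained(
    "checkpoints",                              # directory containing the checkpoint
    checkpoint_file="checkpoint_best.pt",       # checkpoint file name within that directory
    data_name_or_path="data-bin/wikitext-103",  # placeholder: binarized data directory
)

assert isinstance(roberta.model, torch.nn.Module)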

Error Stack:

Traceback (most recent call last):
  File "test_ckpt.py", line 6, in <module>
    roberta = RobertaModel.from_pretrained("checkpoints", "checkpoint_best.pt", ckpt_path)
  File "fairseq/fairseq/models/roberta/model.py", line 284, in from_pretrained
    **kwargs,
  File "/fairseq/fairseq/hub_utils.py", line 66, in from_pretrained
    path = os.path.join(model_path, file)
  File "/opt/anaconda3/envs/robertapretrain0/lib/python3.7/posixpath.py", line 80, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

The issue seems to be related to how the checkpoints are saved during training.
The structure of the checkpoint_best.pt file is as follows:

checkpoint_best.pt
├── archive
│   ├── data
│   │   ├── 0
│   │   ├── 1
│   │   ├── 2
│   │   ├── ...
│   │   └── 243
│   ├── data.pkl
│   └── version
└── out
2 directories, 247 files

I’d like to understand whether the issue lies in the way the model is being loaded or in the way the checkpoint is being saved.
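One way to narrow this down (a hedged diagnostic sketch; the path is a placeholder): torch.save normally writes a checkpoint as a single zip file whose internal layout matches the archive/data/.../data.pkl tree above, so if checkpoint_best.pt exists on disk as a directory rather than a file, both torch.load and fairseq's loader would be expected to fail:

import os
import zipfile

import torch

ckpt = "checkpoints/checkpoint_best.pt"  # placeholder path

# A torch.save checkpoint should be a single zip *file*; the archive/data/...
# layout is internal and only appears as a directory tree if it was unzipped.
print("is a directory:", os.path.isdir(ckpt))
print("is a zip file:", os.path.isfile(ckpt) and zipfile.is_zipfile(ckpt))

if os.path.isfile(ckpt):
    state = torch.load(ckpt, map_location="cpu")
    # fairseq checkpoints are dicts; expect keys such as 'args'/'cfg' and 'model'
    print("top-level keys:", list(state.keys()))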

Environment Details:

  • fairseq Version: 0.10.2
  • PyTorch Version: 1.9.0+cu102
  • OS:
    NAME="Ubuntu"
    VERSION="20.04.2 LTS (Focal Fossa)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 20.04.2 LTS"
    VERSION_ID="20.04"
    VERSION_CODENAME=focal
    UBUNTU_CODENAME=focal
  • Python version: Python 3.8.8