How do you run transformers example scripts (e.g. run_mlm_wwm.py) with a saved model?

jungminc88 · August 23, 2021, 4:58pm

I have trained a BertForSequenceClassification model, starting from one of the Hugging Face pretrained models (cl-tohoku/bert-base-japanese-whole-word-masking),
and now I want to train it on the MLM task using run_mlm_wwm.py.

I have saved the model after the classification task:
torch.save(model.state_dict(), my_model_name)

Now when I try to run

python run_mlm_wwm.py \
    --model_name_or_path my_model_name \
    --train_file my_data.txt \
    --do_train \
    --output_dir /output_dir

it returns this error message

Traceback (most recent call last):
  File "run_mlm_wwm.py", line 408, in <module>
    main()
  File "run_mlm_wwm.py", line 274, in main
    config = AutoConfig.from_pretrained(model_args.model_name_or_path, **config_kwargs)
  File "/home/cl/jungmin-c/.pyenv/versions/anaconda3-5.1.0/envs/jp/lib/python3.7/site-packages/transformers/models/auto/configuration_auto.py", line 446, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/cl/jungmin-c/.pyenv/versions/anaconda3-5.1.0/envs/jp/lib/python3.7/site-packages/transformers/configuration_utils.py", line 495, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "/home/cl/jungmin-c/.pyenv/versions/anaconda3-5.1.0/envs/jp/lib/python3.7/site-packages/transformers/configuration_utils.py", line 578, in _dict_from_json_file
    text = reader.read()
  File "/home/cl/jungmin-c/.pyenv/versions/anaconda3-5.1.0/envs/jp/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

After some searching, I gathered that I have to somehow convert my_model_name to a form that is acceptable to run_mlm_wwm.py but I could not figure out how.

Can anyone suggest how you do this? I would greatly appreciate your help. Thank you!

ptrblck · August 24, 2021, 3:56am

Based on this post, it seems you are trying to decode an invalid character. What seems a bit weird is that a JSON file is apparently expected (_dict_from_json_file is used), so my guess is that the decoder is being given a file in the wrong format.
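
To illustrate that guess, here is a minimal sketch using only the standard library. It assumes the file passed as --model_name_or_path is the binary, pickle-based file written by torch.save rather than a JSON config. The toy payload and the temporary path are hypothetical stand-ins (a plain pickle instead of a real state_dict), so this only reproduces the kind of error, not the exact byte position from the traceback.

```python
import json
import pickle
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the file written by torch.save(...):
    # binary pickle data, not UTF-8 encoded JSON text.
    weights_path = Path(tmp) / "my_model_name"
    weights_path.write_bytes(pickle.dumps({"classifier.weight": [0.1, 0.2]}))

    # Binary pickles begin with the protocol marker byte 0x80.
    first_byte = weights_path.read_bytes()[0]

    # Reading the binary file as UTF-8 text, as a JSON config loader would,
    # fails before any JSON parsing can happen.
    try:
        json.loads(weights_path.read_text(encoding="utf-8"))
        error = None
    except UnicodeDecodeError as exc:
        error = exc

print(hex(first_byte))
print(error)
```

If this is indeed the cause, one likely way forward is to point --model_name_or_path at a directory written by the model's save_pretrained method, which stores a config.json next to the weights, instead of at the state_dict file. The tokenizer probably needs to be saved to the same directory with its own save_pretrained as well.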