Using BERT on CPU: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

I am trying to use my fine-tuned BERT model for visualization on my local machine. The model was saved to a file called trained_model.pt. When I load it and run the code below, I get the error shown in the traceback underneath:

import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel, BertForMaskedLM

# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenized input
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = tokenizer.tokenize(text)

# Mask a token that we will try to predict back with `BertForMaskedLM`
masked_index = 8
tokenized_text[masked_index] = '[MASK]'
assert tokenized_text == ['[CLS]', 'who', 'was', 'jim', 'henson', '?', '[SEP]', 'jim', '[MASK]', 'was', 'a', 'puppet', '##eer', '[SEP]']

# Convert token to vocabulary indices
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)
# Define sentence A and B indices associated to 1st and 2nd sentences (see paper)
segments_ids = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Convert inputs to PyTorch tensors
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])

# Load the fine-tuned model on CPU
model = torch.load('trained_model.pt', map_location=torch.device('cpu'))
model.eval()

# Predict hidden states features for each layer
with torch.no_grad():
    encoded_layers, _ = model(tokens_tensor, segments_tensors, output_all_encoded_layers=False)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-32-65859554cc9e> in <module>
      5 # Predict hidden states features for each layer
      6 with torch.no_grad():
----> 7     encoded_layers, _ = model(tokens_tensor, segments_tensors, output_all_encoded_layers=False)

/opt/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

/opt/anaconda3/envs/bert/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py in forward(self, *inputs, **kwargs)
    151         for t in chain(self.module.parameters(), self.module.buffers()):
    152             if t.device != self.src_device_obj:
--> 153                 raise RuntimeError("module must have its parameters and buffers "
    154                                    "on device {} (device_ids[0]) but found one of "
    155                                    "them on device: {}".format(self.src_device_obj, t.device))

RuntimeError: module must have its parameters and buffers on device cuda:0 (device_ids[0]) but found one of them on device: cpu

Can anyone explain how to resolve this? By the way, I run into problems with DataParallel all the time. I am not sure whether it is the cause of this error, but the feature seems really buggy to me.
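
For what it's worth, my best guess is that trained_model.pt still contains the model wrapped in nn.DataParallel from GPU training, so DataParallel.forward keeps expecting everything on cuda:0 even after map_location moves the weights to CPU. The snippet below is only a sketch of the workaround I am considering (unwrapping .module before running on CPU); I have not confirmed that this is the right fix:

import torch
import torch.nn as nn

# Load the saved object on CPU; if it is an nn.DataParallel wrapper,
# pull out the underlying module before running inference.
model = torch.load('trained_model.pt', map_location=torch.device('cpu'))
if isinstance(model, nn.DataParallel):
    model = model.module
model = model.to('cpu')
model.eval()

with torch.no_grad():
    encoded_layers, _ = model(tokens_tensor, segments_tensors, output_all_encoded_layers=False)

Is this the intended way to reuse a model that was trained with DataParallel, or should I have saved the state_dict instead?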