How can I select specific outputs and calculate the loss for only those outputs in PyTorch?

I am using a multilingual BERT model from Hugging Face and doing some additional pretraining with masked language modeling (MLM). My setup is a bit different from standard MLM, though: I want to take the mean of the model outputs over a sequence of [MASK] tokens and calculate the loss on that mean.

For example,

from transformers import AutoModel
model = AutoModel.from_pretrained('bert-base-multilingual-cased')
input = "[MASK][MASK][MASK] is a great guy."
label = "[PERSON] is a great guy."

I want to train the model using the loss between the mean of the outputs at the three [MASK] positions and the single [PERSON] token.
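To make the goal concrete, here is a rough sketch of the computation I have in mind, using random tensors in place of the real model so that it runs on its own. Everything in it is an assumption for illustration: `hidden_states` stands in for the model's last hidden state, `mask_positions` marks where the [MASK] tokens are, `person_target` stands in for whatever vector represents [PERSON], `masked_mean` is a hypothetical helper, and MSE is just one possible choice of loss.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the model's last hidden state: (batch, seq_len, hidden).
batch, seq_len, hidden = 2, 8, 16
hidden_states = torch.randn(batch, seq_len, hidden, requires_grad=True)

# 1.0 where the input token is [MASK], 0.0 elsewhere.
# With a real tokenizer this could come from
# (input_ids == tokenizer.mask_token_id).float(); positions 1..3 are made up.
mask_positions = torch.zeros(batch, seq_len)
mask_positions[:, 1:4] = 1.0

# Stand-in for the target vector of the [PERSON] token: (batch, hidden).
person_target = torch.randn(batch, hidden)


def masked_mean(states, positions):
    """Average the hidden states over the marked positions of each sequence."""
    weights = positions.unsqueeze(-1)            # (batch, seq_len, 1)
    summed = (states * weights).sum(dim=1)       # (batch, hidden)
    counts = weights.sum(dim=1).clamp(min=1.0)   # avoid division by zero
    return summed / counts


mask_mean = masked_mean(hidden_states, mask_positions)

# Assumed loss: mean squared error between the pooled [MASK] vector and the target.
loss = F.mse_loss(mask_mean, person_target)
loss.backward()
```

Because the unmarked positions are multiplied by zero before pooling, only the three [MASK] positions receive gradient from this loss.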

What is the best way to do this?