This is about `BertForMaskedLM`, here: https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/modeling.py#L793 The docstring says:
Outputs:
Outputs the masked language modeling logits of shape [batch_size, sequence_length, vocab_size].
I guess this means that, after applying a softmax, the relative probability between two logit values a and b should be exp(a - b). But I am not sure what the word "logit" means here.
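To make my question concrete, here is a minimal sketch of what I mean (the logit values are made up, and I am assuming logits are just unnormalized log-probabilities that a softmax turns into probabilities):

```python
import math

# Toy logits for two candidate tokens (hypothetical values, not real model output).
logits = [2.0, 0.5]

# Softmax: exponentiate each logit, then normalize so they sum to 1.
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# The probability ratio depends only on the difference of the logits,
# because the shared normalizer cancels:
#   probs[0] / probs[1] == exp(logits[0] - logits[1])
ratio = probs[0] / probs[1]
```

Is that the right way to read the `[batch_size, sequence_length, vocab_size]` output, i.e. one such logit per vocabulary word at each position?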
Thanks