Transformer attention ignores boolean mask

In the reference code for calculating attention shown in the documentation, a boolean mask is effectively ignored:
https://pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html

The relevant lines are:

```python
if attn_mask.dtype == torch.bool:
    attn_mask.masked_fill_(attn_mask.logical_not(), float("-inf"))
else:
    attn_bias += attn_mask
attn_weight = query @ key.transpose(-2, -1) * scale_factor
attn_weight += attn_bias
```

In the boolean branch, `masked_fill_` modifies `attn_mask` in place, but only `attn_bias` (which stays zero in that branch) is ever added to `attn_weight`, so the boolean mask never affects the attention weights.
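
For reference, here is a minimal sketch of what I would expect the boolean branch to do, assuming the intent is to fold the mask into `attn_bias` (the function name `sdpa_reference` and the demo shapes are just mine for illustration):

```python
import math
import torch
import torch.nn.functional as F

def sdpa_reference(query, key, value, attn_mask=None, scale=None):
    # Same math as the documented reference, but the boolean mask is
    # folded into attn_bias so it actually reaches the attention weights.
    L, S = query.size(-2), key.size(-2)
    scale_factor = 1 / math.sqrt(query.size(-1)) if scale is None else scale
    attn_bias = torch.zeros(L, S, dtype=query.dtype)

    if attn_mask is not None:
        if attn_mask.dtype == torch.bool:
            # Fill attn_bias, not attn_mask, with -inf at masked-out positions.
            attn_bias.masked_fill_(attn_mask.logical_not(), float("-inf"))
        else:
            attn_bias += attn_mask

    attn_weight = query @ key.transpose(-2, -1) * scale_factor
    attn_weight += attn_bias
    attn_weight = torch.softmax(attn_weight, dim=-1)
    return attn_weight @ value

if __name__ == "__main__":
    q, k, v = (torch.randn(2, 4, 8) for _ in range(3))
    mask = torch.ones(4, 4, dtype=torch.bool).tril()  # causal-style boolean mask
    out = sdpa_reference(q, k, v, attn_mask=mask)
    # Should agree with the actual implementation up to numerical tolerance.
    print(torch.allclose(out, F.scaled_dot_product_attention(q, k, v, attn_mask=mask), atol=1e-5))
```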