What is the difference between flex_attention.create_mask and flex_attention.create_block_mask?

I want to use flex_attention with a custom mask. I found three mask-creation functions in flex_attention:

  • create_mask
  • create_block_mask
  • create_nested_block_mask

What is the difference between create_mask and create_block_mask? In particular, what does "block" mean here?