I want to use flex_attention with a custom mask. I found three mask-creation functions in flex_attention:
- create_mask
- create_block_mask
- create_nested_block_mask
I wonder what the difference is between create_mask and create_block_mask. In particular, what does "block" mean?