Attention mask post STFT

I have a model that consumes a 1D time series, i.e. a tensor with shape [B,T], and an attention mask, i.e. a boolean tensor with shape [B,T].
The model first performs an STFT on the input time series, yielding a tensor of shape [B,F,N], where F is the number of frequency bins (set by the FFT size) and N is the number of time frames.
I want to calculate the attention mask at the output of the STFT, i.e. another boolean attention tensor with shape [B,F,N].
How do I do this?
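
For concreteness, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the STFT uses a frame length equal to n_fft, a hop of hop_length samples, optional centre padding of n_fft // 2 on each side, and a one-sided output so that F = n_fft // 2 + 1. The function name stft_attention_mask and the rule "a frame is valid if it overlaps at least one valid sample" are illustrative choices, not part of any library.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def stft_attention_mask(mask, n_fft, hop_length, center=True):
    """Map a sample-level mask [B, T] to an STFT-level mask [B, F, N].

    Hypothetical helper. Assumes frame length == n_fft and a one-sided
    spectrum (F = n_fft // 2 + 1).
    """
    mask = np.asarray(mask, dtype=bool)
    if center:
        # Mirror the STFT's centre padding; padded samples count as invalid.
        pad = n_fft // 2
        mask = np.pad(mask, ((0, 0), (pad, pad)), constant_values=False)
    # All length-n_fft windows along time, then keep every hop_length-th one.
    windows = sliding_window_view(mask, n_fft, axis=1)[:, ::hop_length]
    # A frame is marked valid if it overlaps at least one valid sample.
    frame_mask = windows.any(axis=-1)  # [B, N]
    n_freq = n_fft // 2 + 1  # assumption: one-sided STFT
    batch, n_frames = frame_mask.shape
    # Every frequency bin of a frame shares that frame's validity.
    # Note: broadcast_to returns a read-only view; call .copy() to modify it.
    return np.broadcast_to(frame_mask[:, None, :], (batch, n_freq, n_frames))


# Two sequences of length T = 16; the second has only 8 valid samples.
lengths = np.array([16, 8])
mask = np.arange(16)[None, :] < lengths[:, None]
stft_mask = stft_attention_mask(mask, n_fft=4, hop_length=2)
print(stft_mask.shape)  # (2, 3, 9)
```

Replacing any with all would give a stricter mask that keeps only frames made up entirely of valid samples. Which rule is appropriate depends on how the model treats frames that are partly padding.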