Adding mask to one-hot encoded batch text sequence

I have this code from the GRU-with-attention PyTorch tutorial:

    # concatenate the two GRU hidden states along the feature dimension
    hidden_cat = torch.cat((hidden[0], hidden[1]), dim=2)
    # attention weights over the source sequence -> (1, batch, seq_len)
    attn_weights = F.softmax(self.attn(torch.cat((input, hidden_cat), 2)), dim=2)
    # weight the encoder outputs by the attention weights
    attn_applied = torch.bmm(attn_weights.transpose(0, 1), encoder_outputs.transpose(0, 1)).transpose(0, 1)
    # combine the decoder input with the attended context
    attn_output = torch.cat((input, attn_applied), 2)
    attn_output = F.relu(self.attn_combine(attn_output))

attn_weights is 1 x batch_size x sequence_length. So, before the softmax, I need to set the score at every position along the third dimension (sequence length) that corresponds to a pad token to negative infinity. What's a good way to create this mask? The input is 1 x batch_size x number_of_words_in_bag (one-hot encoded).
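
The sketch below is roughly what I'm imagining, assuming the encoder-side token ids (before one-hot encoding) are still available as src_tokens and that PAD_IDX is the pad id (both names are just placeholders): build a boolean mask of shape 1 x batch x seq_len and fill the attention scores with -inf there before the softmax.

    import torch

    PAD_IDX = 0  # placeholder pad id; whatever the vocab actually uses

    def make_pad_mask(src_tokens, pad_idx=PAD_IDX):
        # src_tokens: (seq_len, batch) integer ids of the encoder input
        # (seq_len, batch) -> (batch, seq_len) -> (1, batch, seq_len)
        return (src_tokens == pad_idx).t().unsqueeze(0)

    # inside forward(), instead of softmaxing right away:
    #   scores = self.attn(torch.cat((input, hidden_cat), 2))   # (1, batch, seq_len)
    #   scores = scores.masked_fill(pad_mask, float('-inf'))
    #   attn_weights = F.softmax(scores, dim=2)

    # if only the one-hot encoder input (seq_len, batch, vocab) is around:
    #   pad_mask = enc_onehot[:, :, PAD_IDX].bool().t().unsqueeze(0)

Is deriving the mask from the pre-one-hot token ids like this reasonable, or is there a cleaner way to get it straight from the one-hot batch?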