Adding mask to one-hot encoded batch text sequence

I have this code from the GRU with attention PyTorch tutorial:

    hidden_cat = torch.cat((hidden[0], hidden[1]), dim=2)

    attn_weights = F.softmax(self.attn(torch.cat((..., hidden_cat), 2)), dim=2)

    attn_applied = torch.bmm(attn_weights.transpose(0,1),encoder_outputs.transpose(0,1)).transpose(0,1)

    attn_output = torch.cat((..., attn_applied), 2)

    attn_output = F.relu(self.attn_combine(attn_output))

attn_weights has shape 1 x batch size x sequence length. I need to set the weight at every position along the third dimension (sequence length) that corresponds to a pad character to negative infinity. What's a good way to create this mask? The input is 1 x batch size x number of words in bag.
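One possible way to build such a mask is sketched below. It makes several assumptions that are not in the code above: the input is available as integer token indices with the same 1 x batch size x sequence length layout as the scores, padding uses a single index (the hypothetical `PAD_IDX = 0`), and the negative infinity is written into the scores before the softmax rather than into the normalized weights. The helper name `masked_attention_weights` is made up for illustration. If the input really is one-hot, an `argmax` over the vocabulary dimension would recover the indices first.

```python
import torch
import torch.nn.functional as F

PAD_IDX = 0  # hypothetical padding index; use whatever your vocabulary assigns


def masked_attention_weights(scores, token_ids, pad_idx=PAD_IDX):
    """Softmax over the sequence dimension, ignoring pad positions.

    scores:    float tensor, 1 x batch x seq_len (pre-softmax attention scores)
    token_ids: long tensor,  1 x batch x seq_len (assumed to match the scores)
    """
    # True wherever the token is padding
    pad_mask = token_ids.eq(pad_idx)
    # Pad positions get -inf, so softmax assigns them zero weight
    scores = scores.masked_fill(pad_mask, float("-inf"))
    return F.softmax(scores, dim=2)


# Toy batch: 2 sequences of length 5, zeros are padding
token_ids = torch.tensor([[[5, 7, 2, 0, 0],
                           [3, 9, 4, 8, 0]]])
scores = torch.randn(1, 2, 5)
weights = masked_attention_weights(scores, token_ids)
```

In the snippet above, this would mean calling the helper on the output of `self.attn(...)` in place of the direct `F.softmax` call. One caveat: a sequence made up entirely of padding would produce NaN weights, since every score in that row would be negative infinity.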