Max on dim 1 of [batch_size, num_steps, dim] data with mask causing NaN

I want to get a per-sentence vector from word embeddings by max pooling over the sequence, respecting sequence length via a mask.
The code is below. A similar strategy works well in TensorFlow, and a masked softmax in PyTorch is also fine.
But the code below causes NaN. Why?
class MaxPooling(nn.Module):
    def forward(self, x, x_mask):
        # x: [batch_size, num_steps, dim]; x_mask: 1 at padded steps, 0 at real tokens
        if x_mask is None or x_mask.sum() == 0:
            return torch.max(x, 1)[0]
        # fill padded steps with -inf so they never win the max
        x = x.masked_fill(x_mask.unsqueeze(-1).expand(x.size()).bool(), -float('inf'))
        return torch.max(x, 1)[0]
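For context, here is a minimal runnable sketch of the masked_fill-then-max strategy above. The toy tensors and the convention that x_mask is 1 at padded steps are my assumptions, not from the original post:

```python
import torch

# Toy batch: 2 sequences of 3 steps, dim 2; the 9.0 entries are padding
x = torch.tensor([[[1.0, 4.0], [3.0, 2.0], [9.0, 9.0]],
                  [[5.0, 0.0], [9.0, 9.0], [9.0, 9.0]]])
# Assumed convention: 1 at padded steps, 0 at real tokens
x_mask = torch.tensor([[0, 0, 1],
                       [0, 1, 1]], dtype=torch.bool)

# Padded steps become -inf, so they can never win the max
masked = x.masked_fill(x_mask.unsqueeze(-1).expand(x.size()), -float('inf'))
pooled = torch.max(masked, 1)[0]
print(pooled)  # tensor([[3., 4.], [5., 0.]])
```

Note that if some row of x_mask is all ones (a fully padded sequence), every step in that row becomes -inf and the pooled value is -inf, which readily turns into NaN in later arithmetic; that is one plausible source of the NaN described here.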


I can’t really reproduce your issue.
Could you give a full code sample that I can run that reproduces this, please?

@albanD OK, I will try to make a small code sample, but it might be a bit later.
One workaround I found on the internet is the code below; it does not cause NaN.
lengths = (1 - x_mask).sum(1)
return torch.cat([torch.max(i[:l], dim=0)[0].view(1, -1) for i, l in zip(x, lengths)], dim=0)
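A runnable sketch of this workaround, with toy tensors of my own; x_mask is again assumed to be 1 at padded steps, so (1 - x_mask).sum(1) recovers each true length:

```python
import torch

x = torch.tensor([[[1.0, 4.0], [3.0, 2.0], [9.0, 9.0]],
                  [[5.0, 0.0], [9.0, 9.0], [9.0, 9.0]]])
x_mask = torch.tensor([[0, 0, 1],
                       [0, 1, 1]])

# Number of real (unmasked) steps per sequence
lengths = (1 - x_mask).sum(1)  # tensor([2, 1])

# Slice each sequence to its true length before pooling, so padding
# never enters the max and no -inf fill is needed
pooled = torch.cat([torch.max(i[:l], dim=0)[0].view(1, -1)
                    for i, l in zip(x, lengths)], dim=0)
print(pooled)  # tensor([[3., 4.], [5., 0.]])
```

Because padded steps are sliced away entirely, this version cannot produce -inf even for short sequences, which may be why it avoids the NaN.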

Well, I have updated to the PyTorch 1.0 preview and the NaN problem no longer reproduces, so I think it might have been solved :slight_smile: