I followed the tutorial given here. However, the implementation of Transformer in the PyTorch codebase is significantly different, the latter being closer to the approach proposed by the authors.
Can someone guide me on how to use the PyTorch Transformer for a sequence-to-sequence translation task? I have described the problem in some detail below.
Transformer(src, tgt) parameters:
src: the sequence to the encoder (required), tgt: the sequence to the decoder (required).
EDIT: For example, for an English-language dataset,
src: the dataset has shape
[32, 5, 256] where
32 is the total number of sentences in the dataset,
5 is the number of words in every sentence and
256 is the embedding dimension for each word.
tgt: I don’t know what to provide for this argument to the Transformer.
EDIT: I have a similar dataset for French; say its shape is [32, 7, 256].
Assume that the positional encodings have already been added to the src above.
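For what it's worth, here is a minimal sketch of how I imagine the call would look, using the shapes above. Two assumptions: (1) during training, tgt is the embedded French sequence fed in with teacher forcing, together with a causal mask so the decoder cannot peek at future words; (2) by default nn.Transformer expects inputs shaped (seq_len, batch, d_model), so the batch-first tensors above need a permute. The random tensors below just stand in for the embedded-and-positionally-encoded datasets.

```python
import torch
import torch.nn as nn

# Stand-ins for the embedded datasets from the question (batch, seq, embed)
src = torch.rand(32, 5, 256)  # English: 32 sentences, 5 words, 256-dim embeddings
tgt = torch.rand(32, 7, 256)  # French: 32 sentences, 7 words, 256-dim embeddings

model = nn.Transformer(d_model=256, nhead=8)

# nn.Transformer defaults to (seq_len, batch, d_model), so swap the
# batch and sequence dimensions before the forward pass.
src_t = src.permute(1, 0, 2)  # -> (5, 32, 256)
tgt_t = tgt.permute(1, 0, 2)  # -> (7, 32, 256)

# Causal mask: position i in the decoder may only attend to positions <= i.
tgt_mask = model.generate_square_subsequent_mask(tgt_t.size(0))

out = model(src_t, tgt_t, tgt_mask=tgt_mask)
print(out.shape)  # same shape as tgt_t: (7, 32, 256)
```

The output has one vector per target position, which would then go through a final linear layer over the French vocabulary to get word logits. (Recent PyTorch versions also accept batch_first=True in the nn.Transformer constructor, which would remove the permutes.)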