I have a batch of token sequences that I have padded to the same length. How can I ignore the loss on the pad tokens in my auto-encoder?
The encoder and decoder look like:
self.encoder = nn.Sequential(   # 200-dim input -> 25-dim code
    nn.Linear(200, 100),
    nn.Tanh(),
    nn.Linear(100, 50),
    nn.Tanh(),
    nn.Linear(50, 25),
)
self.decoder = nn.Sequential(   # 25-dim code -> 200-dim reconstruction
    nn.Linear(25, 50),
    nn.Tanh(),
    nn.Linear(50, 100),
    nn.Tanh(),
    nn.Linear(100, 200),        # mirrors the encoder so the output matches the 200-dim input
    nn.Sigmoid(),
)
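
For context, here is the kind of masked loss I have in mind (a minimal sketch with made-up names: masked_mse_loss, pad_mask, and PAD_VALUE are mine, and I am assuming the padded positions in the 200-dim input are filled with a known padding value):

import torch
import torch.nn.functional as F

PAD_VALUE = 0.0  # placeholder: whatever value marks padded positions in my inputs

def masked_mse_loss(recon, target, pad_mask):
    # recon, target: (batch, 200) float tensors
    # pad_mask: (batch, 200) bool tensor, True where the position is padding
    per_elem = F.mse_loss(recon, target, reduction="none")
    per_elem = per_elem.masked_fill(pad_mask, 0.0)  # zero the loss at pad positions
    n_real = (~pad_mask).sum().clamp(min=1)         # count real positions; avoid division by zero
    return per_elem.sum() / n_real                  # average over real positions only

# usage with my model (x: (batch, 200)):
# recon = self.decoder(self.encoder(x))
# loss = masked_mse_loss(recon, x, x == PAD_VALUE)

Is something like this the right approach, or is there a built-in way to do it?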