Hello everyone,
I have a short question regarding RNN and CrossEntropyLoss:
I want to classify every time step of a sequence. For this I want to use many-to-many classification with an RNN. So I forward my data (batch x seq_len x features) through my RNN and keep the output of every time step (batch x seq_len x classes). My target already has the shape (batch x seq_len), with the class index as each entry.
Now I use CrossEntropyLoss to train my net, and this is the point where I’m not sure about my solution.
CrossEntropyLoss wants the output as (N x C) and the target as (N). So I reshape the output to (batch*seq_len x C) and the target to (batch*seq_len).
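To make the layout I am aiming for concrete, here is a small sketch with plain tensors instead of a network. The names `scores` and `labels` and the sizes are only illustrative assumptions, not my real data. The idea is that row `b * seq_len + t` of the flattened tensors should still be time step `t` of sequence `b`:

```python
import torch

batch, seq_len, num_classes = 3, 10, 4  # illustrative sizes (assumption)

# stand-ins for the per-timestep scores of a network and the class indices
scores = torch.rand(batch, seq_len, num_classes)          # (batch x seq_len x C)
labels = torch.randint(0, num_classes, (batch, seq_len))  # (batch x seq_len)

flat_scores = scores.reshape(batch * seq_len, num_classes)  # (N x C)
flat_labels = labels.reshape(batch * seq_len)               # (N)

# pick one (sequence, timestep) pair and look it up in the flattened tensors
b, t = 1, 7
row = b * seq_len + t
rows_match = torch.equal(flat_scores[row], scores[b, t])
labels_match = flat_labels[row].item() == labels[b, t].item()
print(rows_match, labels_match)
```

If both checks hold, output rows and target entries stay paired after flattening, which is what the loss needs.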
My minimal working example looks like this:
import torch
# one batch of 3 sequences, sequence length 10, 5 features per timestep, 3 classes
data = torch.rand((3, 10, 5)) # (batch x seq_len x features)
target = torch.randint(0, 3, (3, 10)) # (batch x seq_len), class index for each timestep
# hidden_size doubles as the number of classes, so the RNN output is used directly as class scores
model = torch.nn.RNN(input_size=5, hidden_size=3, batch_first=True)
output, _ = model(data) # (batch x seq_len x number_of_classes)
# reshape output and target for cross entropy loss
output = output.reshape(-1, output.size(-1)) # (batch * seq_len x classes)
target = target.reshape(-1) # (batch * seq_len), class index
criterion = torch.nn.CrossEntropyLoss()
loss = criterion(output, target)
Is this correct? Does reshape() do what I want here, i.e. keep the output of every time step aligned with its target?
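One sanity check I could think of: as far as I understand the docs, CrossEntropyLoss also accepts an input of shape (N x C x d1) together with a target of shape (N x d1). So if I move the class dimension to position 1 with permute() and leave the target as (batch x seq_len), I would expect the same loss as with the flattened version. A minimal sketch of that comparison, assuming the default reduction='mean' (the seed is only there to make the run repeatable):

```python
import torch

torch.manual_seed(0)  # only for repeatability

data = torch.rand(3, 10, 5)            # (batch x seq_len x features)
target = torch.randint(0, 3, (3, 10))  # (batch x seq_len), class index per timestep
model = torch.nn.RNN(input_size=5, hidden_size=3, batch_first=True)
output, _ = model(data)                # (batch x seq_len x classes)
criterion = torch.nn.CrossEntropyLoss()

# variant 1: flatten batch and time into one dimension
loss_flat = criterion(output.reshape(-1, output.size(-1)), target.reshape(-1))

# variant 2: keep batch and time separate, class dimension moved to dim 1
loss_perm = criterion(output.permute(0, 2, 1), target)  # (batch x classes x seq_len)

same = torch.allclose(loss_flat, loss_perm)
print(loss_flat.item(), loss_perm.item(), same)
```

If the two values agree, I would take that as a sign that the reshape keeps outputs and targets paired up correctly.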
Thanks for your help!