I’m building a multiclass classification model using a GRU. I’m struggling to get my head around how to shape the output such that the loss can be calculated.
The output shape in my batch is [342, 51], whereas the shape of the label is [32, 51], which means the loss can’t be calculated. Any pointers would be very helpful - thank you.
The LSTM layer takes a tensor of shape (seq_len, batch, features), so to comply with this you have to call the lstm with “self.lstm(embed_out.transpose(0, 1))”, unless your input is already in the shape (seq_len, batch, features) or you have defined the LSTM with “batch_first=True”. I don’t know why you are working on hidden1, but usually you take “out = self.sigmoid(self.linear(out)).reshape(-1, class_num)”, so your out has the shape (seq_len*batch, class_num).
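To make the shapes concrete, here is a minimal sketch of that pattern with hypothetical sizes (I’m using a GRU as in your original post; seq_len=311 is just a guess inferred from your later 9952 = 311 × 32 output, and embed_dim/hidden are made up):

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only
seq_len, batch, embed_dim, hidden, class_num = 311, 32, 100, 128, 51

gru = nn.GRU(embed_dim, hidden)        # default expects (seq_len, batch, features)
linear = nn.Linear(hidden, class_num)

# Embedding layers are usually batch-first: (batch, seq_len, embed_dim)
embed_out = torch.randn(batch, seq_len, embed_dim)

# transpose(0, 1) turns it into (seq_len, batch, embed_dim) for the RNN
out, _ = gru(embed_out.transpose(0, 1))            # -> (seq_len, batch, hidden)
out = torch.sigmoid(linear(out)).reshape(-1, class_num)
print(out.shape)                                   # (seq_len * batch, class_num)
```

Alternatively, constructing the RNN with `nn.GRU(embed_dim, hidden, batch_first=True)` lets you skip the transpose and feed `embed_out` directly.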
Thanks, this is helpful. I still can’t align the final shape, however. The target batch size is 32 with 51 classes, and I can’t figure out how to get the output of the model to match that matrix. After making your suggested changes I’m getting a [9952, 51] output - are there additional steps I’m missing to get the model output to match the target shape?
Well, your target shape isn’t going to be (32, 51) unless the length of the sequences is 1. Perhaps you just want “out = self.sigmoid(self.linear(out[-1,:,:])).reshape(-1, class_num)”, which takes just the last element of each sequence, giving a shape of (batch, class_num).
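As a quick sketch of what indexing the last time step does to the shapes (same hypothetical sizes as before, not your actual model):

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only
seq_len, batch, embed_dim, hidden, class_num = 311, 32, 100, 128, 51

gru = nn.GRU(embed_dim, hidden)
linear = nn.Linear(hidden, class_num)

x = torch.randn(seq_len, batch, embed_dim)   # (seq_len, batch, features)
out, _ = gru(x)                              # (seq_len, batch, hidden)

last = out[-1, :, :]                         # last time step only: (batch, hidden)
logits = torch.sigmoid(linear(last))         # (batch, class_num) == (32, 51)
print(logits.shape)
```

This gives one prediction per sequence, which matches a (32, 51) target. Note that if your 51 classes are mutually exclusive and you use nn.CrossEntropyLoss, you would pass the raw linear output (no sigmoid) and a (batch,)-shaped tensor of class indices instead.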