Need help racking my brain on batch_size

@bfeeny

I made this up. See if it helps; if not, please feel free to ignore it or flag it for deletion.

# I assumed many-to-many classification
# num_classes = 50
# input_size = 30495
# hidden_size = 128
# batch_size = 20
# num_layers = 2

# model = LSTM(num_classes, input_size, hidden_size, num_layers, batch_size)
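
For concreteness, here is a minimal sketch of what I'm assuming that LSTM module looks like (the class name, constructor arguments, and `batch_first=True` are all my assumptions, not necessarily your actual code):

```python
import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, num_classes, input_size, hidden_size, num_layers, batch_size):
        super().__init__()
        self.batch_size = batch_size
        # batch_first=True so inputs/outputs are (batch, seq_len, features)
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        lstm_output, (h_n, c_n) = self.lstm(x)
        # lstm_output: (batch, seq_len, hidden_size)
        return self.fc(lstm_output)  # (batch, seq_len, num_classes)
```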

# Input to LSTM = (20, 28, 30495) (batch, seq_len, input_size) -- assuming batch_first=True
# lstm_output shape = (20, 28, 128) (batch, seq_len, num_directions * hidden_size)
# there's no need to flatten lstm_output
# Let's feed it as-is to the feed-forward layer --> nn.Linear(hidden_size, num_classes)
# hidden_size is H_in and num_classes is H_out (see the documentation for nn.Linear)
# input to nn.Linear = (20, 28, 128) (N, *, H_in)
# output will be = (20, 28, 50) (N, *, H_out)
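
If you want to sanity-check those shapes, a quick forward pass on random data with the sketch above prints them:

```python
model = LSTM(num_classes=50, input_size=30495, hidden_size=128, num_layers=2, batch_size=20)
x = torch.randn(20, 28, 30495)   # (batch, seq_len, input_size)
out = model(x)
print(out.shape)                 # torch.Size([20, 28, 50])
```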

# Now, try to make sense of this output shaped (20, 28, 50): 
# You have a batch of 20 samples.
# Each sample is a sequence of length 28 (corresponding to each day of the 2 or 4 weeks, or whatever your time window is)
# For each day you had 30495 features (initially) and now you ended up with 50 features.
# This number 50 corresponds to the number of classes (remember, H_out)
# So, when you feed this to CrossEntropyLoss (for multiclass classification),
# note that you want to calculate 28 losses, one for each time step.
# The input shape to the CELoss for one timestep will be (20, 50) (N, C)
# See the documentation for nn.CrossEntropyLoss for more info on that.
# How to calculate all 28 losses? I don't know. My DL knowledge only goes that far. :)
# For more on that, I'll refer you here https://discuss.pytorch.org/t/rnn-for-many-to-many-classification-task/15457/3
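
That said, one pattern I believe works (assuming your targets are class indices per day, shaped (batch, seq_len)) is to let nn.CrossEntropyLoss handle all 28 timesteps at once: it accepts an input of shape (N, C, d1), so you can permute the (20, 28, 50) output to (20, 50, 28). A rough sketch, reusing `out` from above:

```python
criterion = nn.CrossEntropyLoss()

# assumed targets: one class index (0..49) per sample per timestep
targets = torch.randint(0, 50, (20, 28))   # (batch, seq_len)

# nn.CrossEntropyLoss wants the class dimension second, i.e. (N, C, d1),
# so move num_classes in front of seq_len before computing the loss
loss = criterion(out.permute(0, 2, 1), targets)   # averages over all 20 * 28 predictions
loss.backward()
```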

To the community at large, feel free to add to my response or correct it if there’s anything wrong. I am happy to learn. :slight_smile: