First of all, my apologies if my question seems trivial or my English is not good enough.
I am following the name classification tutorial here: https://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html
My first question is about the call to the loss function, criterion(output, category_tensor), in:
def train(category_tensor, line_tensor):
    hidden = rnn.initHidden()

    rnn.zero_grad()

    # Feed the name to the network one letter at a time
    for i in range(line_tensor.size()[0]):
        output, hidden = rnn(line_tensor[i], hidden)

    loss = criterion(output, category_tensor)
    loss.backward()

    # Add parameters' gradients to their values, multiplied by learning rate
    for p in rnn.parameters():
        p.data.add_(p.grad.data, alpha=-learning_rate)

    return output, loss.item()
As far as I can tell, output has dimension 18x1 (because there are 18 classes), and category_tensor is a single-valued tensor containing the integer label of the class.
Is this the only valid form of target for this call, or can I pass a vector of per-class values as category_tensor instead? I couldn’t really understand the documentation that I found here (https://pytorch.org/docs/stable/nn.html#nllloss). I may have misunderstood it: I tried the following modification, but it doesn’t seem to work.
def randomTrainingExample():
    category = randomChoice(all_categories)
    line = randomChoice(category_lines[category])
    # attempting to use one hot encoded value like SoftMax, but with the log value
    category_tensor = nn.LogSoftmax()(torch.tensor((np.array(all_categories) == category).astype(np.int), dtype=torch.float))
    line_tensor = lineToTensor(line)
    return category, line, category_tensor, line_tensor
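To make the question concrete, here is a minimal sketch of how I currently understand the NLLLoss call. It is my own illustration, not code from the tutorial: the random logits stand in for the network output, the class index 3 is arbitrary, and I am assuming the output is laid out as (batch, classes), i.e. 1 x 18 for a single name. The last lines check by hand that an integer target gives the same value as weighting the log-probabilities with a one-hot vector.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_categories = 18  # number of classes, as in the tutorial

# Stand-in for the network output: log-probabilities, shape (batch, classes)
logits = torch.randn(1, n_categories)
output = nn.LogSoftmax(dim=1)(logits)

# Target as an integer class index, shape (batch,), dtype long
category_tensor = torch.tensor([3], dtype=torch.long)

criterion = nn.NLLLoss()
loss = criterion(output, category_tensor)

# Manual check 1: the loss is the negated log-probability of the target class
manual = -output[0, 3]

# Manual check 2: the same value via an explicit one-hot vector
one_hot = F.one_hot(category_tensor, num_classes=n_categories).float()
manual_one_hot = -(one_hot * output).sum(dim=1).mean()

print(loss.item(), manual.item(), manual_one_hot.item())
```

If this is right, the integer label already carries the same information as a one-hot vector, and the one-hot weighting would have to be done by hand rather than passed as the target.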
Secondly, in the section “Turning Names into Tensors”, is the batch dimension the second dimension? The tutorial says:
To make a word we join a bunch of those into a 2D matrix
<line_length x 1 x n_letters>.
That extra 1 dimension is because PyTorch assumes everything is in batches - we’re just using a batch size of 1 here.
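As I understand it, the quoted shape can be reproduced with a sketch like the one below. The helper name line_to_tensor and the alphabet are my own assumptions modelled on the tutorial's description, not a copy of its code.

```python
import string
import torch

# Assumed alphabet: ASCII letters plus a few punctuation characters
all_letters = string.ascii_letters + " .,;'"
n_letters = len(all_letters)

def line_to_tensor(line):
    """One-hot encode a name as a <line_length x 1 x n_letters> tensor."""
    tensor = torch.zeros(len(line), 1, n_letters)
    for i, letter in enumerate(line):
        # Dimension 0: position in the name, dimension 1: batch (always 0 here)
        tensor[i][0][all_letters.find(letter)] = 1
    return tensor

t = line_to_tensor("Jones")
print(t.shape)
```

Here the first dimension indexes the letters of the name and the second, of size 1, is the batch.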
If I’m not mistaken, for image data the batch dimension comes first, which I find more intuitive:
size_of_training_set = batch_size x number_of_channels x image_height x image_width
Can we put the batch dimension first here as well?
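For illustration, this is the kind of layout change I have in mind. Note the assumption: the tutorial uses a hand-written RNN module, whereas this sketch swaps in the built-in nn.RNN, whose batch_first flag controls whether the input is read as (seq, batch, feature) or (batch, seq, feature). The sizes are made up for the example.

```python
import torch
import torch.nn as nn

line_length, batch_size, n_letters, n_hidden = 5, 1, 57, 128

# Sequence-first layout, as in the tutorial: <line_length x batch x n_letters>
seq_first = torch.zeros(line_length, batch_size, n_letters)

# Batch-first layout: <batch x line_length x n_letters>
batch_first = seq_first.transpose(0, 1)

rnn_seq_first = nn.RNN(n_letters, n_hidden)                      # default layout
rnn_batch_first = nn.RNN(n_letters, n_hidden, batch_first=True)  # batch dimension first

out_a, h_a = rnn_seq_first(seq_first)
out_b, h_b = rnn_batch_first(batch_first)

print(out_a.shape, out_b.shape, h_a.shape, h_b.shape)
```

With batch_first=True only the input and output tensors change layout. The hidden state keeps the shape (num_layers, batch, hidden_size) in both cases.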
Thank you very much.