As an example, say I have a model which maps a single x of shape (100,) to a single y of some size:
y = model(x)
The loss is
loss = loss_fn(y, target)
Now, I feed the model a batch composed of a few stacked x vectors:
x_batch = torch.stack((x_1, x_2, x_3), 0)
# have not tested this.. meant to be a variable with shape (3, 100)
# (torch.cat((x_1, x_2, x_3), 0) on 1-D vectors would give shape (300,), so stack seems right)
y_batch = model(x_batch)
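For concreteness, here is roughly the shape check I have in mind, using a stand-in nn.Linear(100, 10) as the model (an assumption, since my real model isn't shown):

```python
import torch
import torch.nn as nn

# Stand-in model (assumption): any module mapping 100 features to 10 would do
model = nn.Linear(100, 10)

x_1, x_2, x_3 = torch.randn(100), torch.randn(100), torch.randn(100)

# torch.stack adds a new batch dimension, giving shape (3, 100);
# torch.cat along dim 0 would instead concatenate into shape (300,)
x_batch = torch.stack((x_1, x_2, x_3), 0)
print(x_batch.shape)   # torch.Size([3, 100])

# The model treats the leading dimension as the batch dimension
y_batch = model(x_batch)
print(y_batch.shape)   # torch.Size([3, 10])
```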
When I calculate the loss,
loss = loss_fn(y_batch, target_batch)
How is it that loss_fn knows to calculate the loss independently between each of the three vectors within the batch, instead of between the higher-dimensional variable as a whole?
How does the loss function know when it’s dealing with a batch of outputs and targets, instead of one output and one target?
Struggling to articulate the question… I hope it makes sense!
I've read http://pytorch.org/docs/0.3.0/_modules/torch/nn/modules/loss.html a bit. Maybe what I'm really asking is: is there a major difference between reduce=True and reduce=False within a loss fn?
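To make the question concrete, this is the experiment I'd run (note: I believe in newer PyTorch versions the reduce flag was folded into a reduction= argument, where reduction='none' corresponds to reduce=False; the loss function and shapes here are just placeholders):

```python
import torch
import torch.nn as nn

y_batch = torch.randn(3, 10)
target_batch = torch.randn(3, 10)

# Default reduction ('mean', i.e. reduce=True) averages over every
# element in the batch and returns a single scalar
loss = nn.MSELoss()(y_batch, target_batch)
print(loss.shape)       # torch.Size([])

# reduction='none' (reduce=False in 0.3.x) keeps the per-element losses
per_elem = nn.MSELoss(reduction='none')(y_batch, target_batch)
print(per_elem.shape)   # torch.Size([3, 10])

# From there you could average each row to get one loss per sample
per_sample = per_elem.mean(dim=1)
print(per_sample.shape) # torch.Size([3])
```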