I want to get the gradient of each sample's loss w.r.t. each parameter.
- total number of samples: N
- mini-batch size: B
- parameter (filter) shape: [C_in, C_out, w, h]
I want the gradients for each sample, that is:
- gradients shape: [N, C_in, C_out, w, h]
To get the gradients for all samples, I tried to compute the gradients for each mini-batch:
- gradients shape: [B, C_in, C_out, w, h]
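
For concreteness, here is the minimal setup I'll assume below (the toy model, the layer sizes, and all variable names are placeholders I picked for illustration; note that PyTorch's nn.Conv2d actually stores its weight as [C_out, C_in, h, w]):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N, B = 1000, 32                  # total samples, mini-batch size (placeholders)
C_in, C_out, w, h = 3, 16, 3, 3  # channels and kernel size (placeholders)
num_classes = 10

# Toy model: one conv layer + global average pooling + linear head.
conv = nn.Conv2d(C_in, C_out, kernel_size=(h, w))
head = nn.Linear(C_out, num_classes)

def forward(x):                                # x: [B, C_in, H, W]
    feat = F.relu(conv(x)).mean(dim=(2, 3))    # global-pool to [B, C_out]
    return F.log_softmax(head(feat), dim=1)    # pred: [B, num_classes]
```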
However, after I run

loss = torch.sum(-onehot * pred, dim=1)        # onehot: [B, num_classes], pred: [B, num_classes]
loss.backward(gradient=torch.ones_like(loss))
I found that the gradient of the loss w.r.t. the parameters has shape [C_in, C_out, w, h], not [B, C_in, C_out, w, h]. That is, the per-sample gradients are summed (accumulated) over the batch.
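
Here is a minimal reproduction using the toy setup above (the input spatial size and the one-hot construction are arbitrary choices of mine):

```python
x = torch.randn(B, C_in, 8, 8)                 # dummy mini-batch
onehot = F.one_hot(torch.randint(0, num_classes, (B,)),
                   num_classes).float()        # [B, num_classes]

pred = forward(x)                              # [B, num_classes]
loss = torch.sum(-onehot * pred, dim=1)        # per-sample loss: [B]
loss.backward(gradient=torch.ones_like(loss))  # equivalent to loss.sum().backward()

# The batch dimension is gone: autograd sums the per-sample gradients.
print(conv.weight.grad.shape)                  # torch.Size([16, 3, 3, 3])
```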
How can I get the gradient for each sample? Is looping over the samples the only way (sketched below)?
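
For reference, this is the naive for-loop version I would like to avoid: one backward pass per sample, which gives the right shape but is roughly B times slower (names continue from the snippets above):

```python
per_sample_grads = []
for i in range(B):
    conv.zero_grad()
    head.zero_grad()
    pred_i = forward(x[i:i + 1])                   # keep a batch dim of 1
    loss_i = torch.sum(-onehot[i:i + 1] * pred_i)  # scalar loss for sample i
    loss_i.backward()
    per_sample_grads.append(conv.weight.grad.clone())

grads = torch.stack(per_sample_grads)              # [B, C_out, C_in, h, w]
print(grads.shape)                                 # torch.Size([32, 16, 3, 3, 3])
```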