In each iteration, AdaBoost decreases the weights of the training samples that were classified correctly while keeping (and, after normalization, relatively increasing) the weights of the misclassified ones, so that correctly classified samples have less influence than misclassified ones. The model in the next iteration is therefore more adapted to the previously misclassified samples.
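The reweighting step described above can be sketched roughly as follows (a simplified, binary, SAMME-style update; the function and variable names are illustrative, not from any particular library):

```python
import numpy as np

def update_weights(w, y_true, y_pred):
    # weighted error rate of the current weak learner
    err = np.sum(w * (y_pred != y_true)) / np.sum(w)
    # learner importance: larger when the error rate is lower
    alpha = 0.5 * np.log((1 - err) / err)
    # increase the weights of misclassified samples,
    # decrease the weights of correctly classified ones
    w = w * np.exp(alpha * np.where(y_pred != y_true, 1.0, -1.0))
    # renormalize so the weights sum to 1
    return w / w.sum()

w = np.array([0.2, 0.2, 0.2, 0.2, 0.2])
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0])  # sample at index 2 is misclassified
w = update_weights(w, y_true, y_pred)
# the misclassified sample now carries a larger weight than the others
```

After the update, the weight vector from the question ([0.2, 0.1, 0.2, 0.4, 0.1]) would play the role of `w` in the next training round.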
Could you please tell me what to do once I have a weight for each sample? Say I have 5 samples with the following weights: [0.2, 0.1, 0.2, 0.4, 0.1]. Do I need to change the loss function?
Thank you in advance!
I’m not sure which AdaBoost implementation you are using, but I assume you are manually calculating the weights and would then like to apply them to all samples.
In this case, you could calculate the unreduced loss by setting reduction='none' when instantiating the loss function, apply the corresponding weight to each sample (make sure you are not losing the sample–weight correspondence via shuffling), and reduce the loss before calculating the gradients in the backward pass.
Hi @ptrblck, thank you for your response. Could you give me a piece of code showing how to implement it? What confuses me is that in the documentation, the weight argument is applied to each class rather than to each sample. Thank you once again!
I was thinking about multiplying the loss with the weights as seen here:
import torch
import torch.nn as nn

output = torch.randn(10, 10, requires_grad=True)
target = torch.randint(0, 10, (10,))
criterion = nn.CrossEntropyLoss(reduction='none')
loss = criterion(output, target)  # per-sample losses, shape: [batch_size]
weight = torch.rand(10)  # grab your weights for the current samples
loss = (loss * weight).mean()  # reduce before the backward pass
loss.backward()
Thanks! So based on your example, I assume the total number of samples is 10, right? What if the total amount of data exceeds the batch size?
Yes, 10 is the batch size in my example.
You would need to map the weights to each sample as described before.
I don’t know how you are creating these weights, what your dataset looks like, etc., but you could e.g. use the index of each sample returned by a custom Dataset to look up the weight for the current sample, or you could create a new custom Dataset after computing the weights, which would return the data, target, and weight.
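A minimal sketch of the second option (the `WeightedDataset` name and the random data are just placeholders for illustration):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class WeightedDataset(Dataset):
    # hypothetical wrapper pairing each sample with its boosting weight
    def __init__(self, data, targets, weights):
        self.data = data
        self.targets = targets
        self.weights = weights

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # returning the weight together with the sample keeps the
        # sample/weight correspondence intact even with shuffle=True
        return self.data[idx], self.targets[idx], self.weights[idx]

data = torch.randn(100, 3, 32, 32)      # stand-in for your images
targets = torch.randint(0, 10, (100,))
weights = torch.rand(100)               # one AdaBoost weight per sample
loader = DataLoader(WeightedDataset(data, targets, weights),
                    batch_size=32, shuffle=True)

for x, y, w in loader:
    pass  # w always has the same length as x, even in the last, smaller batch
```

Inside the training loop you would then compute the unreduced loss and multiply it by `w` as shown in the earlier snippet.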
Actually, I’d like to implement an incremental learning system for CNNs based on AdaBoost for medical images, so all of the weights come from the AdaBoost technique.
I have already sliced the weights according to the batch size for each training loop. However, I get the following error at (loss = loss * weight) at the end of the loop:
RuntimeError: The size of tensor a (32) must match the size of tensor b (15) at non-singleton dimension 0
It seems the size of the weight tensor (15) does not match the size of the loss tensor (32).
Check the shapes of the weight and loss tensors and make sure you can multiply them. Based on the error it seems you are using a batch size of 32 for the model output, while weight contains only 15 values.
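This kind of mismatch often comes from slicing the weights with a fixed batch size while the actual batch (e.g. the last one in the epoch) is smaller, or from an off-by-one in the slice boundaries. A small sketch of slicing by the actual batch boundaries (the per-sample losses are faked with random values here, just to show the shape check):

```python
import torch

batch_size = 32
all_weights = torch.rand(100)  # one weight per training sample

for start in range(0, len(all_weights), batch_size):
    # slice by the actual batch boundaries instead of assuming that
    # every batch contains exactly batch_size samples
    w = all_weights[start:start + batch_size]
    loss = torch.rand(len(w))  # stand-in for the per-sample losses
    assert loss.shape == w.shape  # catch mismatches before multiplying
    weighted = (loss * w).mean()
```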
Hi @ptrblck, that problem has already been solved. May I ask why it should be loss.mean() before backward()? I usually sum the losses over all batches in one epoch and then calculate the mean, rather than taking the mean of each batch directly.
mean reduction would be the common approach, but you are of course free to reduce the loss according to your use case.
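One detail worth keeping in mind when choosing the reduction: the mean of per-batch means only equals the overall mean when all batches have the same size. A tiny illustration (the numbers are arbitrary):

```python
import torch

losses = torch.arange(10, dtype=torch.float32)  # per-sample losses for one epoch
batches = [losses[0:4], losses[4:8], losses[8:10]]  # last batch is smaller

overall_mean = losses.mean()
mean_of_batch_means = torch.stack([b.mean() for b in batches]).mean()
# the two values disagree because the last batch contributes
# the same share to mean_of_batch_means despite having fewer samples
```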
Hi @bryan123, boosting is an additive model for minimizing a loss function.
Minimizing the loss of a CNN does basically the same thing: it changes the effective weight of each sample automatically (correctly classified samples contribute a lower loss).
Read the XGBoost paper for more insight.
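To illustrate the point above: cross-entropy already contributes a small loss (and hence small gradients) for samples the network classifies confidently, and a large loss for misclassified ones, so plain gradient descent already focuses on the hard samples. A minimal example with hand-picked logits:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss(reduction='none')
# logits for two samples of class 0: the first is confidently correct,
# the second is confidently wrong
logits = torch.tensor([[5.0, 0.0],
                       [0.0, 5.0]])
target = torch.tensor([0, 0])
loss = criterion(logits, target)
# loss[0] is close to 0, loss[1] is large, so the optimizer
# effectively concentrates on the misclassified sample
```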
Hi @mMagmer, could you show me, in a piece of code, how to apply per-sample weights when minimizing the loss of a CNN model?