efficientnet_b0 and efficientnet_b4 models, but x takes out_features from the linear layer of the efficientnet instead of in_features
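(For reference, a minimal sketch assuming torchvision's efficientnet_b0/efficientnet_b4, where the final linear layer sits in classifier[1], so in_features can be read before the head is replaced:)

import torch
import torch.nn as nn
from torchvision import models

modelA = models.efficientnet_b0(pretrained=True)
modelB = models.efficientnet_b4(pretrained=True)

# Read in_features from the last linear layer *before* replacing the head.
in_feats_a = modelA.classifier[1].in_features  # 1280 for efficientnet_b0
in_feats_b = modelB.classifier[1].in_features  # 1792 for efficientnet_b4

# Replace the classifiers so each model returns its pooled features.
modelA.classifier = nn.Identity()
modelB.classifier = nn.Identity()

x = torch.randn(2, 3, 224, 224)
feats = torch.cat((modelA(x), modelB(x)), dim=1)
print(feats.shape)  # [2, in_feats_a + in_feats_b] = [2, 3072]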
ptrblck this is amazing… A way to truncate the last layer off the sub-model, which works well in my case. The other two ways I also learnt from you, so amazing all round, but I needed this way.
When I turn the sub-model into a nn.Sequential block,
self.modelA = torch.nn.Sequential(*list(modelA.children())[:-1])
then I lose critical information from the modelA forward pass, where it restructures the data between layers…
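(This is the usual culprit: children() only returns registered submodules, so any functional call inside forward(), e.g. a flatten or reshape between layers, is dropped. A tiny sketch, using a torchvision resnet50 purely for illustration:)

import torch
import torch.nn as nn
from torchvision import models

modelA = models.resnet50(pretrained=True)

# children() only returns registered submodules; functional calls inside
# forward() (e.g. torch.flatten between avgpool and fc) are not included.
truncated = nn.Sequential(*list(modelA.children())[:-1])

x = torch.randn(2, 3, 224, 224)
print(truncated(x).shape)  # [2, 2048, 1, 1] -- the flatten step is gone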
If I put a hook in,
self.modelA.FCblock[2].register_forward_hook(get_activation('1FC'))
x1 = self.modelA(x) ## x1 itself is unused; I only need the hooked activation
x1FC = activation['1FC']
...
x = torch.cat((x1FC, x2FC), dim=1)
then the forward pass works, but it breaks the gradients in my final ensemble model. I am ultimately working to recreate the vanilla backpropagation explainability method to visualise the gradients on my input data… so I need the gradients to flow, and the hook disrupts them.
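(As an aside, if get_activation follows the common pattern of storing output.detach(), the detach itself is what cuts the graph; a sketch of a hook that stores the raw output instead, so gradients can still flow through the stored activation:)

activation = {}

def get_activation(name):
    def hook(module, inp, out):
        # No .detach() here: the stored tensor stays attached to the
        # autograd graph, so backward() can still flow through it.
        activation[name] = out
    return hook

# Hypothetical usage mirroring the snippet above:
# self.modelA.FCblock[2].register_forward_hook(get_activation('1FC'))
# x1FC = activation['1FC']  # differentiable intermediate activation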
But
self.modelA = modelA
self.modelA.fc = nn.Identity()
seems to be working wonders (for now!), very creative
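(For completeness, this is why the nn.Identity() approach keeps the explainability use case working: nothing is detached along the way, so gradients reach the input. A quick check, again using a torchvision resnet50 for illustration:)

import torch
import torch.nn as nn
from torchvision import models

modelA = models.resnet50(pretrained=True)
modelA.fc = nn.Identity()  # forward() now returns the 2048-dim pooled features

x = torch.randn(2, 3, 224, 224, requires_grad=True)
modelA(x).sum().backward()  # nothing is detached along the way
print(x.grad.shape)         # [2, 3, 224, 224] -- gradients reach the input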
Excuse me, why did you write
2048+512
Because the models used:
modelA = models.resnet50(pretrained=True)
modelB = models.resnet18(pretrained=True)
output activation tensors in the shapes [batch_size, 2048] and [batch_size, 512], respectively. You can of course just pass the actual sum to in_features, but I thought seeing these activation shapes separately would clarify how the concatenated feature shape is calculated.
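A minimal sketch of how those shapes combine (the class and forward details here are illustrative, not necessarily identical to the snippet discussed earlier in the thread):

import torch
import torch.nn as nn
from torchvision import models

class MyEnsemble(nn.Module):
    def __init__(self, modelA, modelB, nb_classes=10):
        super().__init__()
        self.modelA = modelA
        self.modelB = modelB
        # Drop the original classification heads.
        self.modelA.fc = nn.Identity()
        self.modelB.fc = nn.Identity()
        # resnet50 outputs 2048 features, resnet18 outputs 512,
        # so the concatenated feature vector has 2048 + 512 entries.
        self.classifier = nn.Linear(2048 + 512, nb_classes)

    def forward(self, x):
        x1 = self.modelA(x)             # [batch_size, 2048]
        x2 = self.modelB(x)             # [batch_size, 512]
        x = torch.cat((x1, x2), dim=1)  # [batch_size, 2560]
        return self.classifier(x)

ensemble = MyEnsemble(models.resnet50(pretrained=True),
                      models.resnet18(pretrained=True),
                      nb_classes=10)
print(ensemble(torch.randn(2, 3, 224, 224)).shape)  # [2, 10]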
Thanks for replying. So for any two models, I need to remove the classifier layer from both, concatenate their features, and then add a new classifier layer, right?
Please, does nb_classes denote the number of classes that exist in my data?
def __init__(self, modelA, modelB, nb_classes=10):
I wouldn't say you need to do it this way, but it's certainly one approach.
Yes.
I appreciate your answer. What are the alternative approaches that I could look into?
Excuse me, does that code do ensemble learning, or does it remove the last layer from each model, concatenate the features, and then add a classifier, which is not an ensemble?
This approach should come close to a stacking ensemble.
Excuse me, is this right? I'm confused because when I read it, I thought the code above used feature concatenation, not ensembling.
Ensemble learning typically involves training multiple models independently and combining their predictions, whereas feature concatenation involves combining the extracted features from different models into a single feature vector before passing it through a classifier.
Use whatever works for you. My assumption is that predictions should be passed to a voting classifier, while features might be passed to a trainable classifier to further improve the performance.
However, your use case might differ, so use what works for you.
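To make the distinction concrete, a rough sketch of the two options (shapes assume the resnet50/resnet18 pair from above; the 10-class head is hypothetical):

import torch
import torch.nn as nn
from torchvision import models

modelA = models.resnet50(pretrained=True)
modelB = models.resnet18(pretrained=True)
x = torch.randn(2, 3, 224, 224)

# Option 1: classic ensembling -- keep both heads and combine predictions,
# e.g. soft voting over the 1000 ImageNet classes.
with torch.no_grad():
    voted = (modelA(x).softmax(dim=1) + modelB(x).softmax(dim=1)) / 2

# Option 2: feature concatenation -- strip the heads and train a new
# classifier on the joined feature vector (2048 + 512 = 2560).
modelA.fc = nn.Identity()
modelB.fc = nn.Identity()
classifier = nn.Linear(2048 + 512, 10)
logits = classifier(torch.cat((modelA(x), modelB(x)), dim=1))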