I need to ensemble only the features that help in classification, which I can extract from two different models. So I need to remove the last layer from each model first and then concatenate the features. What should the ensemble look like?
You could replace the last linear layers with nn.Identity modules and create the ensemble as described in this post.
Thanks a lot, I will try it. But then I shouldn’t write the line for the new classifier
self.classifier = nn.Linear(2048+512, nb_classes)
and this one either
x = self.classifier(F.relu(x))
Right?
Yes, you could add a new classifier using the output features of both feature extractors.
Excuse me, do you mean that I need to write those lines too? I thought I would only concatenate the features I get from both models, not build a classifier on top of them.
You don’t have to put a classifier on top of the features; it depends on your use case.
If you want to return the features only, don’t use the additional classifier. If you want to train another stage on top of the pretrained feature extractors, add the classifier.
import torch
import torch.nn as nn

class MyEnsemble(nn.Module):
    def __init__(self, modelA, modelB):
        super(MyEnsemble, self).__init__()
        self.modelA = modelA
        self.modelB = modelB
        # Remove last linear layer
        self.modelA.fc = nn.Identity()
        self.modelB.fc = nn.Identity()
        # Create new classifier
        # self.classifier = nn.Linear(2048+512, nb_classes)

    def forward(self, x):
        x1 = self.modelA(x.clone())  # clone to make sure x is not changed by inplace methods
        x1 = x1.view(x1.size(0), -1)
        x2 = self.modelB(x)
        x2 = x2.view(x2.size(0), -1)
        x = torch.cat((x1, x2), dim=1)
        # x = self.classifier(F.relu(x))
        return x

# Create models and load state_dicts
modelA = model1()
# Load state dicts
PATH1 = 'model1.pth'
modelA.load_state_dict(torch.load(PATH1))
modelB = torch.hub.load("model2")
model = MyEnsemble(modelA, modelB)
but got
TypeError: forward() takes 2 positional arguments but 3 were given
Your code works fine and I don’t know where the error is raised, as you are not calling the forward method in the posted code snippet:
modelA = nn.Linear(1, 1)
modelB = nn.Linear(1, 1)
model = MyEnsemble(modelA, modelB)
x = torch.randn(1, 1)
out = model(x)
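For reference, this kind of TypeError is usually raised when the module is called with more inputs than its forward method accepts. The following is only a minimal sketch, assuming the ensemble was called with two tensors somewhere outside the posted snippet; the TinyEnsemble name and the shapes are made up for illustration:

```python
import torch
import torch.nn as nn


class TinyEnsemble(nn.Module):
    # Hypothetical stand-in for MyEnsemble: forward accepts a single input
    def __init__(self, modelA, modelB):
        super().__init__()
        self.modelA = modelA
        self.modelB = modelB

    def forward(self, x):
        return torch.cat((self.modelA(x), self.modelB(x)), dim=1)


model = TinyEnsemble(nn.Linear(1, 1), nn.Linear(1, 1))
x = torch.randn(1, 1)

out = model(x)  # one input: works
print(out.shape)

try:
    model(x, x)  # two inputs: self plus two tensors are three positional arguments
except TypeError as e:
    error_message = str(e)
    print(error_message)
```

If the real call site passes something like model(data, target), either drop the extra argument or extend forward to accept it.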
Thanks a lot for your time and help. Excuse me, what if I have two models, where I should remove the last (classification) layer from one of them, while the other model needs to be set up with these lines:
modelB = ResNet50()
modelB.layers.pop()
modelB = Model(inputs=model.inputs, outputs=model.layers[-2].output)
Should the ensemble then look like this?
class Identity(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return x

class MyEnsemble(nn.Module):
    def __init__(self, modelA, modelB):
        super(MyEnsemble, self).__init__()
        self.modelA = modelA
        self.modelB = modelB
        # Remove last linear layer
        self.modelA.fc = nn.Identity()

    def forward(self, x):
        x1 = self.modelA(x.clone())  # clone to make sure x is not changed by inplace methods
        x1 = x1.view(x1.size(0), -1)
        x2 = self.modelB(x)
        x2 = x2.view(x2.size(0), -1)
        x = torch.cat((x1, x2), dim=1)
        return x

# Create models and load state_dicts
modelA= model1()
# Load state dicts
PATH1 = 'model1.pth'
modelA.load_state_dict(torch.load(PATH1))
modelB = ResNet50()
modelB.layers.pop()
modelB = Model(inputs=model.inputs, outputs=model.layers[-2].output)
model = MyEnsemble(modelA, modelB)
You should be able to use different architectures as long as the torch.cat operation is able to concatenate the output tensors from both models. I don’t know how your new model would be interpreted as it’s getting features from one model and predictions from another, but technically it should work.
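To make the torch.cat requirement concrete: with dim=1, both outputs need the same batch size, while the feature sizes may differ. A small sketch with random tensors standing in for the two model outputs (the 2048 and 512 sizes are taken from the classifier line earlier in the thread, the batch size is arbitrary):

```python
import torch

batch_size = 4
features_a = torch.randn(batch_size, 2048)  # stand-in for the output of modelA
features_b = torch.randn(batch_size, 512)   # stand-in for the output of modelB

# Concatenate along the feature dimension; the batch dimension must match
combined = torch.cat((features_a, features_b), dim=1)
print(combined.shape)

# A different batch size in one of the outputs cannot be concatenated along dim=1
try:
    torch.cat((features_a, torch.randn(batch_size + 1, 512)), dim=1)
    cat_failed = False
except RuntimeError:
    cat_failed = True
print(cat_failed)
```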
Are you seeing any issues using your code?
I got this error: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 5.80 GiB total capacity; 191.43 MiB already allocated; 16.56 MiB free; 202.00 MiB reserved in total by PyTorch)
This error points out that your GPU doesn’t have enough memory for the current use case and you would thus need to decrease the memory usage e.g. via lowering the batch size.
If a single sample already causes the OOM issue, it might be easier to call the models separately to reduce the memory usage and concatenate (and store) the features afterwards.
If you don’t want to train the ensemble, but only want to get the features, you should also wrap the code into with torch.no_grad() to save additional memory.
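A rough sketch of that approach, assuming small dummy models and random inputs in place of your real feature extractors and data (the extract_features helper is hypothetical, not a PyTorch function):

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two pretrained feature extractors
modelA = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16)).eval()
modelB = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4)).eval()


def extract_features(model, batches, device):
    """Run a single model over all batches without tracking gradients."""
    model = model.to(device)
    outputs = []
    with torch.no_grad():  # no autograd graph is stored, which saves memory
        for x in batches:
            feats = model(x.to(device))
            outputs.append(feats.flatten(1).cpu())  # keep results on the CPU
    model.cpu()  # move the model off the GPU before the next one is used
    return torch.cat(outputs, dim=0)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
batches = [torch.randn(2, 3, 8, 8) for _ in range(3)]  # dummy data, small batches

# Only one model is on the device at a time
feats_a = extract_features(modelA, batches, device)
feats_b = extract_features(modelB, batches, device)

features = torch.cat((feats_a, feats_b), dim=1)
print(features.shape, features.requires_grad)
```

Both models have to see the samples in the same order, otherwise the concatenated rows would mix features from different inputs.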
Thanks a lot, but how can I concatenate the files? I’m already storing the features of both models in two separate files, but I don’t know how to merge them. How should I start, please?
After storing the features, you could torch.load them and use torch.cat in the same way as in your ensemble, which is currently running out of memory.
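Something along these lines should work, assuming each file was written with torch.save and contains a single [num_samples, num_features] tensor with the samples in the same order. The file names and the random stand-in features are made up for this sketch:

```python
import os
import tempfile

import torch

tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, "features_a.pt")  # hypothetical file names
path_b = os.path.join(tmp_dir, "features_b.pt")

# Stand-ins for the features you already extracted with each model
torch.save(torch.randn(10, 2048), path_a)
torch.save(torch.randn(10, 512), path_b)

# Load both files on the CPU and merge them along the feature dimension
feats_a = torch.load(path_a, map_location="cpu")
feats_b = torch.load(path_b, map_location="cpu")
assert feats_a.size(0) == feats_b.size(0), "both files must hold the same samples"

merged = torch.cat((feats_a, feats_b), dim=1)
print(merged.shape)

merged_path = os.path.join(tmp_dir, "features_merged.pt")
torch.save(merged, merged_path)
```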
Thanks for your help. I saved the features in a pickle file using dump, but ran into a problem when I loaded it with torch.load:
torch.load pickle runtime error: invalid magic number; corrupt file ?
Is there any other way to concatenate the pickles, please?
I appreciate your time and help.
The file might be corrupt, so try to save it again (and maybe load it directly afterwards to verify that you can indeed load the file properly).
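Another possible cause, which is only an assumption since the saving code wasn’t posted, is that the file was written with pickle.dump and then read with torch.load. The two use different file formats, so it is safer to keep the save and load calls as matching pairs. A small sketch with a random tensor standing in for your features and temporary file names chosen for illustration:

```python
import os
import pickle
import tempfile

import torch

features = torch.randn(10, 2560)  # stand-in for the concatenated features
tmp_dir = tempfile.mkdtemp()

# Option 1: torch.save paired with torch.load
pt_path = os.path.join(tmp_dir, "features.pt")
torch.save(features, pt_path)
reloaded_pt = torch.load(pt_path)  # load right away to verify the file
print(torch.equal(features, reloaded_pt))

# Option 2: pickle.dump paired with pickle.load, both in binary mode
pkl_path = os.path.join(tmp_dir, "features.pkl")
with open(pkl_path, "wb") as f:
    pickle.dump(features, f)
with open(pkl_path, "rb") as f:
    reloaded_pkl = pickle.load(f)
print(torch.equal(features, reloaded_pkl))
```

Files that were already written with pickle.dump can be read with pickle.load and concatenated with torch.cat afterwards.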