Dear @Soumya_Kundu
Thank you for your answer.
Not quite. My main goal is to split my model into a self.features part and a self.classifier part, where self.features is the pretrained InceptionV3 architecture and self.classifier is a stack of fully connected layers that further processes the InceptionV3 feature vector.
My whole code looks like this:
import torch
import torch.nn as nn
from torchvision import models


class CustomModel(nn.Module):
    def __init__(self, pretrained_model, num_classes):
        super(CustomModel, self).__init__()
        self.pretrained_model = pretrained_model
        self.num_classes = num_classes

        # Load the pretrained model
        if self.pretrained_model.lower() == 'inceptionv3':
            self.base_model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT, aux_logits=True)
        elif self.pretrained_model.lower() == 'resnet18':
            self.base_model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        elif self.pretrained_model.lower() == 'resnet34':
            self.base_model = models.resnet34(weights=models.ResNet34_Weights.DEFAULT)
        elif self.pretrained_model.lower() == 'resnet152':
            self.base_model = models.resnet152(weights=models.ResNet152_Weights.DEFAULT)

        # Drop the last classification layer and get the number of input features for the custom classifier
        self.features = nn.Sequential(*list(self.base_model.children())[:-1])
        num_in_features = self.base_model.fc.in_features
        layer_length = [int(num_in_features / (2 ** idx)) for idx in range(4)]

        # Add custom dense layers for classification
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(layer_length[0], layer_length[1]),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(layer_length[1], layer_length[2]),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(layer_length[2], layer_length[3]),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(layer_length[3], self.num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x


if __name__ == '__main__':
    inc3 = CustomModel('inceptionV3', 2)
    input_tensor = torch.randn(2, 3, 299, 299)
    output = inc3(input_tensor)
The problem is that when I run this code, the error mentioned above occurs: a tensor of shape (2, 1000) is passed as input to a convolutional layer inside the InceptionV3 architecture.
The error originates inside the InceptionV3 part, because if I print the output shape after every layer I get the following:
output shape of layer 0: torch.Size([2, 32, 149, 149])
output shape of layer 1: torch.Size([2, 32, 147, 147])
output shape of layer 2: torch.Size([2, 64, 147, 147])
output shape of layer 3: torch.Size([2, 64, 73, 73])
output shape of layer 4: torch.Size([2, 80, 73, 73])
output shape of layer 5: torch.Size([2, 192, 71, 71])
output shape of layer 6: torch.Size([2, 192, 35, 35])
output shape of layer 7: torch.Size([2, 256, 35, 35])
output shape of layer 8: torch.Size([2, 288, 35, 35])
output shape of layer 9: torch.Size([2, 288, 35, 35])
output shape of layer 10: torch.Size([2, 768, 17, 17])
output shape of layer 11: torch.Size([2, 768, 17, 17])
output shape of layer 12: torch.Size([2, 768, 17, 17])
output shape of layer 13: torch.Size([2, 768, 17, 17])
output shape of layer 14: torch.Size([2, 768, 17, 17])
output shape of layer 15: torch.Size([2, 1000])
Followed by the error message:
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [2, 1000]
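For reference, the per-layer shapes above can be reproduced with a loop along these lines (a minimal sketch that steps through the children of self.features one by one; it hits the same conv2d error right after the last printed line):

# Minimal sketch: feed the output of each child of self.features into the next
# one and print the resulting shape; the loop fails at the layer following the
# last printed shape with the conv2d error quoted above.
with torch.no_grad():
    out = input_tensor
    for idx, layer in enumerate(inc3.features):
        out = layer(out)
        print(f'output shape of layer {idx}: {out.shape}')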
Does this clarify my problem?