Beginner questions about using PyTorch's AlexNet model from torch.hub

Here is sample code that uses PyTorch’s built-in AlexNet model as a classifier for the CIFAR-10 dataset (10 classes):

model               = torch.hub.load("pytorch/vision", "alexnet",
                                     weights = "DEFAULT")
model.classifier[1] = torch.nn.Linear(9216,4096)
model.classifier[4] = torch.nn.Linear(4096,1024)
model.classifier[6] = torch.nn.Linear(1024,10)

Why were these three Linear layers added? My understanding is that AlexNet
ended with two 4096 size linear layers and a 1000 size linear layer.

Also, are these Linear layers appended to AlexNet, or do they modify the existing AlexNet layers?



I guess the author wanted to reinitialize the first linear layer, since the replacement uses the same config (9216 -> 4096), and then shrink the feature space (4096 -> 1024) down to only 10 outputs for the CIFAR-10 dataset. You might want to ask the author of this code about their intention, though.

The existing layers are replaced in place (nothing is appended). The original .classifier module is defined as:

model = torch.hub.load("pytorch/vision", "alexnet", weights="DEFAULT")
print(model.classifier)

# Sequential(
#   (0): Dropout(p=0.5, inplace=False)
#   (1): Linear(in_features=9216, out_features=4096, bias=True)
#   (2): ReLU(inplace=True)
#   (3): Dropout(p=0.5, inplace=False)
#   (4): Linear(in_features=4096, out_features=4096, bias=True)
#   (5): ReLU(inplace=True)
#   (6): Linear(in_features=4096, out_features=1000, bias=True)
# )


Thanks! If someone wanted to be less ambitious and just append one new layer
to knock down the number of classes, would they replace those 3 Linear layer lines
with just this one modification?

model.classifier[7] = torch.nn.Linear(1000,10)

From your print(model.classifier) output, it looks like this would be a clean way to just
tweak the number of AlexNet outputs without messing with AlexNet’s architecture, no?



No, this won’t work: model.classifier only has indices 0 to 6, so assigning to index 7 raises an indexing error.
You could append layers to model.classifier instead.
However, note that appending a new classification layer after the original one is not a common approach. You would rather replace the last classification layer with your own one.
The reason is that the new layer is then trained on the 4096 features for your use case, instead of on the logits for the 1000 original classes.

Thanks a lot. That makes sense.