How do I use a custom model for classification?

I used MMdnn to convert the Places205 GoogLeNet from Caffe to PyTorch, and upon testing I'm not sure how to get it to perform classification correctly.

The model file can be found here: https://github.com/ProGamerGov/pytorch-places205, and it’s approximately 40.9 MB. The last few layers are:

        inception_5b_5x5 = self.inception_5b_5x5(inception_5b_5x5_pad)
        inception_5b_relu_3x3 = F.relu(inception_5b_3x3)
        inception_5b_relu_5x5 = F.relu(inception_5b_5x5)
        inception_5b_output = torch.cat((inception_5b_relu_1x1, inception_5b_relu_3x3, inception_5b_relu_5x5, inception_5b_relu_pool_proj), 1)
        pool5_7x7_s1    = F.avg_pool2d(inception_5b_output, kernel_size=(7, 7), stride=(1, 1), padding=(0,), ceil_mode=False, count_include_pad=False)
        pool5_drop_7x7_s1 = F.dropout(input = pool5_7x7_s1, p = 0.4000000059604645, training = self.training, inplace = True)
        return pool5_drop_7x7_s1

I'm not quite sure how to get an output value that can be used to determine which of the 205 categories the input belongs to.
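For reference, in the original Caffe prototxt the layers after the dropout are a 1024 → 205 inner product (`loss3/classifier`) followed by a softmax, and that head appears to be missing from the converted model, which returns `pool5_drop_7x7_s1` directly. A minimal sketch of what the head would look like in PyTorch, using a random tensor in place of the real feature map and randomly initialized FC weights (the real weights would need to be extracted from the Caffe model):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the pool5_drop_7x7_s1 output (shape from the 512x429 example).
features = torch.randn(1, 1024, 10, 7)

# Collapse the spatial dimensions the way the original 7x7 average pool
# does for a 224x224 input, then classify.
pooled = F.adaptive_avg_pool2d(features, (1, 1))  # -> (1, 1024, 1, 1)
flat = torch.flatten(pooled, 1)                   # -> (1, 1024)

# Randomly initialized here; the real weights live in the Caffe model's
# loss3/classifier layer and would need to be converted as well.
fc = nn.Linear(1024, 205)

logits = fc(flat)
probs = F.softmax(logits, dim=1)  # (1, 205) class probabilities
pred = probs.argmax(dim=1)        # index of the most likely category
```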

Some example input sizes:

---------------------------------------------
torch.Size([1, 3, 512, 429]):
---------------------------------------------
inception_5b_5x5 torch.Size([1, 128, 16, 13])
inception_5b_relu_3x3 torch.Size([1, 384, 16, 13])
inception_5b_relu_5x5 torch.Size([1, 128, 16, 13])
inception_5b_output torch.Size([1, 1024, 16, 13])
pool5_7x7_s1 torch.Size([1, 1024, 10, 7])
pool5_drop_7x7_s1 torch.Size([1, 1024, 10, 7])

---------------------------------------------
torch.Size([1, 3, 720, 603]):
---------------------------------------------
inception_5b_5x5 torch.Size([1, 128, 22, 18])
inception_5b_relu_3x3 torch.Size([1, 384, 22, 18])
inception_5b_relu_5x5 torch.Size([1, 128, 22, 18])
inception_5b_output torch.Size([1, 1024, 22, 18])
pool5_7x7_s1 torch.Size([1, 1024, 16, 12])
pool5_drop_7x7_s1 torch.Size([1, 1024, 16, 12])

---------------------------------------------
torch.Size([1, 3, 1280, 1072]):
---------------------------------------------
inception_5b_5x5 torch.Size([1, 128, 40, 33])
inception_5b_relu_3x3 torch.Size([1, 384, 40, 33])
inception_5b_relu_5x5 torch.Size([1, 128, 40, 33])
inception_5b_output torch.Size([1, 1024, 40, 33])
pool5_7x7_s1 torch.Size([1, 1024, 34, 27])
pool5_drop_7x7_s1 torch.Size([1, 1024, 34, 27])
---------------------------------------------
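The varying `pool5` shapes follow from GoogLeNet's total downsampling factor of 32 before `inception_5b`, plus the valid 7×7 average pool with stride 1; only a 224×224 input (the size the Caffe model was trained at) collapses to 1×1. A quick sanity check that reproduces the shapes above:

```python
# GoogLeNet downsamples by 32 before inception_5b; pool5 is a valid
# 7x7 average pool with stride 1, so output dim = floor(dim / 32) - 7 + 1.
sizes = {}
for h, w in [(224, 224), (512, 429), (720, 603), (1280, 1072)]:
    fh, fw = h // 32, w // 32          # inception_5b_output spatial size
    sizes[(h, w)] = (fh - 6, fw - 6)   # pool5_7x7_s1 spatial size
    print((h, w), "->", sizes[(h, w)])
```

The printed values match the logged shapes for all three input sizes, and only 224×224 yields a 1×1 feature, which is what the fixed-size classifier head expects.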

Any help would be appreciated!

This seems to work, but I have no idea whether it actually does, or if I'm doing something wrong:

import torch
import torch.nn as nn

cnn = GoogLeNetPlaces205()
cnn.load_state_dict(torch.load("googlenet_places205.pth"))
cnn.eval()

input = torch.randn(1, 3, 512, 384)
x = cnn(input)  # (1, 1024, H, W) feature map from pool5_drop_7x7_s1

test_avgpool = nn.AdaptiveAvgPool2d((7, 7))
x = test_avgpool(x)
x = torch.flatten(x, 1)
test_fc = nn.Linear(1024 * 7 * 7, 205)  # 205 Places205 categories
x = test_fc(x)
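Once real classifier weights are in place, reading off predictions is just a top-k over the softmax output. A sketch using hypothetical logits in place of the classifier output (mapping indices to category names would additionally require the Places205 label list, which isn't loaded here):

```python
import torch

# Hypothetical logits standing in for the classifier output.
logits = torch.randn(1, 205)
probs = torch.softmax(logits, dim=1)

# Top-5 most likely of the 205 Places205 categories.
top5 = torch.topk(probs, k=5, dim=1)
for p, idx in zip(top5.values[0], top5.indices[0]):
    print(f"class {idx.item():3d}: {p.item():.4f}")
```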

The model does work with DeepDream, so at least it's put together correctly: