Validation accuracy stuck at a fixed number

I have implemented the same network in Keras, and I expect to get exactly the same results in PyTorch, but during training I see no progress. I suspect that my layers are not connected the way I expect them to be. When I check the layers' shapes by looping with for layer in final.children(), I get only one printed result, which is [(1,1)].

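import torch
import torch.nn as nn
import torch.nn.functional as F

class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = nn.Conv2d(256, 1, 1)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(14*14, 1)

    def forward(self, x):
        output1 = F.sigmoid(self.conv1(x))
        # flatten (N, 1, 14, 14) to (N, 196) so it matches fc1's in_features
        output2 = F.sigmoid(self.fc1(self.flatten(output1)))
        return output2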

class Final(nn.Module):
    def __init__(self):
        super(Final, self).__init__()
        pretrained_model = torch.hub.load('pytorch/vision:v0.6.0', 'densenet201', pretrained=True)
        model = MyModel()
        list_of_layers = list(pretrained_model.features)[:8]
        list_of_layers.extend(list(model.children()))
        self.final_model = nn.Sequential(*list_of_layers)

    def forward(self, x):
        outputs = list()
        for ii, model in enumerate(self.final_model):
            x = model(x)
            if ii == 8 or ii == 10:
                outputs.append(x)
        return outputs

list(model.children()) contains only the child modules, i.e. [Conv2d(256, 1, kernel_size=(1, 1), stride=(1, 1)), Flatten(), Linear(in_features=196, out_features=1, bias=True)], and misses the functional calls that are defined in the forward method (F.sigmoid).
Wrapping the children in an nn.Sequential container only works if the model just calls its nn.Modules sequentially and does nothing else in forward.

For your use case, you could probably append model itself to final_model instead of its children.
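A minimal sketch of the difference between the two approaches. The backbone layer below is a hypothetical stand-in for the pretrained DenseNet features, used only so the example runs without downloading weights, and MyModel is restated with flatten applied in forward:

```python
import torch
import torch.nn as nn


class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(256, 1, 1)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(14 * 14, 1)

    def forward(self, x):
        x = torch.sigmoid(self.conv1(x))
        x = self.flatten(x)
        return torch.sigmoid(self.fc1(x))


torch.manual_seed(0)
head = MyModel()

# Hypothetical stand-in for the pretrained feature extractor.
backbone = nn.Conv2d(3, 256, kernel_size=1)

# Unpacking the children drops everything defined in MyModel.forward,
# so neither sigmoid is applied.
from_children = nn.Sequential(backbone, *head.children())

# Appending the module itself keeps its forward method intact.
from_module = nn.Sequential(backbone, head)

x = torch.randn(4, 3, 14, 14)
with torch.no_grad():
    out_children = from_children(x)
    out_module = from_module(x)

print(list(head.children()))  # three modules, no sigmoid among them
print(out_children.shape, out_module.shape)
```

Both containers produce the same output shape, so the missing activations do not raise an error. Only the values differ: the second version is squashed by the sigmoid, the first is not.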

Thanks for your reply, that clears something up for me. But how can I get the output shape of every layer to make sure the model works properly?

The output shape of some layers depends on the input shape (e.g. for convolutions) and the output shape might be unknown before passing the real inputs.
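As a rough illustration of that dependence, the spatial output size of a convolution can be computed with the formula given in the nn.Conv2d documentation. The helper name conv2d_output_size and the example layer settings (7x7 kernel, stride 2, padding 3) are assumptions chosen for illustration:

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# The same layer gives different output sizes for different input sizes.
for size in (224, 256):
    print(size, "->", conv2d_output_size(size, kernel_size=7, stride=2, padding=3))
# 224 -> 112
# 256 -> 128
```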

For the sake of debugging, you could add print statements to your forward method and check the shape of each activation. Alternatively, you could manually calculate the output shapes if the input shapes are known in advance, but the print approach is often simpler.
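If editing forward is inconvenient, forward hooks are one alternative to print statements (a swapped-in technique, not what was suggested above). The sketch below assumes a small nn.Sequential model that mirrors the head from the question, and a hypothetical record_shape helper:

```python
import torch
import torch.nn as nn

# Assumed example model: the head from the question, written with modules only.
model = nn.Sequential(
    nn.Conv2d(256, 1, kernel_size=1),
    nn.Sigmoid(),
    nn.Flatten(),
    nn.Linear(14 * 14, 1),
    nn.Sigmoid(),
)

shapes = []


def record_shape(module, inputs, output):
    # Called after each layer's forward pass; stores the layer name and output shape.
    shapes.append((module.__class__.__name__, tuple(output.shape)))


# Register the hook on every layer of the Sequential container.
handles = [layer.register_forward_hook(record_shape) for layer in model]

with torch.no_grad():
    model(torch.randn(2, 256, 14, 14))

# Remove the hooks so they do not fire during training.
for handle in handles:
    handle.remove()

for name, shape in shapes:
    print(name, shape)
```

Hooks only see nn.Module calls, so functional calls such as F.sigmoid inside a forward method will not show up in the list. For those, a print statement in forward is still the simplest check.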

Thanks for your response.