Output of Sigmoid: last layer of CNN

This is a multi-class supervised classification problem.
I’m using the BCELoss() loss function with a Sigmoid on the last layer.

Question:

Why do I get “one-hot” vectors at inference time (when I load the weights) instead of probabilities between 0 and 1 for every entry in every column vector?

Model:

import torch.nn as nn

# Input x has shape (1, 257, 257). QUANTA and args.num_params are
# defined elsewhere; the shape comments suggest 16 and 4.

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2),
            nn.ReLU(),                                     # (16, 128, 128)
            nn.Conv2d(16, 32, kernel_size=2, stride=2),
            nn.ReLU(),                                     # (32, 64, 64)
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),                                     # (32, 32, 32)
            nn.Conv2d(32, 64, kernel_size=2, stride=2),
            nn.ReLU(),                                     # (64, 16, 16)
            nn.Conv2d(64, 128, kernel_size=2, stride=2),
            nn.ReLU(),                                     # (128, 8, 8)
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU()                                      # (128, 4, 4)
        )

        self.classifier = nn.Sequential(
            nn.Linear(128 * 4 * 4, 120),
            nn.ReLU(),                                     # (N, 120)
            nn.Linear(120, 64),
            nn.Sigmoid()                                   # (N, 64), values in (0, 1)
        )

    def forward(self, x):
        out = self.layer1(x)                               # (N, 128, 4, 4)
        out = out.view(-1, 128 * 4 * 4)                    # flatten to (N, 2048)
        out = self.classifier(out)                         # (N, 64)
        return out.view(-1, QUANTA, args.num_params)       # (N, 16, 4)
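
As a quick sanity check (a sketch of my own, assuming QUANTA = 16 and args.num_params = 4, which the (16, 4) shape comments suggest), even a freshly initialized model should return values strictly between 0 and 1 — a sigmoid only reaches exactly 0 or 1 when its input saturates:

import torch
from types import SimpleNamespace

QUANTA = 16                              # assumed from the (16, 4) comments
args = SimpleNamespace(num_params=4)     # stand-in for the real args object

model = SimpleCNN()
x = torch.randn(1, 1, 257, 257)          # dummy batch of one (1, 257, 257) image
with torch.no_grad():
    out = model(x)                       # shape (1, 16, 4)
print(out.shape, out.min().item(), out.max().item())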

How I load weights:

model = SimpleCNN()
model.load_state_dict(torch.load(PATH))
model.eval()
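
One thing worth checking before concluding the outputs are one-hot (my suggestion, not from the original post): values printed as 0. and 1. may simply be saturated probabilities such as 0.9999 displayed at low print precision. Printing the raw tensor and its min/max at higher precision makes this visible:

import torch

torch.set_printoptions(precision=8)      # show more decimals when printing
with torch.no_grad():
    out = model(x)                       # x: any test input of shape (N, 1, 257, 257)
print(out)                               # raw sigmoid outputs, before any thresholding
print(out.min().item(), out.max().item())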

Predicted classes:

[[[1. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 1. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 1. 0. 1.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 1. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]

What were your ground-truth labels?

The input is the STFT (an image) and the ground truth is its classes:
X: (1, 257, 257); y: (16, 4), “one-hot”
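
For context, here is a minimal sketch of the training step this implies (my reconstruction, not the original training code): BCELoss is applied element-wise, so the (N, 16, 4) sigmoid output is compared directly against (N, 16, 4) one-hot targets:

import torch
import torch.nn as nn

criterion = nn.BCELoss()

x = torch.randn(8, 1, 257, 257)          # dummy batch of STFT images
y = torch.zeros(8, 16, 4)                # one-hot targets, (16, 4) per sample
y[:, :, 0] = 1.0                         # dummy labels, for illustration only

out = model(x)                           # (8, 16, 4), values in (0, 1)
loss = criterion(out, y)                 # element-wise binary cross-entropy
loss.backward()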

Let me get this straight: if you use the model after training, before saving the weights, you get values between 0 and 1, but if you load the weights, you get deterministic 0s and 1s?

Exactly, and I can’t figure out why.
Any ideas? I thought about changing the loss function, or that it might be related to the model.eval() / model.train() calls.

Can you post the code you use to save the model and load it?

Remember that you must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results. Source.
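
For completeness, the usual toggling looks like this (a sketch; note that the SimpleCNN posted above contains no dropout or batch-norm layers, so model.eval() by itself should not change its outputs here, though it is still good practice):

model.train()                # training mode: dropout active, batch norm uses batch stats
# ... run training loop ...

model.eval()                 # eval mode: dropout off, batch norm uses running stats
with torch.no_grad():        # also disable autograd bookkeeping at inference
    out = model(test_x)      # test_x: a hypothetical held-out input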


Sure!

Here is the function I use to save the model / weights.

import os
import torch

def save_models(epoch):
    # Remove the last checkpoint (RUNS_PATH, exp_name, and model are globals)
    path = os.path.join(RUNS_PATH, exp_name)
    for file in os.listdir(path):
        if file.endswith('.model'):
            os.remove(os.path.join(path, file))

    # Save new checkpoint
    torch.save(model.state_dict(),
               os.path.join(path, "CNN_model_epoch_{}.model".format(epoch)))
    print("Checkpoint saved")

Note that I already showed how I load the model / weights.
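
One more experiment that might pin this down (my suggestion): run a save/load round trip on the same input and compare the raw outputs. If the state dict round-trips correctly, the two outputs must match:

import torch

x = torch.randn(1, 1, 257, 257)          # fixed test input

model.eval()
with torch.no_grad():
    before = model(x)                    # output right after training

torch.save(model.state_dict(), "roundtrip_check.model")   # hypothetical path

model2 = SimpleCNN()
model2.load_state_dict(torch.load("roundtrip_check.model"))
model2.eval()
with torch.no_grad():
    after = model2(x)                    # output from reloaded weights

print(torch.allclose(before, after))     # expected: True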

Thanks,

Or
