Output of Sigmoid: last layer of CNN

This is a multi-class supervised classification problem.
I’m using the BCELoss() loss function with a Sigmoid on the last layer.

Question:

Why do I get “one-hot” vectors at inference time (when I load the weights) instead of probabilities between 0 and 1 for every entry in every column vector?

Model:

import torch.nn as nn

# Input x has shape (1, 257, 257). QUANTA and args.num_params are
# defined elsewhere; the shape comments suggest 16 and 4.

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2),
            nn.ReLU(),                                     # (16, 128, 128)
            nn.Conv2d(16, 32, kernel_size=2, stride=2),
            nn.ReLU(),                                     # (32, 64, 64)
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU(),                                     # (32, 32, 32)
            nn.Conv2d(32, 64, kernel_size=2, stride=2),
            nn.ReLU(),                                     # (64, 16, 16)
            nn.Conv2d(64, 128, kernel_size=2, stride=2),
            nn.ReLU(),                                     # (128, 8, 8)
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.ReLU()                                      # (128, 4, 4)
        )

        self.classifier = nn.Sequential(
            nn.Linear(128 * 4 * 4, 120),
            nn.ReLU(),                                     # (N, 120)
            nn.Linear(120, 64),
            nn.Sigmoid()                                   # (N, 64), values in (0, 1)
        )

    def forward(self, x):
        out = self.layer1(x)                               # (N, 128, 4, 4)
        out = out.view(-1, 128 * 4 * 4)                    # flatten to (N, 2048)
        out = self.classifier(out)                         # (N, 64)
        return out.view(-1, QUANTA, args.num_params)       # (N, 16, 4)
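
As a quick sanity check (a sketch of my own, assuming QUANTA = 16 and args.num_params = 4, which the (16, 4) shape comments suggest), even a freshly initialized model should return values strictly between 0 and 1 — a sigmoid only reaches exactly 0 or 1 when its input saturates:

import torch
from types import SimpleNamespace

QUANTA = 16                              # assumed from the (16, 4) comments
args = SimpleNamespace(num_params=4)     # stand-in for the real args object

model = SimpleCNN()
x = torch.randn(1, 1, 257, 257)          # dummy batch of one (1, 257, 257) image
with torch.no_grad():
    out = model(x)                       # shape (1, 16, 4)
print(out.shape, out.min().item(), out.max().item())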

How I load weights:

model = SimpleCNN()
model.load_state_dict(torch.load(PATH))
model.eval()
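
One thing worth checking before concluding the outputs are one-hot (my suggestion, not from the original post): values printed as 0. and 1. may simply be saturated probabilities such as 0.9999 displayed at low print precision. Printing the raw tensor and its min/max at higher precision makes this visible:

import torch

torch.set_printoptions(precision=8)      # show more decimals when printing
with torch.no_grad():
    out = model(x)                       # x: any test input of shape (N, 1, 257, 257)
print(out)                               # raw sigmoid outputs, before any thresholding
print(out.min().item(), out.max().item())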

Predicted classes:

[[[1. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 1. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 1. 0. 1.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 1. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]

What were your ground-truth labels?

The input is the STFT (an image) and the ground truth is its classes:
X: (1, 257, 257); y: (16, 4), “one-hot”
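
For context, here is a minimal sketch of the training step this implies (my reconstruction, not the original training code): BCELoss is applied element-wise, so the (N, 16, 4) sigmoid output is compared directly against (N, 16, 4) one-hot targets:

import torch
import torch.nn as nn

criterion = nn.BCELoss()

x = torch.randn(8, 1, 257, 257)          # dummy batch of STFT images
y = torch.zeros(8, 16, 4)                # one-hot targets, (16, 4) per sample
y[:, :, 0] = 1.0                         # dummy labels, for illustration only

out = model(x)                           # (8, 16, 4), values in (0, 1)
loss = criterion(out, y)                 # element-wise binary cross-entropy
loss.backward()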

Let me get this straight: if you use the model after training, before saving the weights, you get values between 0 and 1, but if you load the weights, you get deterministic 0s and 1s?

Exactly, and I can’t figure out why.
Any ideas? I thought about changing the loss function, or that it might be related to the model.eval() / model.train() calls.

Can you post the code you use to save the model and load it?

Remember that you must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results. Source.
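
For completeness, the usual toggling looks like this (a sketch; note that the SimpleCNN posted above contains no dropout or batch-norm layers, so model.eval() by itself should not change its outputs here, though it is still good practice):

model.train()                # training mode: dropout active, batch norm uses batch stats
# ... run training loop ...

model.eval()                 # eval mode: dropout off, batch norm uses running stats
with torch.no_grad():        # also disable autograd bookkeeping at inference
    out = model(test_x)      # test_x: a hypothetical held-out input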


Sure!

Here is the function I use to save the model / weights.

import os
import torch

def save_models(epoch):
    # Remove the last checkpoint (RUNS_PATH, exp_name, and model are globals)
    path = os.path.join(RUNS_PATH, exp_name)
    for file in os.listdir(path):
        if file.endswith('.model'):
            os.remove(os.path.join(path, file))

    # Save new checkpoint
    torch.save(model.state_dict(),
               os.path.join(path, "CNN_model_epoch_{}.model".format(epoch)))
    print("Checkpoint saved")

Note that I already showed how I load the model / weights.
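
One more experiment that might pin this down (my suggestion): run a save/load round trip on the same input and compare the raw outputs. If the state dict round-trips correctly, the two outputs must match:

import torch

x = torch.randn(1, 1, 257, 257)          # fixed test input

model.eval()
with torch.no_grad():
    before = model(x)                    # output right after training

torch.save(model.state_dict(), "roundtrip_check.model")   # hypothetical path

model2 = SimpleCNN()
model2.load_state_dict(torch.load("roundtrip_check.model"))
model2.eval()
with torch.no_grad():
    after = model2(x)                    # output from reloaded weights

print(torch.allclose(before, after))     # expected: True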

Thanks,

Or
