Output of nn.Sigmoid() layer is not between 0 and 1

shrbrh · January 12, 2023, 6:30am

I am training a simple GAN model where the discriminator has to classify between real and generated images. The discriminator model looks like this:

import numpy as np
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, image_size=128, conv_dim=64, c_dim=5, repeat_num=6):
        super().__init__()
        # Shared trunk: each stride-2 conv halves the spatial size.
        layers = []
        layers.append(nn.Conv2d(3, conv_dim, kernel_size=4, stride=2, padding=1))
        layers.append(nn.LeakyReLU(0.01))

        curr_dim = conv_dim
        for i in range(1, repeat_num):
            layers.append(nn.Conv2d(curr_dim, curr_dim*2, kernel_size=4, stride=2, padding=1))
            layers.append(nn.LeakyReLU(0.01))
            curr_dim = curr_dim * 2

        # Spatial size left after repeat_num halvings (128 / 2**6 = 2 by default).
        kernel_size = int(image_size / np.power(2, repeat_num))

        self.main = nn.Sequential(*layers)
        # Real/fake head.
        self.conv1 = nn.Conv2d(curr_dim, 1, kernel_size=3, stride=1, padding=1, bias=False)
        self.sig = nn.Sigmoid()
        # Class head; dim=1 makes the softmax axis (the class channel) explicit.
        self.conv2 = nn.Conv2d(curr_dim, c_dim, kernel_size=kernel_size, bias=False)
        self.soft = nn.Softmax(dim=1)

The Sigmoid layer should have a single output indicating whether an image is real or fake. The Softmax layer should output the probabilities for each of the 5 classes the fake image might belong to.
I am getting values ranging from 258.514 to 3.999 as an output for the Sigmoid layer.
Shouldn’t this layer constrain the values between 0 and 1? Am I using the layers in the right manner?
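
As a quick sanity check on the question itself, here is a minimal sketch of the logistic function in plain Python, evaluated on the values reported above. The helper name sigmoid is only for illustration and is not the PyTorch implementation, but nn.Sigmoid computes the same function elementwise.

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow in exp for large negative x
    return z / (1.0 + z)

# Values reported in the question, plus their negatives and zero.
samples = [258.514, 52.6691, 3.999, 0.0, -3.999, -258.514]
outputs = [sigmoid(v) for v in samples]

for v, s in zip(samples, outputs):
    print(f"sigmoid({v}) = {s}")
```

Every result lies in the closed interval from 0 to 1 (large inputs saturate to 1.0 in floating point), so a printed value such as 52.67 or 258.5 cannot have come out of a sigmoid. It is more likely the tensor before the activation, or a different tensor altogether.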

ptrblck · January 12, 2023, 6:35am

Could you post a minimal and executable code snippet showing this unexpected behavior, please?

ptrblck · January 12, 2023, 8:15am

Your code is unfortunately still not executable and I cannot reproduce the issue using:

model = Discriminator()
x = torch.randn(2, 3, 128, 128)
out = model(x)
print(out[0].min(), out[0].max())
# tensor(0.4997, grad_fn=<MinBackward1>) tensor(0.5007, grad_fn=<MaxBackward1>)

shrbrh · January 12, 2023, 10:43am

When I try to reproduce the issue by randomly initializing a tensor I get values between 0 and 1.

tensor(0.4993, grad_fn=<MinBackward1>) tensor(0.5003, grad_fn=<MaxBackward1>)

But when I run the code with actual images, this is what I get:

tensor([[[[52.6691, 52.6701, 52.6636, 52.6650, 52.6738]

Could the issue be with the images then?

ptrblck · January 12, 2023, 5:05pm

I doubt it, but would need an executable code snippet to be able to reproduce and debug the issue.
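
One way to narrow this down without the full training script might be to record what actually enters and leaves the Sigmoid module with a forward hook. The sketch below uses a small stand-in model rather than the Discriminator from the question, and the names observed and record_range are made up for illustration. The input is deliberately scaled to mimic unnormalized pixel values.

```python
import torch
import torch.nn as nn

# Stand-in model: a conv followed by a Sigmoid, loosely mirroring conv1 -> sig.
model = nn.Sequential(
    nn.Conv2d(3, 1, kernel_size=3, stride=1, padding=1, bias=False),
    nn.Sigmoid(),
)

observed = {}

def record_range(module, inputs, output):
    # Runs after each forward pass of the hooked module.
    # `inputs` is a tuple of the positional arguments passed to the module.
    observed["in_min"] = inputs[0].min().item()
    observed["in_max"] = inputs[0].max().item()
    observed["out_min"] = output.min().item()
    observed["out_max"] = output.max().item()

handle = model[1].register_forward_hook(record_range)

# Large-magnitude input, e.g. images left in the [0, 255] range.
x = torch.rand(2, 3, 128, 128) * 255.0
with torch.no_grad():
    model(x)

handle.remove()  # detach the hook once the check is done
print(observed)
```

If the hook reports outputs inside the expected range while the training loop still prints values like 52.67, the printed tensor is probably not the one produced by the Sigmoid module.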

J_Johnson · January 13, 2023, 2:20am

Perhaps your second Conv2d after the Sigmoid is getting kernels outside of 0 and 1 during training. Maybe try moving your Sigmoid after that layer and see what happens.

shrbrh · January 13, 2023, 6:09am

Thanks for your suggestion.
But I am feeding the output of self.conv1 into the Sigmoid layer and the output of self.conv2 into the Softmax layer. Will the second Conv2d still affect the output of the Sigmoid layer?
Anyway, I made the change and tried again, but it's still not working as expected.

J_Johnson · January 13, 2023, 9:00am

On closer inspection, I see what you’re doing in the forward pass.

Perhaps try cloning h and assigning the copy to another variable (e.g. r = h.clone()) before sending it through further layers. Then use r in your other branch.

It’s hard to say exactly what the issue is without seeing your training process.
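
For reference, a two-branch forward pass of the kind discussed here could look like the sketch below. The forward method from the original post is not shown in this thread, so this is an assumption, not the poster's code: the class TwoHeadSketch, its reduced layer sizes, and the names out_src and out_cls are all hypothetical.

```python
import torch
import torch.nn as nn

class TwoHeadSketch(nn.Module):
    """Hypothetical, reduced stand-in for the Discriminator in the question."""

    def __init__(self, in_dim=8, c_dim=5, spatial=4):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(3, in_dim, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.01),
        )
        self.conv1 = nn.Conv2d(in_dim, 1, kernel_size=3, stride=1, padding=1, bias=False)
        self.sig = nn.Sigmoid()
        self.conv2 = nn.Conv2d(in_dim, c_dim, kernel_size=spatial, bias=False)
        self.soft = nn.Softmax(dim=1)

    def forward(self, x):
        h = self.main(x)
        # Both heads read the same h; neither layer modifies it in place.
        out_src = self.sig(self.conv1(h))              # real/fake map in [0, 1]
        out_cls = self.soft(self.conv2(h).flatten(1))  # class probabilities
        return out_src, out_cls

model = TwoHeadSketch()
x = torch.randn(2, 3, 8, 8) * 100.0  # large inputs on purpose
with torch.no_grad():
    out_src, out_cls = model(x)

print(out_src.min().item(), out_src.max().item())
print(out_cls.sum(dim=1))
```

In this arrangement the layers are out-of-place, so the two branches cannot overwrite each other's input and a clone is not strictly required. If a real model does return values outside the range, one thing worth checking is whether forward returns self.conv1(h) instead of the result of self.sig.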