I’m trying to recreate a semi-supervised GAN architecture for MNIST data in PyTorch that was originally implemented in Keras in this blog post.
In the blog post, there are three possibilities outlined for implementing the semi-supervised discriminator. I’m struggling with this one (“Stacked Discriminator Models With Shared Weights”):
I’m redoing the model in PyTorch like this:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


# Discriminator
class Discriminator(nn.Module):
    def __init__(self, n_classes):
        super(Discriminator, self).__init__()
        # number of classes for the classifier
        self.n_classes = n_classes
        # layers the classifier model and discriminator model share
        self.shared_layers = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=128, kernel_size=3, stride=2, padding=15),
            nn.LeakyReLU(negative_slope=0.2),
            nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, stride=2, padding=15),
            nn.LeakyReLU(negative_slope=0.2),
        )
        self.dropout = nn.Dropout(p=0.4)
        # output layer nodes
        self.fc = nn.Linear(128 * 28 * 28, self.n_classes)

    def forward(self, x):
        x = self.shared_layers(x)
        # flatten
        x = x.view(-1, 128 * 28 * 28)
        x = self.dropout(x)
        x = self.fc(x)
        # classifier output
        c_out = F.softmax(x, dim=1)
        # discriminator output
        d_out = self.custom_activation(x)
        return d_out, c_out

    # reuses the classifier output before softmax for the discriminator output
    def custom_activation(self, x):
        logexpsum = torch.sum(torch.exp(x), dim=1)
        result = logexpsum / (logexpsum + 1.0)
        return result
```
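To make sure I understand the custom activation itself, I checked it in isolation on some made-up logits (this snippet is my own sanity check, not from the blog post): for logits x, the discriminator output is D(x) = Z / (Z + 1) with Z = sum(exp(x)), so it always lies in (0, 1) and larger logits push it toward 1.

```python
import torch

# hypothetical logits for a batch of 2 samples, 10 classes
x = torch.tensor([[0.0] * 10,            # all-zero logits: Z = 10
                  [5.0] + [0.0] * 9])    # one confident class: Z = exp(5) + 9

expsum = torch.sum(torch.exp(x), dim=1)  # Z(x) = sum_k exp(l_k)
d_out = expsum / (expsum + 1.0)          # D(x) = Z / (Z + 1), in (0, 1)

print(d_out)  # second entry is closer to 1 than the first
```

The activation itself behaves as described in the blog post, so I don't think that part is the problem.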
However, when training this model, the classifier part performs really badly, with a training accuracy below chance (below 10%). Can anyone give me a hint about what I got wrong?