I have been working on a regression task using a CNN. At the moment I'm testing an AlexNet-like architecture in Keras. After only two epochs, one can already notice some (still weak) correlation between the predictions and the actual target values. With PyTorch, however, I struggle to get any trend in the results, even after ten epochs. I must admit I'm quite new to PyTorch and this is the very first time I'm using it. Please see the Keras implementation below:
x = Conv2D(16, (7, 7), activation='relu', padding='same', strides=2)(input_img)
x = MaxPooling2D((3, 3), padding='same', strides=2)(x)
x = BatchNormalization()(x)
x = Conv2D(32, (5, 5), activation='relu', padding='same', strides=2)(x)
x = MaxPooling2D((3, 3), padding='same', strides=2)(x)
x = BatchNormalization()(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same', strides=2)(x)
x = Conv2D(64, (3, 3), activation='relu', padding='same', strides=2)(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same', strides=2)(x)
x = MaxPooling2D((3, 3), padding='same', strides=2)(x)
x = BatchNormalization()(x)
x = Flatten()(x)
x = Dense(512, activation='tanh')(x)
x = Dropout(0.5)(x)
x = Dense(512, activation='tanh')(x)
x = Dropout(0.5)(x)
x = Dense(9, activation=None)(x)
and the PyTorch one:
class AlexNetTest(nn.Module):
    def __init__(self):
        super(AlexNetTest, self).__init__()
        self.C1 = nn.Conv2d(1, 8, 7, stride=2, padding=3)
        torch.nn.init.xavier_uniform_(self.C1.weight)
        torch.nn.init.constant_(self.C1.bias, 0)
        self.S2 = nn.MaxPool2d(3, stride=2, padding=1)  # first pooling
        self.BC1S2 = nn.BatchNorm2d(8, momentum=0.99, eps=1.0e-3)
        self.C3 = nn.Conv2d(8, 16, 5, stride=2, padding=2)
        torch.nn.init.xavier_uniform_(self.C3.weight)
        torch.nn.init.constant_(self.C3.bias, 0)
        self.S4 = nn.MaxPool2d(3, stride=2, padding=1)  # second pooling
        self.BC3S4 = nn.BatchNorm2d(16, momentum=0.99, eps=1.0e-3)
        self.C5 = nn.Conv2d(16, 32, 3, stride=2, padding=1)
        torch.nn.init.xavier_uniform_(self.C5.weight)
        torch.nn.init.constant_(self.C5.bias, 0)
        self.C6 = nn.Conv2d(32, 32, 3, stride=2, padding=1)
        torch.nn.init.xavier_uniform_(self.C6.weight)
        torch.nn.init.constant_(self.C6.bias, 0)
        self.C7 = nn.Conv2d(32, 16, 3, stride=2, padding=1)
        torch.nn.init.xavier_uniform_(self.C7.weight)
        torch.nn.init.constant_(self.C7.bias, 0)
        self.S8 = nn.MaxPool2d(3, stride=2, padding=1)  # third pooling
        self.BC7S8 = nn.BatchNorm2d(16, momentum=0.99, eps=1.0e-3)
        self.F8 = nn.Linear(16 * 1 * 1, 512)
        torch.nn.init.xavier_uniform_(self.F8.weight)
        torch.nn.init.constant_(self.F8.bias, 0)
        self.F9 = nn.Linear(512, 512)
        torch.nn.init.xavier_uniform_(self.F9.weight)
        torch.nn.init.constant_(self.F9.bias, 0)
        self.Out = nn.Linear(512, 9)
        torch.nn.init.xavier_uniform_(self.Out.weight)
        torch.nn.init.constant_(self.Out.bias, 0)

    def forward(self, x):
        x = F.relu(self.BC1S2(self.C1(x)))
        x = self.S2(x)
        x = F.relu(self.BC3S4(self.C3(x)))
        x = self.S4(x)
        x = F.relu(self.C5(x))
        x = F.relu(self.C6(x))
        x = F.relu(self.BC7S8(self.C7(x)))
        x = self.S8(x)
        x = x.view(-1, 16 * 1 * 1)
        x = F.dropout(torch.tanh(self.F8(x)), p=0.5, training=True)
        x = F.dropout(torch.tanh(self.F9(x)), p=0.5, training=True)
        x = self.Out(x)
        return x
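In case it helps anyone check the dimensions: the `16 * 1 * 1` fed to the first `Linear` layer comes from repeatedly applying the usual output-size formula through the eight conv/pool layers. A quick pure-Python sanity check, assuming square 128x128 single-channel inputs (an illustrative assumption; other input sizes would give a different flatten size):

```python
def conv_out(n, k, s, p):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

# (kernel, stride, padding) for each layer above, in forward order:
# C1, S2, C3, S4, C5, C6, C7, S8
layers = [(7, 2, 3), (3, 2, 1), (5, 2, 2), (3, 2, 1),
          (3, 2, 1), (3, 2, 1), (3, 2, 1), (3, 2, 1)]

n = 128  # assumed input height/width
for k, s, p in layers:
    n = conv_out(n, k, s, p)
print(n)  # 1, i.e. the feature map is 16 x 1 x 1 before the flatten
```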
In both cases I use the same optimizer: RMSprop(lr=0.001) in Keras and torch.optim.RMSprop(net.parameters(), lr=0.001, alpha=0.9) in PyTorch. I wonder if you can help me spot the issue in my PyTorch implementation.
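For context, my understanding is that both frameworks implement essentially the same RMSprop update (modulo where epsilon enters the denominator, which differs between implementations). A scalar sketch of that rule, with lr and alpha as above:

```python
from math import sqrt

def rmsprop_step(w, g, v, lr=0.001, alpha=0.9, eps=1e-8):
    """One scalar RMSprop step: v tracks a running average of squared
    gradients, and the step is the gradient scaled by 1/sqrt(v).
    Sketch only -- exact eps placement varies by framework."""
    v = alpha * v + (1 - alpha) * g * g
    w = w - lr * g / (sqrt(v) + eps)
    return w, v

w, v = rmsprop_step(w=1.0, g=2.0, v=0.0)
print(round(v, 3))  # 0.4
```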
Thanks a lot in advance.
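EDIT: to clarify what I mean by "correlation" above -- I'm simply looking at the Pearson coefficient between the flattened predictions and targets, along these lines (toy sketch, not my actual evaluation code):

```python
from math import sqrt

def pearson_r(preds, targets):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(preds)
    mp, mt = sum(preds) / n, sum(targets) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(preds, targets))
    sp = sqrt(sum((p - mp) ** 2 for p in preds))
    st = sqrt(sum((t - mt) ** 2 for t in targets))
    return cov / (sp * st)

print(pearson_r([1.0, 2.0, 3.0], [2.1, 3.9, 6.0]))  # close to 1 for a good fit
```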