I am trying to replicate in PyTorch the behavior of a simple FFNN created with Keras.
This is the Keras version:
initializer = keras.initializers.random_uniform(seed=1)
model = Sequential([
    Dense(512, activation="relu", input_shape=input_shape, kernel_initializer=initializer),
    Dense(512, activation="relu", kernel_initializer=initializer),
    Dense(num_output, activation="softmax")
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.01),
    loss="categorical_crossentropy",
    # TODO: change metric
    metrics=[keras.metrics.AUC()]
)
and the model is trained as follows:
self.model.fit(self.X, self.y, batch_size=batch_size, epochs=epochs, verbose=verbose)
Now, my PyTorch implementation (it's my first PyTorch NN) is the following:
class FFNN(nn.Module):
    def __init__(self, input_shape, num_classes):
        super(FFNN, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(input_shape[0], 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, num_classes),
            nn.Softmax(dim=1)
        )
        self.net.apply(self.init_weights)

    def forward(self, X):
        return self.net(X)

    def init_weights(self, m):
        if type(m) == nn.Linear:
            m.weight.data.uniform_(-0.05, 0.05)
and it is trained as follows:
y_idx = get_num_from_1hot(y)

# convert to tensor
X_tr = Variable(torch.from_numpy(X).float(), requires_grad=False)
y_tr = Variable(torch.from_numpy(y_idx), requires_grad=False)

# Loss and Optimizer
optimizer = torch.optim.Adam(self.model.parameters(), lr=0.01)
loss_func = torch.nn.CrossEntropyLoss()

for i in range(epochs):
    y_pred = self.model(X_tr)
    loss = loss_func(y_pred, y_tr)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
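The helper get_num_from_1hot is not shown in the training code. A minimal sketch of what it is assumed to do follows, on the assumption that y is a 2-D one-hot NumPy array of shape (n_samples, num_classes). The function body and the example array are illustrative, not the original code.

```python
import numpy as np


def get_num_from_1hot(y):
    """Convert one-hot rows to integer class indices (assumed behavior)."""
    y = np.asarray(y)
    # The position of the largest entry in each row is the class index.
    # The cast to int64 is explicit because torch.nn.CrossEntropyLoss
    # expects class-index targets as a LongTensor.
    return np.argmax(y, axis=1).astype(np.int64)


# Hypothetical example: three samples, three classes.
y_example = np.array([[0, 0, 1],
                      [1, 0, 0],
                      [0, 1, 0]])
labels = get_num_from_1hot(y_example)
```

With this version, torch.from_numpy(y_idx) yields a 1-D tensor of class indices, which is the target format the training loop passes to loss_func.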
Unfortunately, the PyTorch version seems to get stuck during training: the test accuracy stays at one specific value for many epochs, sometimes until the end of training.
Can you spot any problem with my implementation?