I am trying to replicate in PyTorch the behavior of a simple FFNN created with Keras.

This is the Keras version:

```
import keras
from keras.models import Sequential
from keras.layers import Dense

initializer = keras.initializers.random_uniform(seed=1)

model = Sequential([
    Dense(512, activation="relu", input_shape=input_shape, kernel_initializer=initializer),
    Dense(512, activation="relu", kernel_initializer=initializer),
    Dense(num_output, activation="softmax")
])
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.01),
    loss="categorical_crossentropy",
    # TODO: change metric
    metrics=[keras.metrics.AUC()]
)
```
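
For what it's worth, `random_uniform` with no explicit bounds samples from `[-0.05, 0.05]` by default, which is the range I try to reproduce in the PyTorch init below; spelled out explicitly it would be:

```
# same as above, with Keras' default bounds written out explicitly
initializer = keras.initializers.random_uniform(minval=-0.05, maxval=0.05, seed=1)
```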

and the model is trained as follows:

```
self.model.fit(self.X, self.y, batch_size=batch_size, epochs=epochs, verbose=verbose)
```

Now, my PyTorch implementation (it's my first PyTorch NN) is the following:

```
import torch.nn as nn


class FFNN(nn.Module):
    def __init__(self, input_shape, num_classes):
        super(FFNN, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(input_shape[0], 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, num_classes),
            nn.Softmax(dim=1))
        self.net.apply(self.init_weights)

    def forward(self, X):
        return self.net(X)

    def init_weights(self, m):
        if type(m) == nn.Linear:
            m.weight.data.uniform_(-0.05, 0.05)
```
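
To illustrate how I instantiate it (the shapes below are placeholders, not my real data), note that `init_weights` only re-draws the weight matrices; the biases keep PyTorch's default initialization:

```
# placeholder shapes, just for illustration
model = FFNN(input_shape=(30,), num_classes=10)

# weights are re-drawn from U(-0.05, 0.05) by init_weights...
print(model.net[0].weight.abs().max().item() <= 0.05)  # True
# ...but the biases keep PyTorch's default Linear initialization
print(model.net[0].bias.abs().max().item())
```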

and it is trained as follows:

```
import torch
from torch.autograd import Variable

y_idx = get_num_from_1hot(y)

# convert to tensors
X_tr = Variable(torch.from_numpy(X).float(), requires_grad=False)
y_tr = Variable(torch.from_numpy(y_idx), requires_grad=False)

# loss and optimizer
optimizer = torch.optim.Adam(self.model.parameters(), lr=0.01)
loss_func = torch.nn.CrossEntropyLoss()

for i in range(epochs):
    y_pred = self.model(X_tr)
    loss = loss_func(y_pred, y_tr)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
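
(`get_num_from_1hot` is just a small helper of mine that turns one-hot labels into integer class indices, since `CrossEntropyLoss` expects class indices rather than one-hot vectors; roughly:)

```
import numpy as np

def get_num_from_1hot(y_onehot):
    # one-hot -> integer class indices (int64, as CrossEntropyLoss expects)
    return np.argmax(y_onehot, axis=1)
```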

Unfortunately, the PyTorch version seems to get stuck during training: the test accuracy plateaus at a very specific value for many epochs, sometimes until the end of training.

Can you spot any problem with my implementation?